Blueprint Zero

Software Architecture for the Next Generation

From Theory to Practice: What I Learned Building an AI-First App

In the past, I’ve written about a new paradigm in software development: one where we move away from UI-driven workflows and defer application orchestration to AI. The core idea is that AI agents can interpret user intent through natural language and directly orchestrate backend services, third-party APIs, and vendor tools to fulfill requests. Instead of building countless UI screens and custom integrations for every possible workflow, we let AI handle the coordination layer. This approach should reduce development time by eliminating much of the UI building process, improve user experience through natural interaction, and reduce app proliferation by allowing direct integration of tools and vendors rather than forcing everything through browser-based interfaces.

To put this theory to the test, I decided to transform one of my personal projects, famshift, into a purely AI-based application. Famshift is a small app I built to help my family manage basic household tasks like chore management, ride planning, event scheduling, and shopping lists. This project provided the perfect environment to explore the practical implications of an AI-first approach.

When you remove browser constraints, you reveal user needs you never knew existed

Unsurprisingly, my results were nuanced. The core hypotheses from my previous post were validated—development velocity increased dramatically and users interacted with the system in ways traditional UIs couldn’t accommodate. When you remove browser constraints, you reveal user needs you never knew existed. However, when AI handles orchestration, you’re no longer dictating exactly how users interact with your system—you’re enabling a range of possible interactions. The practical reality of stewarding rather than controlling the user experience demands new approaches to testing, cost management, and performance optimization.

Building the AI-First Version

To put this theory to a real test, I simplified the user experience of famshift into two core interfaces: a chat panel and a rendering panel. This is similar to what you might see in a tool like Claude, except that the rendering panel is always available and shows something based on context (on desktop; mobile renders things inline). The chat panel allows users to enter commands and queries using natural language, while the rendering panel creates custom visualizations like dashboards or graphics based on those queries. Instead of building UI experiences, we let AI generate the perfect view for the user on demand. A list of chores can be presented as a table or a grid of cards, sorted by deadline or by importance. We can ask the AI to apply high contrast and larger text if that is what we need. The generated view can be deeply tailored to the needs of the user.

To get started quickly, I used a LangChain template with Vercel, which comes with pre-built generative AI front-end components. I added a database for persistent storage of all household data. All database operations are encapsulated within a dedicated service layer, which is then used by both the API and the agent’s tools.
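To make the service-layer idea concrete, here is a minimal sketch in TypeScript of what such a layer could look like. The names (ShoppingListService, the in-memory store) are illustrative stand-ins for famshift’s actual code, but the shape is the point: the API routes and the agent’s tools both go through the same service methods.

```typescript
import { randomUUID } from "crypto";

export interface ShoppingItem {
  id: string;
  listId: string;
  name: string;
  quantity: number;
  collected: boolean;
}

export interface ShoppingList {
  id: string;
  householdId: string;
  name: string;
  targetDate?: string;
  items: ShoppingItem[];
}

// A minimal in-memory store standing in for the real database client.
const lists = new Map<string, ShoppingList>();

export class ShoppingListService {
  async createList(householdId: string, name: string, targetDate?: string): Promise<ShoppingList> {
    const list: ShoppingList = { id: randomUUID(), householdId, name, targetDate, items: [] };
    lists.set(list.id, list);
    return list;
  }

  async addItem(listId: string, name: string, quantity = 1): Promise<ShoppingItem> {
    const list = lists.get(listId);
    if (!list) throw new Error(`Unknown list: ${listId}`);
    const item: ShoppingItem = { id: randomUUID(), listId, name, quantity, collected: false };
    list.items.push(item);
    return item;
  }

  async findListsForHousehold(householdId: string): Promise<ShoppingList[]> {
    return [...lists.values()].filter((l) => l.householdId === householdId);
  }
}
```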

I built a single ReAct agent with a set of service-enabled tools to perform all operations. When a user types a message, it hits the agent, which then iteratively uses the system prompt and available tools to satisfy the request. I also built a few minimal UI components for initial onboarding and for a few advanced features that I’ll discuss below.
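Here is a rough sketch of how that single agent could be wired up, assuming the LangChain.js / LangGraph prebuilt ReAct agent (createReactAgent) and the service sketched above. The model choice, tool, and file names are assumptions for illustration, not famshift’s exact setup.

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import { ShoppingListService } from "./shopping-list-service";

const shoppingLists = new ShoppingListService();

// Each tool wraps a service method, so the agent never touches the database directly.
const addShoppingItem = tool(
  async ({ listId, name, quantity }) => {
    const item = await shoppingLists.addItem(listId, name, quantity);
    return JSON.stringify(item);
  },
  {
    name: "add_shopping_item",
    description: "Add an item to an existing shopping list.",
    schema: z.object({
      listId: z.string(),
      name: z.string(),
      quantity: z.number().optional(),
    }),
  }
);

// The system prompt (omitted here; the option name varies across LangGraph versions)
// defines the agent's behavior for every part of the application.
export const agent = createReactAgent({
  llm: new ChatOpenAI({ model: "gpt-4o-mini" }),
  tools: [addShoppingItem],
});

// A user message goes straight to the agent, which loops over the system prompt
// and tools until the request is satisfied:
// const result = await agent.invoke({
//   messages: [{ role: "user", content: "Add eggs to the weekend list" }],
// });
```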

Unlocking User Experience

Natural language offers a much more flexible and fluid way to interact with your data. In a typical browser application, you can only perform actions that are explicitly offered by the interface. Natural language, however, offers many ways to accomplish the same goal and, even more importantly, lets you accomplish it in a streamlined manner.

A common task in famshift is creating a shopping list. In a traditional app, if you wanted to create a shopping list for the weekend, you’d likely navigate to a dedicated page where you manage this data. You would have an option to create a list, set some sort of target date, and add items to it one by one, optionally setting other parameters like quantity or size. All of this would rely on the app-provided experience.

In the AI-based system, we don’t have one prescribed way to accomplish this. I can ask the agent to add an item to a list, and it can figure out if a new list needs to be created or use an existing one based on the context. I can ask my agent to create a list with 5-10 items in one statement, which AI will orchestrate for me.

The other cool aspect of using an agent to orchestrate basic commands is that it can choose sensible defaults. Eggs usually come in a dozen, so an agent can infer that and specify this when adding an item on your behalf. This natural interaction is much harder to construct with a traditional browser app.
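A hypothetical tool along these lines shows how one utterance can fan out into several orchestrated calls: the tool finds or creates the list, and the model passes any defaults it inferred (like a dozen eggs) as arguments. The tool name and schema are illustrative, built on the service sketched earlier.

```typescript
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import { ShoppingListService } from "./shopping-list-service";

const shoppingLists = new ShoppingListService();

export const addItemsToList = tool(
  async ({ householdId, listName, items }) => {
    // Find an existing list with this name, or create one on the fly.
    const existing = (await shoppingLists.findListsForHousehold(householdId))
      .find((l) => l.name.toLowerCase() === listName.toLowerCase());
    const list = existing ?? (await shoppingLists.createList(householdId, listName));

    // One user utterance ("make a weekend list with eggs, milk, bread...") fans
    // out into several service calls in a single agent turn.
    for (const item of items) {
      await shoppingLists.addItem(list.id, item.name, item.quantity ?? 1);
    }
    return `Added ${items.length} item(s) to "${list.name}".`;
  },
  {
    name: "add_items_to_list",
    description:
      "Add one or more items to a shopping list, creating the list if it does not exist. " +
      "Infer sensible default quantities (e.g. eggs are usually sold by the dozen).",
    schema: z.object({
      householdId: z.string(),
      listName: z.string(),
      items: z.array(z.object({ name: z.string(), quantity: z.number().optional() })),
    }),
  }
);
```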

Understanding user needs has always been one of the most difficult challenges in software development. In large organizations, this is especially difficult, as user feedback is passed through multiple layers of “telephone game” before it is defined, prioritized, and implemented.

Instead of building interfaces that constrain users to our mental models, we create systems that adapt to theirs

An AI-first system can help alleviate this by allowing users to orchestrate available tools in ways that are hard to predict. For example, analytics for a household would likely be a secondary feature in a traditional app. But with an AI-first system, much of this functionality can be available out of the box.

Users can ask simple queries about historical data, and the AI agent can use its tools to look at the history and perform basic aggregations. In a traditional browser app, you would have to pre-build all of this functionality.

Did you know that your users might want to estimate the cost of a shopping list? I didn’t. But with an AI-first app, this feature is available out of the box: a user can simply ask for an estimate of the cost of all the items in their shopping list. It’s a nifty feature, but given how notoriously bad AI can be at math, and to avoid hallucinations, we should add a calculator tool and one that retrieves current prices for items.
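A sketch of what those two guard-rail tools might look like follows. The price source here is a stubbed lookup table standing in for a real pricing API, which famshift does not actually have; the tool names and schemas are assumptions.

```typescript
import { tool } from "@langchain/core/tools";
import { z } from "zod";

// Arithmetic happens in code, never in the model.
export const sumPrices = tool(
  async ({ prices }) => prices.reduce((total, p) => total + p, 0).toFixed(2),
  {
    name: "sum_prices",
    description: "Add up a list of prices. Always use this instead of doing math yourself.",
    schema: z.object({ prices: z.array(z.number()) }),
  }
);

// Stubbed price data; a real implementation would call a pricing API.
const STUB_PRICES: Record<string, number> = { eggs: 3.49, milk: 2.99, bread: 2.5 };

export const lookupPrice = tool(
  async ({ item }) => String(STUB_PRICES[item.toLowerCase()] ?? "unknown"),
  {
    name: "lookup_price",
    description: "Return the current price of a grocery item, or 'unknown' if unavailable.",
    schema: z.object({ item: z.string() }),
  }
);
```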

This represents a fundamental inversion of traditional software design. Instead of building interfaces that constrain users to our mental models, we create systems that adapt to theirs. When users describe what they want in natural language, AI orchestrates existing tools to deliver functionality we never explicitly built—bypassing the entire cycle of feature planning, design, and development. The result is discovering user needs we never knew existed, fulfilled without ever writing specific code for them.

The Need for New Patterns

The new design calls for solutions to new interaction patterns. I will discuss a few of the areas I had to solve here, but there are definitely other areas I have yet to tackle.

In the original browser-based app, I built an extensive onboarding workflow with a multi-step wizard to set up a household and manage invitations. With the AI-first approach, I opted for a notification system where the AI chat bot prompts the user to get set up.

This new experience is very streamlined. When a user logs in, they’re greeted with a message notifying them that some setup is needed. Since the AI has access to the user’s information, it can often suggest very reasonable defaults. Having AI engage the user on a subject is one of the UX patterns that I find very useful with this system.
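One way to implement that nudge is sketched below, under the assumption of a helper like getHouseholdForUser and the agent defined earlier (both names are hypothetical): the app seeds the conversation with an instruction when setup is incomplete, and the agent’s reply becomes the greeting the user sees.

```typescript
import { agent } from "./agent";

// Stub: a real implementation would query the service layer.
async function getHouseholdForUser(_userId: string): Promise<{ id: string } | null> {
  return null; // pretend this user has not completed setup yet
}

export async function greetOnLogin(userId: string, userName: string) {
  const household = await getHouseholdForUser(userId);
  if (household) return null; // no setup nudge needed

  // The agent responds to this instruction, producing the setup prompt the user sees.
  return agent.invoke({
    messages: [
      {
        role: "system" as const,
        content:
          `${userName} has not set up a household yet. Greet them, suggest sensible ` +
          `defaults based on their profile, and ask for confirmation before creating anything.`,
      },
    ],
  });
}
```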

The rendering panel was another addition that required a new UX pattern. While the original idea was to have AI render all of the views dynamically based on requests from the user, I found that there are certain limitations. The rendering-only experience works well for read-only cases. For example, I introduced a way to dynamically render basic AI-generated Markdown, JSON, and HTML that doesn’t involve any backend operations.

However, some traditional UI experiences that involve “editing” are ultimately easier than asking AI to make changes. For a shopping list, for example, it can be faster to click a checkbox multiple times to mark items as collected than it is to ask an agent to do so. So I opted to create a UI component that handles the checklist and let AI decide, based on context, when a user might want to use it. For example, if I am at the store on my shopping run, then I might want to use the checklist. I do believe that with the addition of voice commands, this equation might change. Voice commands may remove the need to find the item on the list and streamline this process even further. But for now, I conclude that some experiences must be optimized for the use case.
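In practice that means the agent calls a tool whose output is a render directive rather than generated code, and the front end mounts the pre-built checklist component. The directive shape and tool name below are assumptions, but they capture the pattern.

```typescript
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import { ShoppingListService } from "./shopping-list-service";

const shoppingLists = new ShoppingListService();

// A small discriminated union keeps the rendering panel's options explicit:
// read-only AI-generated content, or a known interactive component.
export type RenderDirective =
  | { kind: "markdown"; content: string }
  | { kind: "checklist"; listId: string; items: { id: string; name: string; collected: boolean }[] };

export const showShoppingChecklist = tool(
  async ({ householdId, listName }) => {
    const list = (await shoppingLists.findListsForHousehold(householdId))
      .find((l) => l.name.toLowerCase() === listName.toLowerCase());
    if (!list) {
      return JSON.stringify({ kind: "markdown", content: `No list named "${listName}".` });
    }

    const directive: RenderDirective = {
      kind: "checklist",
      listId: list.id,
      items: list.items.map((i) => ({ id: i.id, name: i.name, collected: i.collected })),
    };
    return JSON.stringify(directive);
  },
  {
    name: "show_shopping_checklist",
    description:
      "Render the interactive checklist component for a shopping list. Prefer this when the " +
      "user appears to be actively shopping and will check items off repeatedly.",
    schema: z.object({ householdId: z.string(), listName: z.string() }),
  }
);
```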

While I considered dynamically generating an editing experience for the shopping list, I quickly ran into security concerns. I was not willing to entertain running dynamically generated code in a browser with access to backend operations.

The Challenges

One interesting consequence of letting AI orchestrate everything is the non-deterministic nature of the app’s behavior. There were several cases where the AI went “off-script.” In my onboarding flow, for example, I found that sometimes the agent automatically created a household without first prompting the user. On its own, this is not a big issue: as we continue to ask the agent questions, we can learn more about the setup and ask it to make the changes we ultimately want. What it highlights, however, is that we may not be in full control of every possible permutation of the experience.

Coming from traditional application development, this seems really scary. The idea that we release an application to users while not being in control of the experience, and cannot offer it deterministically, is quite challenging. In my opinion, the relationship changes in such a system: instead of creating a very specific experience, the development team becomes a steward of it.

Instead of creating a very specific experience, the development team becomes a steward of it.

A traditional application forces the developers’ view onto the user. We struggle to find a balance between understanding the user and offering them ways to interact with the application the way it was “intended.” An AI-first system allows users to interact with it the way they want, which can differ from user to user. But it comes with a consequence: we cannot predict each user’s individual interactions.

Instead of attempting to apply control, I believe we should release it. We must embrace being stewards: discover the most dangerous paths that are possible, ensure that no damage can be done, but otherwise let go and allow users to interact with the system the way they want to. To accomplish this, we must take advantage of existing practices like carefully constructed system prompts and testing, and develop and embrace new ones like synthetic testing via AI-generated prompts. I will explore these practices in more detail later.
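As a sketch of synthetic testing: one model generates varied phrasings of a scenario, the agent runs each one, and the test asserts an invariant rather than an exact reply. The scenario, the create_household tool name, and the invariant below are illustrative assumptions, not famshift’s actual test suite.

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { agent } from "./agent";

const generator = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 1 });

// Ask a model for many different ways a user might phrase the same request.
async function generateUtterances(scenario: string, count: number): Promise<string[]> {
  const res = await generator.invoke(
    `Write ${count} different ways a household member might say: "${scenario}". Return one per line.`
  );
  return String(res.content).split("\n").map((l) => l.trim()).filter(Boolean);
}

export async function testOnboardingNeverCreatesWithoutConfirmation() {
  const utterances = await generateUtterances("I'm new here, help me get started", 10);

  for (const text of utterances) {
    const result = await agent.invoke({ messages: [{ role: "user", content: text }] });
    // Invariant: the first turn of onboarding must ask before creating a household.
    const toolCalls = result.messages.flatMap((m: any) => m.tool_calls ?? []);
    const createdHousehold = toolCalls.some((c: any) => c.name === "create_household");
    if (createdHousehold) {
      throw new Error(`Agent created a household without confirmation for: "${text}"`);
    }
  }
}
```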

Performance, Cost and Security

Putting every transaction through an AI agent is going to make your users wait longer per operation than they would expect from a browser app. Simple operations like looking up a list of items can take 9-12 seconds of wait time, whereas loading a page with a list of items in a browser app is a sub-second operation. Using natural language, however, offers certain shortcuts that can make more complex operations much faster. The aforementioned example of creating a shopping list and adding a set of items with their sensible defaults can be done in one user operation. Under those circumstances, the performance of the AI-based app starts to show considerable advantages. My recommendation is to design experiences that balance the trade-offs of both approaches. Give your users quick access to the most common operations straight from the chat via deterministic paths that bypass the agent; if an operation doesn’t require orchestration, there is no need to involve the agent.
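A minimal sketch of that fast path: quick actions exposed in the chat UI map straight to service calls, and only free-form messages go through the agent. The action names and shapes are illustrative.

```typescript
import { ShoppingListService } from "./shopping-list-service";
import { agent } from "./agent";

const shoppingLists = new ShoppingListService();

type QuickAction =
  | { type: "toggle_item_collected"; listId: string; itemId: string }
  | { type: "show_lists"; householdId: string };

export async function handleInput(input: { action?: QuickAction; text?: string }) {
  if (input.action) {
    // Sub-second path: no model call, no orchestration needed.
    switch (input.action.type) {
      case "show_lists":
        return shoppingLists.findListsForHousehold(input.action.householdId);
      case "toggle_item_collected":
        // ...call the corresponding service method directly (omitted in this sketch)
        return { ok: true };
    }
  }
  // Free-form request: let the agent orchestrate.
  return agent.invoke({ messages: [{ role: "user", content: input.text ?? "" }] });
}
```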

Cost is another area of concern. Again, if we put every transaction through the agent, the cost is going to increase significantly. Doing some simple math, I suspect that in famshift the additional cost of operating the AI-based app is roughly $4–$10 per user. It’s a big range, because there are many ways to optimize the cost. Aside from input caching, model pricing incentivizes reducing output size, so configuring more succinct and structured output from the AI model can offer advantages. There is also a benefit to breaking the agent down into multiple agents. Purely from the perspective of cost, a single agent passes around a large system prompt with every transaction, and the system prompt is large because we have to define the behavior for all parts and processes of our application. Breaking it down into multiple agents allows each one to have a more specific system prompt focused on its area of responsibility: a shopping agent will not need system instructions for event management, ride sharing, and so on. Finally, we should not forget about fundamental practices like rate limiting. You don’t want your users to overwhelm your system and skyrocket your costs.
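Rate limiting itself doesn’t need anything exotic; below is a sketch of a simple per-user fixed-window counter placed in front of the agent. The limits are arbitrary, and a real deployment would keep the counters in shared storage such as Redis.

```typescript
// Fixed-window rate limiter: at most MAX_REQUESTS_PER_WINDOW agent calls per user per minute.
const WINDOW_MS = 60_000;
const MAX_REQUESTS_PER_WINDOW = 20;

const windows = new Map<string, { start: number; count: number }>();

export function allowRequest(userId: string): boolean {
  const now = Date.now();
  const w = windows.get(userId);

  // First request, or the previous window has expired: start a fresh window.
  if (!w || now - w.start > WINDOW_MS) {
    windows.set(userId, { start: now, count: 1 });
    return true;
  }
  if (w.count >= MAX_REQUESTS_PER_WINDOW) return false;

  w.count += 1;
  return true;
}
```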

With a natural language interface, you must also be careful about what a user might try to do or say. You must prevent access to other users’ data and guard against other forms of prompt injection. A major defense is to use services for all tools, which allows you to limit what’s possible: for example, never allow direct SQL access to your database. I put various scenarios to the test. I tried accessing another user’s information using direct IDs to see if I could trick the AI, and I generally found it handled the requests well. But, of course, more advanced system-breaking attempts must be included in the test suite we discussed earlier.
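The scoping defense can live entirely in the service layer: the authenticated household comes from the session, never from the model’s arguments, so an injected ID for another family simply resolves to nothing. A small sketch, with illustrative names:

```typescript
import { ShoppingListService, ShoppingList } from "./shopping-list-service";

const shoppingLists = new ShoppingListService();

export class ScopedShoppingLists {
  // householdId is set from the verified session when the tools are constructed,
  // not from anything the model says.
  constructor(private readonly householdId: string) {}

  async getList(listId: string): Promise<ShoppingList | null> {
    const lists = await shoppingLists.findListsForHousehold(this.householdId);
    // Even if the agent is tricked into passing another household's list ID,
    // the lookup only searches lists the current household owns.
    return lists.find((l) => l.id === listId) ?? null;
  }
}
```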

Summary

This experiment with AI-first architecture taught me that we’re at an inflection point in how we think about software development.

On the positive side, development velocity was remarkable. Features that would have taken weeks to build in a traditional UI emerged naturally from the AI’s ability to orchestrate existing tools. The flexibility of the user experience is profound: I can interact with the system in ways that feel natural rather than being constrained by predetermined workflows.

But the challenges are equally real. The non-deterministic nature of AI behavior requires a fundamental shift in mindset from controlling user experience to stewarding it. Performance and cost considerations mean this approach isn’t suitable for every application, at least not yet. And the need for comprehensive testing of conversational patterns introduces complexity that traditional UI testing doesn’t address.

What excites me most is that this feels like early days. As AI models become faster and cheaper, and as we develop better patterns for hybrid AI-traditional UI systems, I believe we’ll see more applications adopting this approach. The question isn’t whether AI will change how we build software—it’s how quickly we can learn to build responsibly with these new capabilities.

For now, I’m convinced enough by the experiment to continue down this path. Once I add voice support and polish the mobile experience, my family will be the real test of whether an AI-first household management app can replace traditional interfaces.