The Macro: The AI Agent Space Is Crowded and Most of It Is Vaporware
I am going to say something that might get me in trouble. Most AI agent products do not work. Not in the way they are marketed, at least. The demos look incredible. The landing pages promise autonomous workflows that handle complex business processes end to end. And then you sign up, connect your tools, and discover that the agent can do about three things reliably and hallucinates on everything else.
The hype around AI agents has been building for two years and the gap between promise and reality remains enormous. Adept raised $350 million to build an AI that uses software the way humans do. Inflection raised $1.3 billion for a personal AI assistant. Neither product has become a daily-use tool for the average knowledge worker. The smaller players, such as Bardeen, Zapier AI, and Make AI, have bolted agent features onto existing automation platforms with mixed results.
The fundamental challenge is not intelligence. The models are smart enough. The challenge is reliability at the edges. An agent that correctly processes 95% of tasks sounds great until you realize that the 5% failure rate means one in twenty actions goes wrong. In business operations, one wrong action can create a cascade of problems. Send the wrong Slack message to the wrong channel. Update the wrong field in a CRM record. Pull the wrong data for a financial report. The cost of fixing agent errors can exceed the cost of doing the work manually.
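To put rough numbers on that 5% figure, here is a minimal sketch. It assumes every action succeeds independently at the same 95% rate, which is my simplification for illustration and not a measured property of any product. The function names are hypothetical.

```python
# Back-of-the-envelope reliability math for an agent with a fixed per-action
# success rate. Assumption: actions fail independently of one another.

def expected_errors(actions: int, success_rate: float = 0.95) -> float:
    """Expected number of wrong actions out of `actions` attempts."""
    return actions * (1 - success_rate)

def workflow_success(steps: int, success_rate: float = 0.95) -> float:
    """Chance that every step of a multi-step workflow goes right."""
    return success_rate ** steps

if __name__ == "__main__":
    print(f"Expected errors in 20 actions: {expected_errors(20):.1f}")
    for steps in (1, 5, 10):
        print(f"{steps}-step workflow fully correct: {workflow_success(steps):.0%}")
```

Under those assumptions, a ten-step workflow completes without a single mistake only about 60% of the time, which is why a per-action rate that sounds high can still feel unreliable in daily use.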
So the market is large, the demand is real, and the existing solutions mostly disappoint. That is either an opportunity or a warning depending on whether the next product actually works.
The companies I would watch in this space are Relevance AI, which is building agent workflows for specific business functions, and Lindy AI, which takes a similar approach to Doe, the product reviewed below, with multi-agent orchestration. Both are serious products with real traction. The question for any new entrant is what they do differently and why it matters.
The Micro: Forty Integrations, SOC 2, and a Price Point That Demands Results
Doe is an AI operations platform that deploys agents to handle repetitive business tasks. You connect your tools, describe what you want done, and the agents execute across your stack. The pitch is “Work 5.0,” which is a branding choice I will set aside. The substance is more interesting than the branding.
Adrian Barbir is CEO and Richard Ou is CTO. They are part of Y Combinator’s Summer 2025 batch, based in San Francisco. The company was founded in January 2024, which gives them about 18 months of development time before their current launch push. That is a meaningful runway for an agent product because agent reliability requires iteration time. You cannot ship a reliable agent system in three months.
The integration count is 40+, which puts Doe in the upper range of agent platforms at this stage. The specific tools matter more than the count, and I could not verify the full list from the website. But broad integration coverage is table stakes for an agent product because agents are only as useful as the systems they can access.
The compliance story is notable. SOC 2 Type II, HIPAA compliance, SSO, and SCIM support. For a startup this early, that is an unusually heavy investment in enterprise readiness. It signals that they are selling upmarket from day one. The $500 per user per month price point confirms this. They are not trying to be a $20 per month productivity tool. They are selling to teams where $500 per user is cheaper than the hours that user currently spends on manual operations work.
That math works if the agents actually save five or more hours per week. A knowledge worker at a mid-market company costs $60 to $100 per hour fully loaded. Five hours saved is $300 to $500 per week in labor value. At $500 per month, the product pays for itself within the first two weeks, and within the first week at the top of that range, if it delivers on the promise.
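The payback arithmetic can be checked with a short sketch. The $500 monthly price, the five hours per week, and the $60 to $100 hourly cost come from the figures above. The 52/12 weeks-per-month conversion and the function names are my own assumptions for illustration.

```python
# Payback math for a per-seat subscription priced by the month.
WEEKS_PER_MONTH = 52 / 12  # assumption: average number of weeks in a month

def weekly_value(hours_saved_per_week: float, hourly_cost: float) -> float:
    """Labor value recovered each week, in dollars."""
    return hours_saved_per_week * hourly_cost

def weeks_to_payback(monthly_price: float, hours_saved_per_week: float,
                     hourly_cost: float) -> float:
    """Weeks of savings needed to cover one month's subscription."""
    return monthly_price / weekly_value(hours_saved_per_week, hourly_cost)

def break_even_hours_per_week(monthly_price: float, hourly_cost: float) -> float:
    """Weekly hours that must be saved for the seat to pay for itself."""
    return monthly_price / (hourly_cost * WEEKS_PER_MONTH)

if __name__ == "__main__":
    for cost in (60, 100):
        print(f"${cost}/hour: ${weekly_value(5, cost):.0f} saved per week, "
              f"payback in {weeks_to_payback(500, 5, cost):.1f} weeks, "
              f"break-even at {break_even_hours_per_week(500, cost):.1f} hours/week")
```

On these assumptions the break-even point is roughly one to two hours saved per week, so the five-hour threshold leaves a comfortable margin if the agents deliver.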
The dedicated tools for analytics, spreadsheets, and deep research suggest that Doe is building proprietary capabilities alongside the integration layer. That is the right approach. Pure integration plays become commodity fast. Proprietary capabilities create moats.
What I cannot evaluate from the outside is the failure rate. How often do the agents make mistakes? How are errors surfaced and corrected? What happens when an agent encounters an edge case it has never seen before? These are the questions that separate real agent products from demo-ware. The enterprise compliance story tells me the team understands that errors in a business context have consequences. Whether the product handles those consequences gracefully is the open question.
The Verdict
Doe is making a bet that the AI agent market is ready for a premium product that actually works. The pricing, the compliance infrastructure, and the integration breadth all point toward a team that is building for enterprise adoption rather than consumer experimentation.
I respect the price point. In a market full of free trials and $20 per month subscriptions, charging $500 per user forces a conversation about value. Either the product saves enough time to justify the cost or it does not. There is no middle ground at that price. The customer will know within a month whether they are getting their money’s worth.
In thirty days, I want to know the average number of tasks per user per day. If users are running ten or more agent tasks daily, the product is sticky. If usage drops after the first week, the agents are not reliable enough to trust with real work.
In sixty days, the question is expansion within accounts. The first user at a company is the experiment. The second, third, and fourth users are the validation. If companies are buying additional seats, the product works. If they are not, one person tried it and the team decided it was not worth the cost.
In ninety days, I want to see case studies with specific time savings. Not “our customers save hours every week” but “this specific customer automated these specific workflows and saved this many hours.” Specificity is the antidote to agent hype. Doe’s price point demands it.