The Macro: AI Writes Code Faster Than Humans Can Test It
There is a dirty secret in modern software development. AI coding assistants like Cursor, Windsurf, and Devin have made it dramatically faster to ship features. They have also made it dramatically faster to ship bugs. The velocity gains from AI-assisted development do not automatically come with quality gains. If anything, the ratio has gotten worse.
The testing stack has not kept up. Most companies still rely on some combination of unit tests written by developers (who skip them when deadlines are tight), integration tests maintained by QA teams (who are increasingly understaffed), and manual testing by product managers (who click through the happy path and call it done). Cypress and Playwright automate browser testing, but someone still has to write and maintain the test scripts. Selenium has been around for two decades and still breaks in creative ways. BrowserStack and LambdaTest handle cross-browser compatibility but not the kind of exploratory testing that catches unexpected user behavior.
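To make the limitation concrete, here is a minimal sketch of what the scripted status quo looks like. The function and test below are hypothetical, invented for illustration; the point is that a human wrote exactly one case, and everything outside it goes unexercised.

```python
# A toy checkout helper and the single happy-path test a developer
# wrote for it. Both names are hypothetical.
def apply_discount(total_cents: int, code: str) -> int:
    """Apply 10% off when the code is 'SAVE10'; otherwise no change."""
    if code == "SAVE10":
        return total_cents - total_cents // 10
    return total_cents

def test_happy_path() -> None:
    # The one case anyone thought of: a round total, the correct code.
    assert apply_discount(1000, "SAVE10") == 900

test_happy_path()
# Never tried: negative totals, lowercase codes, empty strings, whitespace.
print("happy path passes")
```

The test passes and the build goes green, while negative totals, case variants, and malformed codes remain untested until a user finds them.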
The result is predictable. Bugs ship. Users find them. Support tickets pile up. The company patches the issue and moves on, hoping nobody churned in the meantime. This cycle has always existed, but AI-accelerated development has compressed the timeline. Features that used to take a sprint now take a day. The testing that used to take a sprint still takes a sprint. The gap between shipping speed and testing speed is widening.
What the industry actually needs is testing that scales the same way code generation scales. Not more test scripts. Not more QA headcount. Testing that is itself AI-native, that can explore an application the way a curious, slightly confused, occasionally malicious human would.
The Micro: Two Dropouts Who Can Actually Code
Kavan Doctor and Aaron Chew founded Synthetic Society out of Y Combinator’s Summer 2025 batch. Doctor is an MIT CS dropout who was ranked number one in mathematics in Canada. Chew is a Columbia CS dropout, a USACO finalist, and was ranked in the top twenty programmers in the United States. He was the first hire at fun.xyz and a Z-Fellow. The team is two people.
I want to pause on those credentials for a second. Not because dropout founder stories are inherently interesting (they are not; they are overused as narrative devices), but because this particular product requires deep technical chops in both AI and software engineering. Synthetic Society is not a wrapper around an LLM that generates test cases. It deploys swarms of synthetic users that mimic real user behavior. That means understanding how humans actually interact with software: the hesitations, the wrong clicks, the back-button mashing, the form submissions with unexpected characters, the workflows that no product manager anticipated.
The platform catches bugs, UX flaws, and broken flows before real users experience them. The pitch is straightforward: instead of writing test scripts that verify the happy path, you release synthetic users into your application and let them find the unhappy paths. They click things in the wrong order. They enter data that does not match your validation assumptions. They navigate to pages through routes you did not think anyone would use.
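A crude way to see the idea: instead of one hand-written assertion, an agent throws inputs the developer never anticipated at the application and logs what breaks. The sketch below is my own toy illustration, not Synthetic Society's implementation; the handler, the input list, and the agent loop are all invented for the example.

```python
# A deliberately buggy form handler plus a "synthetic user" that probes
# it with unhappy-path inputs. All names here are hypothetical.
import random
import string

def handle_signup(name: str, age_field: str) -> str:
    """Buggy handler: assumes the age field always parses as an int."""
    age = int(age_field)  # crashes on "", "abc", "12.5", ...
    return f"welcome {name}, age {age}"

def synthetic_user(trials: int = 200, seed: int = 0) -> list[tuple[str, str]]:
    """Randomly mix valid and unexpected inputs; record every crash."""
    rng = random.Random(seed)
    age_inputs = ["", "abc", "12.5", "30"]  # mostly not the happy path
    failures = []
    for _ in range(trials):
        name = "".join(rng.choices(string.ascii_letters, k=5))
        age = rng.choice(age_inputs)
        try:
            handle_signup(name, age)
        except Exception as exc:
            failures.append((age, type(exc).__name__))
    return failures

bugs = synthetic_user()
print(f"{len(bugs)} crashes from inputs: {sorted({b[0] for b in bugs})}")
```

A scripted test suite covering only `"30"` would never surface these crashes; the random agent finds all three failing inputs in a couple hundred trials. Real synthetic users would operate on a live UI rather than a function, but the asymmetry is the same.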
This approach sits in a category that is emerging but not yet crowded. Rainforest QA tried crowd-sourced testing and pivoted. Testim (acquired by Tricentis) used AI for test creation but still required human oversight for test design. QA Wolf offers human QA engineers as a service. Synthetic Society is attempting something different: fully autonomous testing agents that do not need humans to tell them what to test.
The competitive positioning matters. If you are a development team using Cursor or Windsurf to write code five times faster, your testing bottleneck is not the execution of tests. It is the creation of tests. It is knowing what to test. Synthetic Society is betting that AI agents can figure out what to test on their own, which is a harder problem but a far more valuable one to solve.
The Verdict
I think the timing here is almost perfect. The AI code generation wave has created a real, urgent need for AI-native testing, and the existing testing infrastructure was already creaking before that wave hit. Cypress and Playwright are good tools, but they solve a different problem. They verify that known workflows still work. Synthetic Society is trying to discover unknown workflows that do not work. That is a fundamentally different value proposition.
The risk is obvious: two people, ambitious product, crowded adjacent space. If Cypress or Playwright bolt on AI-powered exploratory testing, or if Testim’s new parent Tricentis builds something similar with enterprise distribution, Synthetic Society could get squeezed. The window for an independent player to establish this category is open right now, but it will not stay open forever.
At thirty days, I want to see how many paying customers they have and what retention looks like. At sixty, whether the synthetic users are actually catching bugs that slip past existing test suites, backed by real case studies and numbers. At ninety, the question is whether this becomes a standalone product or a feature that testing platforms absorb. If the bug detection rate is significantly higher than traditional automated testing, Synthetic Society has a real business. If it is only marginally better, it is a feature, not a company.