July 11, 2026 edition

The Interface

Frontier lab for visual simulation

Interface Is Building Neural Humans for Visual Simulation and the Timing Could Not Be Better

AI · Visual Simulation · Gaming · Entertainment

The Macro: Visual Simulation Is the Next Frontier for AI

I want to talk about why visual simulation is about to become one of the most important categories in AI, and why it has been surprisingly quiet until now.

The AI industry spent the last three years obsessed with language. LLMs dominated the conversation, the funding, and the talent. Image generation got its moment with Stable Diffusion, Midjourney, and DALL-E. Video generation is having its moment now with Runway, Pika, and Luma. But all of these tools generate flat outputs. They make pictures or clips. They do not build interactive, physics-aware environments where digital humans move, react, and behave like real people.

That is a fundamentally different problem. Generating a realistic image of a person is impressive. Generating a realistic person who walks through a room, picks up an object, and responds to changes in the environment in real time is several orders of magnitude harder. It requires modeling appearance, physics, behavior, and interaction simultaneously. The gaming industry has been doing this with hand-crafted assets and animation systems for decades, but the cost is staggering. A AAA game character can take months of work from modelers, riggers, animators, and engineers.

The convergence point is neural rendering. Instead of manually building 3D models and animation rigs, you train neural networks to model how things look and move. NVIDIA has been investing heavily in this with its Omniverse platform. Epic Games is pushing the boundaries with MetaHuman and Unreal Engine 5. Unity has neural rendering research projects. But these are all features within larger platforms, not dedicated products built from the ground up for neural visual simulation.
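To make the neural-rendering idea concrete, here is a toy NeRF-style sketch in Python/NumPy. This is a generic illustration of the technique, not Interface's actual method: a small MLP maps a 3D point to color and density, and a pixel is produced by compositing samples along a camera ray. The weights here are random stand-ins; a real system learns them from photographs or video.

```python
# Toy NeRF-style neural field (illustrative only, not any company's method):
# an MLP maps a 3D point to (color, density); a pixel is rendered by
# compositing samples along a camera ray front-to-back.
import numpy as np

rng = np.random.default_rng(0)

def positional_encoding(x, n_freqs=4):
    """Map coordinates to sines/cosines at increasing frequencies,
    which helps small networks represent fine spatial detail."""
    feats = [x]
    for i in range(n_freqs):
        feats.append(np.sin((2.0 ** i) * np.pi * x))
        feats.append(np.cos((2.0 ** i) * np.pi * x))
    return np.concatenate(feats, axis=-1)

# Random weights stand in for trained parameters.
D_in = 3 + 2 * 4 * 3                    # raw point + encoded features = 27
W1 = rng.normal(size=(D_in, 32)) * 0.1
W2 = rng.normal(size=(32, 4)) * 0.1     # outputs: (r, g, b, density)

def field(points):
    """Query the neural field at an (N, 3) array of 3D points."""
    h = np.maximum(positional_encoding(points) @ W1, 0.0)  # ReLU MLP
    out = h @ W2
    rgb = 1.0 / (1.0 + np.exp(-out[..., :3]))   # colors squashed to [0, 1]
    sigma = np.log1p(np.exp(out[..., 3]))       # non-negative density
    return rgb, sigma

def render_ray(origin, direction, n_samples=32, near=0.0, far=2.0):
    """Composite field samples along one ray (volume rendering)."""
    t = np.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction
    rgb, sigma = field(pts)
    delta = (far - near) / n_samples
    alpha = 1.0 - np.exp(-sigma * delta)                    # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans                                 # visibility-weighted
    return (weights[:, None] * rgb).sum(axis=0)             # final pixel color

pixel = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
print(pixel.shape)  # (3,) — one RGB value for one camera ray
```

Rendering a full image means shooting one such ray per pixel; training means adjusting the weights so rendered pixels match real photos. The gap between this toy and a production system (animated humans, physics, real-time rates) is exactly the research problem described above.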

The applications extend well beyond gaming. Autonomous vehicle companies need realistic simulated environments for testing. Robotics companies need simulated worlds for training manipulation policies. Architecture and design firms need interactive walkthroughs. Film and TV production is moving toward virtual production stages. The total addressable market for high-fidelity visual simulation is enormous and growing.

The Micro: McKinsey Meets Citadel at MIT

Interface calls itself a frontier lab for visual simulation. They are training neural systems that model the appearance, behavior, and interaction of people and objects. The first model is in development with early access signups open, which means the product is pre-launch. That is worth stating plainly. This is a company to watch, not a product to evaluate.

Max Raven is CEO and Peyton Shields is CTO. They are a two-person team based in New York, part of Y Combinator’s Summer 2025 batch. Max previously managed global AI transformation projects at McKinsey, which means he understands how large enterprises think about adopting new technology. Peyton was a quant developer at Citadel building risk infrastructure, which means he has built systems that need to be fast, reliable, and mathematically precise. Both are MIT graduates.

The McKinsey-plus-Citadel founding combination is unusual for a visual simulation company. Most teams in this space come from computer graphics, gaming, or academic research labs. Max and Peyton come from consulting and quantitative finance. That could mean they see the market opportunity more clearly than the technical insiders, or it could mean they underestimate the depth of the computer graphics challenges. The fact that they are hiring a Senior ML Researcher for visual simulation at $120K to $250K with meaningful equity suggests they know they need domain expertise and are willing to pay for it.

The research focus matters. Interface is not building a product for consumers to play with. They are building neural models for world-model research. That positions them as infrastructure for other companies building games, simulations, and virtual environments. The platform play is potentially more valuable than the application play, but it also takes longer to mature and requires more capital.

New York as a base is interesting. Most AI startups cluster in San Francisco. New York has a stronger connection to the entertainment, media, and finance industries, all of which are potential customers for high-fidelity visual simulation. If Interface is going after film production, advertising, and financial simulation use cases, New York is the right place to be.

The Verdict

Interface is early. The first model has not launched. There is no product to test, no traction to evaluate, and no revenue to discuss. What there is: a clearly defined research direction, a credible founding team, and a market that is large and growing.

The risk is execution timeline. Neural visual simulation is genuinely hard. Building systems that model appearance and behavior at the fidelity required for professional applications takes time, compute, and specialized talent. A two-person team with a quant and a consultant at the helm will need to hire fast and well. The ML researcher role they have listed is critical.

In thirty days, I want to see the first model. What does it actually produce? How does the fidelity compare to NVIDIA Omniverse outputs or Epic MetaHuman quality?

In sixty days, the question is whether they have a design partner: a gaming studio, a film production company, or a simulation company that is actively testing Interface’s models in a real workflow.

In ninety days, I want to know their compute strategy. Training neural simulation models is expensive, and a two-person startup does not have the GPU budget that NVIDIA or Epic can throw at the problem. How they solve the compute constraint will determine how fast they can iterate.

This is a long-horizon bet on a team and a research direction. If the models work, the market is waiting. If they do not, two years of runway disappears with nothing to show. I think the direction is right. The execution is everything.