The Macro: The AI Industry Has a Size Problem
I want to push back on something the AI industry has accepted as gospel: that bigger models are always better. The entire trajectory of the field for the past three years has been about scale. More parameters. More training data. More compute. GPT-4 is widely believed to be far larger than GPT-3, and each new frontier release is assumed to follow suit. Every frontier lab is in an arms race to train the largest model on the most GPUs.
This works fine if your deployment target is a data center with unlimited electricity and cooling. It works less fine if you want AI to run on a phone. It works not at all if you want AI on a smartwatch, a hearing aid, a security camera, or any of the billions of edge devices where latency, bandwidth, and power consumption actually matter.
The edge AI market is large and growing fast. Qualcomm, MediaTek, and other chip makers are building dedicated AI accelerators into mobile processors. On-device inference is already happening for basic tasks like face detection and voice wake words. But running a high-quality text-to-speech model, a language model, or an image classifier on a device with limited memory and no cloud connection requires a fundamentally different approach to model architecture.
There are companies working on model compression. Techniques like quantization, pruning, and knowledge distillation can shrink existing large models. But compression has limits. You lose quality. A quantized version of a cloud-scale model is almost always worse than a model that was designed from the ground up to be small. The analogy is the difference between cramming a desktop operating system onto a phone versus building iOS from scratch. The purpose-built approach wins every time.
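To make the compression trade-off concrete, here is a minimal sketch of post-training int8 quantization, the simplest of the techniques above. This is a toy illustration, not any particular framework's implementation: it maps a float weight matrix onto 127 integer levels with a single scale factor, cutting storage 4x while introducing an irreducible rounding error.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256)).astype(np.float32)  # toy weight matrix

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# 4x smaller (1 byte per weight instead of 4), but the reconstruction is lossy:
# every weight is off by up to half a quantization step, and that error is
# baked in no matter how the original model was trained.
print("bytes before:", w.nbytes, "after:", q.nbytes)
print("max abs error:", float(np.max(np.abs(w - w_hat))))
```

The rounding error here is bounded by half the scale factor per weight, and it compounds across layers at inference time. A model designed from the start for a small parameter budget never pays that tax, which is the core of the purpose-built argument.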
Picovoice has built a solid business around on-device voice AI. Whisper.cpp made speech recognition work locally. But the space is still wide open for teams that can build new model architectures optimized for edge deployment rather than just shrinking existing ones.
The Micro: Meta Research Alumni Building Models That Fit in Your Pocket
Stellon Labs is an AI research lab building tiny frontier models designed specifically for edge devices like smartphones, wearables, and embedded systems. Their first public release, an open-source text-to-speech model called KittenTTS, hit 8,000 GitHub stars and 45,000 model downloads within two weeks of launch.
Rohan Joshi and Divam Gupta are the founders. Rohan is a former Meta Research engineer and CMU School of Computer Science alumnus, which is exactly the pedigree you want for someone building novel model architectures. They are based in San Francisco and part of Y Combinator’s Summer 2025 batch.
Those KittenTTS numbers deserve attention. 8,000 GitHub stars in two weeks is not a vanity metric for an AI model. It means developers tried the model, found it genuinely useful, and starred the repo. 45,000 downloads means people are actually integrating it into projects. For context, most well-funded startups are happy if an open-source model hits 1,000 stars in its first month. KittenTTS did eight times that in half the time.
The open-source strategy is smart for multiple reasons. First, it builds credibility in a space where claims about model quality are cheap and benchmarks are easily gamed. When developers can download KittenTTS and test it themselves on their own hardware, the quality speaks for itself. Second, open-source adoption creates a pipeline for commercial products. Developers who use KittenTTS for free in side projects become enterprise buyers when their companies need edge AI solutions with support, fine-tuning, and SLAs.
The “tiny frontier” positioning is precise and differentiated. They are not building the biggest model. They are not building the cheapest cloud API. They are building models that deliver frontier-level quality at a fraction of the size. That is a genuinely different technical challenge and a genuinely different market from what the major AI labs are pursuing.
The company website is minimal by design, which tracks with a research-first culture. There is a careers page (hosted on Notion), a contact email, and not much else. The GitHub organization, KittenML, is where the real product lives. This is a team that ships code, not marketing pages.
The Verdict
Stellon Labs has the best early traction signal of any company I have looked at recently. 8,000 stars and 45,000 downloads in two weeks for an open-source model is not luck. It is a product that fills a real gap. Developers want high-quality AI models that run locally, and almost nobody else is designing them from the ground up with this level of focus on size optimization.
The business model question is the obvious one. Open-source research labs have historically struggled to monetize. Stability AI proved that massive GitHub adoption does not automatically translate into revenue. Hugging Face figured it out through enterprise hosting and model hub infrastructure. Stellon Labs will need to find their own version of that commercial layer.
The technical moat is real, though. Building small models that maintain frontier quality is a hard research problem. It is not something a competitor can replicate just by throwing more GPUs at the problem. In fact, throwing more GPUs at the problem is the opposite of what you need to do. That gives Stellon Labs a structural advantage against well-funded competitors who are optimized for the scale-up approach.
In thirty days, I want to see whether KittenTTS adoption is accelerating or plateauing. Sixty days, I want to see a second model release. One hit model could be lightning in a bottle. Two confirms a repeatable research capability. Ninety days, the question is whether enterprise conversations are happening. If hardware companies, IoT manufacturers, or mobile developers are reaching out about licensing or integration, the commercial model is starting to take shape. If it is still just GitHub stars, they need to accelerate on the business side.