OnDeck AI Analyzes Video Without Training a Model, and That Changes Everything

AI, Computer Vision, Infrastructure, Video Analytics

The Macro: Computer Vision Has a Setup Problem

I want to talk about why traditional computer vision is slow and expensive and why that matters for every industry that uses cameras. The standard workflow for deploying a computer vision system goes like this. You install cameras. You collect footage. You label thousands of images by hand, drawing bounding boxes around every object you want the system to recognize. You train a custom model on that labeled data. You test it. You discover it does not work well in different lighting conditions or camera angles. You collect more data. You label more images. You retrain. This cycle takes months and costs six figures before the system does anything useful.

This is why computer vision adoption has been so slow outside of a few well-funded verticals. Self-driving cars can afford the labeling and training costs. Warehouse automation can justify it. But what about a port operator who wants to monitor vessel traffic? A fish farm that wants to count fish? An oil rig that needs to detect safety violations? These are real use cases with real budgets, but the setup cost of traditional computer vision prices them out.

The market is large. Grand View Research estimates the global computer vision market at over $20 billion and growing at 19 percent annually. But most of that revenue is concentrated in a handful of verticals where companies can afford the implementation cost. The long tail of computer vision use cases, the thousands of businesses that would benefit from video analysis but cannot justify a six-month ML project, remains largely untapped.

Scale AI, Labelbox, and V7 have tried to reduce the labeling bottleneck. Roboflow makes model training more accessible. Landing AI targets manufacturing inspection. But all of these approaches still assume you need to train a custom model. What if you did not?

The Micro: A National Geographic Explorer and a Cambridge ML Researcher

OnDeck AI is a vision analysis engine that uses vision language models to analyze video footage without any training data or model development. You point it at a camera feed, describe what you want to detect in natural language, and it starts identifying objects, behaviors, and events immediately. No labeling. No training. No waiting.

Alexander Dungate is a founder with a BSc in computer science and biology. He is also a National Geographic Explorer, which is not the typical background for an enterprise AI founder but makes sense when you learn that OnDeck’s early use cases include analyzing footage from autonomous vessels and monitoring wildlife. Sepand Dyanatkar is the CTO, with a master’s degree in machine learning from Cambridge and prior experience at the European Space Agency, at Amazon, and in swarm robotics research. The team is five people, based in Vancouver, and part of Y Combinator’s Summer 2025 batch.

The technical approach is built on vision language models, which are multimodal AI systems that understand both images and text. Instead of training a model to recognize a specific type of hard hat in a specific factory setting, you tell the VLM “identify workers not wearing hard hats” and it generalizes across camera angles, lighting conditions, and environments. The team published a NeurIPS workshop paper showing that VLM methods outperform traditional computer vision on niche tasks where training data is limited.

That last point is important. Traditional CV wins when you have 100,000 labeled images of the exact thing you want to detect. VLMs win when you have zero labeled images and need the system to work on day one. For the long tail of computer vision use cases, you almost never have 100,000 labeled images. You have cameras and a problem.
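To make the "describe it in natural language" workflow concrete, here is a minimal sketch of what a zero-shot detection loop could look like. It is not OnDeck's API, which this article does not document. The names `scan_frames`, `Detection`, and `fake_judge` are hypothetical, and `fake_judge` is a stand-in that lets the example run offline; in a real system that function would wrap a call to a vision language model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical interface: any callable that takes (frame, prompt) and answers yes/no.
# In practice this would wrap a vision language model; here it is an assumption.
VlmJudge = Callable[[bytes, str], bool]


@dataclass
class Detection:
    frame_index: int
    prompt: str


def scan_frames(frames: Iterable[bytes], prompt: str, judge: VlmJudge) -> List[Detection]:
    """Ask the same natural-language question of every frame. No labels, no training."""
    return [Detection(i, prompt) for i, frame in enumerate(frames) if judge(frame, prompt)]


def fake_judge(frame: bytes, prompt: str) -> bool:
    # Stand-in for a real VLM call so the sketch runs offline:
    # it simply flags frames whose placeholder bytes contain b"no_hat".
    return b"no_hat" in frame


# Placeholder "frames"; real input would be decoded video frames.
frames = [b"worker:hat", b"worker:no_hat", b"empty_dock"]
hits = scan_frames(frames, "identify workers not wearing hard hats", fake_judge)
print([d.frame_index for d in hits])
```

The point of the shape is that changing what you detect means changing the prompt string, not collecting and labeling a new dataset.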

OnDeck’s current deployments span autonomous vessels, robotics research, security monitoring, port behavior analysis, and offshore oil and gas operations. These are exactly the verticals where the traditional CV setup cost was prohibitive. A port operator is not going to spend six months training a custom model to detect unauthorized dock access. But they will pay for a system that works the day it is installed.

The product is live and the team is actively deploying with enterprise customers. The website is built but light on marketing content, which tracks with a company focused on direct enterprise sales rather than inbound leads.

The Verdict

OnDeck AI is making a bet that vision language models will replace custom-trained CV models for most real-world video analysis tasks. I think that bet is correct for the long tail of the market. The companies that have already invested millions in custom CV pipelines are not switching anytime soon. But the thousands of companies that have cameras and no CV system because the setup cost was too high? Those are OnDeck’s customers.

The risk is accuracy. VLMs are generalists. Custom-trained models are specialists. For safety-critical applications like detecting gas leaks on an oil rig, the question is whether a generalist model is reliable enough. The NeurIPS paper is encouraging, but academic benchmarks and production reliability are different things. Enterprise customers in oil and gas and maritime do not tolerate false negatives.
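For readers who want the accuracy question in numbers, a rough sketch of the two error rates at stake follows. The counts are made up for illustration and are not OnDeck results; `error_rates` is a hypothetical helper name.

```python
def error_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute false negative and false positive rates from confusion counts."""
    positives = tp + fn  # windows where the event really happened
    negatives = tn + fp  # windows where nothing happened
    return {
        # Share of real events the system missed.
        "false_negative_rate": fn / positives if positives else 0.0,
        # Share of uneventful windows that still raised an alert.
        "false_positive_rate": fp / negatives if negatives else 0.0,
    }


# Illustrative counts only: 50 real events, 950 uneventful windows.
rates = error_rates(tp=45, fp=30, tn=920, fn=5)
print(rates)
```

With these invented counts the system misses 5 of 50 real events, a 10 percent false negative rate, which is the number a safety-critical buyer would look at first.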

In thirty days, I want to see deployment metrics. How many camera feeds is OnDeck processing in production, and what are the false positive and false negative rates? In sixty days, the question is whether customers are expanding from pilot deployments to full facility coverage. A port operator testing OnDeck on one dock is interesting. The same operator rolling it out across twelve docks is a business. In ninety days, I want to understand the pricing model. If OnDeck charges per camera per month, the economics need to undercut the annualized cost of a custom CV deployment by at least 80 percent to drive adoption. The technical approach is sound. The founders have the right backgrounds. The market gap is obvious. Now they need to prove that VLMs are reliable enough for the industries that need this most.
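The 80 percent threshold is easy to turn into a back-of-the-envelope number. The sketch below assumes a custom deployment cost that is amortized evenly over a few years and ignores maintenance and retraining; every figure in it is a placeholder, not OnDeck's pricing or any customer's actual cost, and `max_price_per_camera_month` is a hypothetical helper.

```python
def max_price_per_camera_month(
    custom_cost: float,
    amortize_years: float,
    cameras: int,
    undercut: float = 0.80,
) -> float:
    """Highest per-camera monthly price that still undercuts custom CV by `undercut`."""
    annualized = custom_cost / amortize_years  # yearly cost of the custom system
    budget = annualized * (1 - undercut)       # what is left after an 80% undercut
    return budget / cameras / 12               # spread across cameras and months


# Placeholder inputs: a $300,000 custom build, amortized over 3 years, 20 cameras.
price = max_price_per_camera_month(custom_cost=300_000, amortize_years=3, cameras=20)
print(round(price, 2))
```

Under these assumptions the ceiling works out to roughly 83 dollars per camera per month. A shorter amortization window or fewer cameras raises it, which is why the pricing model matters as much as the headline number.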


← Back to July 24, 2026 edition
