July 17, 2026 edition

novaflow

AI data analyst for biology labs

Novaflow Wants to Replace the Grad Student Who Makes All Your Lab's Charts

AI · Biotech · Data Science · Research Tools

The Macro: Biology Labs Are Drowning in Data They Cannot Visualize

I spent a summer in a computational biology lab during college. The most common sound was not pipettes clicking or centrifuges whirring. It was someone swearing at R. Specifically, someone swearing at ggplot2 because the axis labels were overlapping for the third time that afternoon.

This is the reality of modern biology research. The experiments generate data. The data needs to be analyzed. The analysis needs to be visualized. And the visualization needs to meet the exacting standards of journals like Nature, Cell, and Science, which have specific requirements for figure formatting, resolution, color palettes, and statistical annotations. The science might take weeks. Getting the figures right for the paper takes weeks more.

The tools biologists use for this work are almost comically bad for people who are not programmers. R with ggplot2 is the gold standard, and it is powerful, but it requires real programming skill. Python with matplotlib or seaborn is the alternative, and it requires the same. GraphPad Prism is the point-and-click option, and it is $300 per year with limited flexibility. Excel is still used in labs more than anyone wants to admit, and it produces figures that reviewers reject on sight.

There are roughly 5 million life scientists worldwide who need to visualize experimental data regularly. Most of them were trained in biology, not computer science. The mismatch between the tools available and the skills of the people using them is one of the most persistent productivity drains in academic research.

AI data analysis tools have been proliferating, but almost all of them target business analytics. Julius AI, Hex, and Equals are building for product managers and operations teams. The few tools that touch scientific data, like Deepnote or Posit Cloud, still require you to write code. Nobody has built a clean, no-code path from raw experimental data to publication-ready scientific figures with the kind of domain specificity that biology actually requires.

That specificity matters. A business chart and a scientific figure are fundamentally different objects. Scientific figures need error bars, significance annotations, proper statistical tests, and formatting that conforms to journal guidelines. A tool that makes great bar charts for quarterly revenue reports is useless for a paper submission to PNAS.

The Micro: Upload Your Data, Ask a Question, Get a Figure

Novaflow is an AI data analyst built specifically for biology labs. You upload experimental data, ask a question in plain English, and the product returns publication-ready visualizations. No code. No wrestling with R syntax. No spending an afternoon figuring out why your legend is appearing inside the plot area.

Aman Agarwal is CEO and Amulya Balakrishnan is CTO. They are based in San Francisco, part of Y Combinator’s Summer 2025 batch, and running a two-person team. The small team size is relevant here because it tells you the product is early, but the YC backing and the specificity of the vertical suggest this is more than a side project.

The product flow is direct. Upload a CSV or Excel file of experimental results. Type a question like “Show me the dose-response curve for compound A across all time points with error bars and significance markers.” Get back a formatted figure that looks like it belongs in a journal.
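For a sense of what that one-sentence prompt replaces, here is a hedged sketch of the manual matplotlib version of roughly that figure. The data, compound name, error values, and styling are all invented for illustration; the point is how much boilerplate a biologist currently has to write by hand.

```python
# Illustrative only: hypothetical dose-response data for "compound A".
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

doses = np.array([0.1, 1, 10, 100])              # µM, hypothetical
means = {"24 h": [12, 30, 61, 88],               # mean response per dose
         "48 h": [18, 41, 72, 95]}
sems = {"24 h": [3, 4, 5, 4],                    # SEM per dose, hypothetical
        "48 h": [4, 5, 4, 3]}

fig, ax = plt.subplots(figsize=(3.5, 2.8))       # roughly single-column size
for label, y in means.items():
    ax.errorbar(doses, y, yerr=sems[label],
                marker="o", capsize=3, label=label)
ax.set_xscale("log")                             # doses span three decades
ax.set_xlabel("Compound A (µM)")
ax.set_ylabel("Response (% of control)")
ax.legend(frameon=False)
fig.tight_layout()
fig.savefig("dose_response.png", dpi=300)        # high-resolution export
```

None of this is hard for a programmer, but every line is a place where a non-programmer can lose an afternoon, which is exactly the gap the product targets.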

What makes this different from asking ChatGPT to write you some matplotlib code is the domain layer. Novaflow understands biology-specific data structures, experimental designs, and visualization conventions. It knows that a Western blot quantification should be presented differently than a flow cytometry histogram. It knows that a p-value of 0.03 gets one asterisk and a p-value of 0.001 gets three. These are the kinds of details that take grad students months to internalize and that general-purpose AI tools consistently get wrong.

The “publication-ready” claim is the one I would stress-test hardest. Journal figure requirements are extremely specific and vary by publication. Nature wants 300 DPI TIFFs with specific font sizes. Cell wants figures that fit column widths precisely. If Novaflow can genuinely output figures that pass journal review without manual editing, that is a tremendous value proposition. If the figures are 80% of the way there and still need cleanup in Illustrator, the product is a nice-to-have rather than a must-have.

I want to be specific about the competitive gap. GraphPad Prism is the closest incumbent, and it has been the default tool for biostatistics and figure creation in biology labs for decades. It is entrenched. But it is also aging, expensive, and requires significant learning time. If Novaflow can match Prism’s statistical rigor while eliminating the learning curve, the switching cost for labs is low because most scientists do not love Prism. They tolerate it.

The Verdict

Novaflow is solving a real problem for a large audience that has been poorly served by existing tools. The idea of typing a question and getting back a journal-quality figure is compelling enough that I think it sells itself to any grad student who has lost a weekend to formatting issues.

The risk is precision. Science is not an industry where “pretty close” works. If the statistical annotations are wrong, if the error bars use the wrong metric, if the figure does not meet a specific journal’s submission requirements, the product creates more work than it saves because now you have to check everything the AI did and fix the parts it got wrong. That is worse than just doing it yourself.

In thirty days, I want to see the product handle messy data. Lab data is never clean. It has missing values, outliers, inconsistent column names, and the kind of formatting chaos that happens when three different grad students contribute to the same spreadsheet. If Novaflow can parse messy data gracefully, it works. If it requires perfectly formatted inputs, adoption will stall.
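To make concrete what "parsing messy data gracefully" involves, here is a minimal pandas sketch of the generic cleanup steps such a tool would need: normalizing inconsistent column names, coercing text like "n/a" into proper missing values, and dropping unusable rows. The column names and values are invented, and this is emphatically not Novaflow's actual pipeline, which is not public.

```python
import pandas as pd

# A toy spreadsheet with the usual chaos: stray whitespace, mixed
# naming styles, and "n/a" typed into a numeric column.
raw = pd.DataFrame({
    "Dose (uM)":   ["0.1", "1", "10", "n/a", "100"],
    " Response% ": [12.0, 30.5, None, 61.2, 88.9],
    "Replicate":   [1, 1, 1, 1, 1],
})

df = raw.copy()
# 1. Normalize column names to snake_case so queries can match them.
df.columns = (df.columns.str.strip()
                        .str.lower()
                        .str.replace(r"[^\w]+", "_", regex=True)
                        .str.strip("_"))
# 2. Coerce numeric columns; entries like "n/a" become NaN.
df["dose_um"] = pd.to_numeric(df["dose_um"], errors="coerce")
# 3. Drop rows with no usable dose; missing responses stay as NaN
#    for the downstream stats to handle explicitly.
df = df.dropna(subset=["dose_um"])
print(df.columns.tolist())  # ['dose_um', 'response', 'replicate']
```

These three steps are the easy 80%; the hard part is doing them without silently guessing wrong about what the scientist meant, which is where trust in the tool is won or lost.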

In sixty days, the question is whether labs adopt it as a team tool or an individual tool. The real stickiness comes when an entire lab standardizes on Novaflow for their figure pipeline. Individual adoption means individual churn. Lab-wide adoption means institutional retention.

In ninety days, I want to see how many papers cite figures generated by Novaflow. That is the ultimate validation metric. A figure in a published paper is proof that the product met the standard. Everything else is marketing.