The Macro: Drug Discovery Has an Expensive Guessing Problem
I want to put a number in your head. By one widely cited estimate, it costs roughly $2.6 billion to bring a single drug to market. That number has been climbing for decades, and a huge chunk of it goes toward experiments that fail. Most drug candidates do not make it through clinical trials. Most mRNA sequences designed for therapeutic use do not behave the way researchers expect them to. The gap between computational prediction and wet-lab reality is where billions of dollars go to die.
RNA has become one of the most important molecules in modern drug development. The Pfizer-BioNTech and Moderna COVID vaccines were mRNA-based. RNA therapeutics are being developed for cancer, rare diseases, and genetic disorders. But designing effective mRNA sequences is still largely empirical. You synthesize candidates, test them in the lab, measure expression levels and stability, throw out the ones that do not work, and iterate. This process is slow, expensive, and generates enormous amounts of waste.
The promise of foundation models for biology is that you can simulate some of these experiments computationally before committing to the bench. Protein structure prediction had its AlphaFold moment. Genomics has seen a wave of large language models trained on DNA sequences. RNA has been comparatively underserved, partly because the data is messier and partly because RNA behavior depends on more than sequence alone. Splice variants, secondary structures, and cellular context all shape what a transcript does, which makes pure sequence models insufficient.
Companies like Insilico Medicine, Recursion Pharmaceuticals, and Isomorphic Labs (a spinoff from DeepMind, so I will not dwell on it) are all working on AI for drug discovery. But most of them are working at the protein or small-molecule level. The RNA-specific foundation model space is genuinely sparse, which makes what Blank Bio is doing notable.
The Micro: Orthrus, mRNABench, and the Open-Source Bet
Blank Bio was founded by Jonny Hsu, Philip Fradkin, and Ian Shi. Their academic roots show. Hsu and Fradkin come from the University of Toronto and the Vector Institute, two institutions that have produced an outsized share of the machine learning talent behind modern AI. Shi brings a Memorial Sloan Kettering connection. The research pedigree is real.
Their core technology is built on two published papers. Orthrus is their RNA foundation model, designed to process transcript-level biology rather than just gene-level counts. The distinction matters more than it might sound. Standard RNA analysis treats each gene as a single data point and discards the splice variants, mutations, and expression patterns that determine how a gene actually behaves in a specific cell type or disease state. Orthrus captures those signals. The second paper, mRNABench, is a curated benchmark for mRNA property prediction, which is the kind of unsexy infrastructure work that makes the field better for everyone.
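To make the gene-level versus transcript-level distinction concrete, here is a toy sketch in Python. It is not Orthrus and uses none of Blank Bio's code. The gene name, transcript IDs, sample labels, and abundance numbers are all invented for illustration. It only shows what summing to one number per gene throws away.

```python
from collections import defaultdict

# Hypothetical toy data: (gene, transcript, sample) -> abundance, arbitrary units.
# Every name and number here is made up for illustration.
TRANSCRIPT_COUNTS = {
    ("GENE_A", "GENE_A-201", "healthy"): 90.0,
    ("GENE_A", "GENE_A-202", "healthy"): 10.0,
    ("GENE_A", "GENE_A-201", "disease"): 20.0,
    ("GENE_A", "GENE_A-202", "disease"): 80.0,
}


def gene_level(counts):
    """Collapse all transcripts into one total per (gene, sample)."""
    totals = defaultdict(float)
    for (gene, _transcript, sample), value in counts.items():
        totals[(gene, sample)] += value
    return dict(totals)


def isoform_fractions(counts):
    """Share of each gene's total contributed by each transcript, per sample."""
    totals = gene_level(counts)
    return {
        (gene, transcript, sample): value / totals[(gene, sample)]
        for (gene, transcript, sample), value in counts.items()
    }


genes = gene_level(TRANSCRIPT_COUNTS)
fractions = isoform_fractions(TRANSCRIPT_COUNTS)

# Gene-level view: the two samples look identical.
same_gene_total = genes[("GENE_A", "healthy")] == genes[("GENE_A", "disease")]

# Transcript-level view: the dominant isoform has switched.
switch = (
    fractions[("GENE_A", "GENE_A-202", "disease")]
    - fractions[("GENE_A", "GENE_A-202", "healthy")]
)

print("gene totals equal across samples:", same_gene_total)
print("change in GENE_A-202 share:", round(switch, 2))
```

In this toy case the gene-level totals match across the two samples, so a gene-count analysis would see no difference, while the transcript-level fractions show that the dominant isoform has flipped. A model working at the transcript level is, at least in principle, positioned to use that signal.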
The open-source strategy is the decision I find most interesting. Blank Bio has made their models available publicly, and Sanofi and GSK are already using them. In biotech, where intellectual property is guarded ferociously and most computational tools are locked behind enterprise contracts, open-sourcing your core model is a bold move. It builds trust, accelerates adoption, and creates a community of researchers who improve the model through use and feedback. The trade-off is that competitors can use it too. Blank Bio is betting that the model itself is not the moat. The moat is the data flywheel, the proprietary improvements, and the platform built on top.
Their tagline on the website says “RNA intelligence for precision medicine,” and the applications they list span patient stratification, enhanced diagnostics, target discovery, therapeutic design, and biosecurity monitoring. That is a wide aperture for a startup. I suspect the near-term revenue comes from pharma partnerships where Blank Bio helps design and optimize mRNA sequences computationally before they go to the lab. The claim is that their models get “90% of the way there” toward clinical applications, which is one of those statements that is either incredibly exciting or dangerously misleading depending on what the last 10% looks like.
The biosecurity monitoring mention is worth flagging. RNA-based pathogen detection and surveillance is a growing field, and foundation models that can rapidly characterize novel RNA sequences have obvious applications in pandemic preparedness. This is not their primary business today, but it is the kind of capability that attracts government funding and defense interest.
The Verdict
Blank Bio is doing something genuinely important. RNA foundation models are underdeveloped relative to their potential impact on drug discovery and precision medicine. The team has the research credentials and the published work to back up the pitch. The open-source strategy is smart for building adoption, and having Sanofi and GSK as early users is meaningful validation.
At 30 days, I want to see how the pharma engagement model works. Are they licensing the model, offering consulting services around it, or building a SaaS platform? The business model determines whether this is a venture-scale company or a research lab that sells services.
At 60 days, the question is whether the computational predictions actually reduce wet-lab costs for their partners. If a pharma company can cut their mRNA candidate screening time by 50% using Blank Bio’s models, that is a product. If the predictions are directionally useful but still require the same number of bench experiments, that is a nice-to-have.
At 90 days, I want to know if they are building proprietary datasets on top of partner collaborations. The model is open. The data that flows through it in commercial use should not be. That data is the real asset.
This is the kind of company where the impact, if it works, is measured in lives saved and years of drug development shaved off. I am rooting for it.