Today we’re introducing Tahoe-x1 (Tx1), a 3-billion-parameter single-cell foundation model that learns unified representations of genes, cells, and drugs, open-sourced on Hugging Face.

The same 15-person team (10 until a couple of weeks ago) that built the Tahoe-100M dataset to address the data challenge in scaling AI models in cell biology has now built Tx1, the largest model trained on perturbation-rich data and the first compute-efficient model at this scale. And true to the original spirit, we are releasing it open source with open weights (see comments).

Built on our Tahoe-100M dataset, Tx1 is over 10× more efficient to train than most other cell-state models. We’re releasing Tx1 together with new benchmarks we designed to assess performance on cancer-relevant and drug-discovery tasks, where Tx1 achieves state-of-the-art results.

Tx1 makes it possible, for the first time, to systematically search for better architectures at the billion-parameter scale and to explore whether the scaling laws that transformed language and protein modeling can now do the same for cell biology.