A new AI model called Evo-2 has been released, trained on 128,000 genomes ranging from single-celled bacteria to humans. The model can write whole chromosomes and small genomes from scratch, interpret existing DNA, including complex non-coding gene variants linked to diseases, and analyze sequences up to 1 million base pairs long.
Evo-2 was co-developed by researchers at the Arc Institute and Stanford University, in collaboration with chip maker NVIDIA. Scientists can use the model through web interfaces or download its freely available software code, data, and parameters.
The developers aim to create a platform that others can adapt to their own uses, which they call an “app store” for biology. According to Patrick Hsu, a bioengineer at the Arc Institute, the model has the potential to revolutionize the field of genetics.
Other scientists are impressed with Evo-2’s capabilities, although some have expressed concerns about its performance in independent benchmarks. Experts such as Anshul Kundaje, a computational genomicist at Stanford University, praise the model’s engineering and predictive power, particularly in analyzing non-coding mutations linked to diseases such as breast cancer.
The model was tested on complex genomes, including that of the woolly mammoth, and demonstrated its ability to decipher DNA regulatory grammar. Christina Theodoris, a computational biologist at the Gladstone Institutes, calls Evo-2 “a significant step” in understanding DNA complexity.
The scale of Evo-2’s computing resources and training data makes it the largest biological AI model yet released; its genome dataset encompasses 9.3 trillion DNA letters.
Source: https://www.nature.com/articles/d41586-025-00531-3