EVOLVED 2024 - Code, Compile, Cure: Where Software Engineering Meets Therapeutic Design.

This article was co-authored by NVIDIA PMM Lead Vega Shah. Special thanks to Lux Capital Summer Associate Matthew Nemeth, Kim Tran of Princeton University, and Christoph Krettler, Senior Research Data Scientist at Enveda Biosciences, for organizing the hackathon.
Last fall we brought together over 200 machine learning researchers, software engineers, and computational biologists to push the frontier of method development in computational biology. This year’s Evolved hackathon doubled in size from last year and saw a few new companies formed off the back of it.
Congrats to all the teams that took part, with a special shout-out to the following teams:
1st Place: EvoCapsid
- Use cases: Addresses a major challenge in gene therapy—avoiding immunogenic toxicity. By leveraging ESM3, EvoCapsid aims to evolve safer RNA editing proteins and gene delivery mechanisms, paving the way for new therapeutic avenues.
- What: Leveraged ESM3’s lack of viral coding sequences to redesign gene delivery proteins with lower immunogenicity and evolve RNA editing proteins.
- Awards: ESM’s Creativity Prize, Adaptyv’s Translational Prize.
2nd Place: MSEffect
- Use cases: Accurately identifying chemicals from tandem mass spectrometry (MS/MS) data using models like MSEffect’s will help illuminate the “dark chemical space,” allowing us to profile chemicals in the quest to make new drugs (Lux portco Enveda Bio), custom design fragrances, detect fake fabrics (Lux portco Osmo AI), and more.
- What: Combined probabilistic models with pre-trained MS/MS transformer models to predict the spectra of unknown compounds.
- Award: Enveda’s Seedling Prize.
3rd Place: GenPlasmid
- Use cases: Expressive, functional design of promoters and genomic parts would transform molecular biology research and finally enable the design of electronics-inspired cell circuits that the field has dreamed of. Read their own vision for their work here.
- What: Compiled OpenPlasmid and fine-tuned gLM2 to show that the fine-tuned model produced more effective promoter sequences in a generative fashion.
- Award: Polaris’ New Artifact Prize.
Special Mention: Brainstorm Therapeutics
- Use cases: Brainstorm Therapeutics leverages brain organoids (spheres of neurons and glia derived from human induced pluripotent stem cells) to more accurately model the human brain. This technology enables high-throughput drug screening, helping to accelerate and de-risk the discovery of new treatments for complex neurological disorders.
- What: Brainstorm Therapeutics presents a midbrain organoid platform developed from Parkinson’s patients. This platform is integrated with a Foundation Model-based network analysis protocol, built using the open-source NVIDIA BioNeMo framework, to support progress in neurodegenerative research. Learn more about their pipeline here.
- Award: NVIDIA Prize
What motivates us to run this every year? Since AlphaFold2 achieved experimental accuracy in protein structure prediction in 2020, biology has experienced a surge of new computational tools that significantly accelerate the pace and efficiency of research. In 2024 we are witnessing these models advance beyond structure prediction toward straight up functional design.
Comparing the past two years shows just how stark the progress in computational biology has been. In 2022, researchers tested ~15,000 computationally designed binders to obtain 10 hits (3); in 2024, a group computationally designed binders with an average hit rate of 46%. That’s the equivalent of testing about 22 binders to get 10 hits, an improvement by a factor of roughly 10^3 (4) in two years. What we’re seeing is not a single, isolated leap, but rather a series of exponential advancements in our capabilities.
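For readers who want to check the arithmetic behind that comparison, here is a minimal sketch using only the figures quoted above (~15,000 designs for 10 hits in 2022, a 46% average hit rate in 2024). The variable names are ours, and the 2022 count is approximate, so treat the result as an order-of-magnitude estimate rather than a precise benchmark.

```python
# Figures quoted in the text; the 2022 design count is approximate ("~15,000").
designs_2022 = 15_000    # binders tested in 2022
hits_2022 = 10           # hits obtained from those designs
hit_rate_2024 = 0.46     # average hit rate reported in 2024

# Hit rate implied by the 2022 numbers.
hit_rate_2022 = hits_2022 / designs_2022

# Designs needed at the 2024 hit rate to reach the same 10 hits.
designs_needed_2024 = hits_2022 / hit_rate_2024

# Ratio of the two hit rates (equivalently, the ratio of designs needed).
fold_improvement = hit_rate_2024 / hit_rate_2022

print(f"2022 hit rate: {hit_rate_2022:.4%}")
print(f"Designs needed in 2024 for 10 hits: {designs_needed_2024:.1f}")
print(f"Fold improvement in hit rate: {fold_improvement:.0f}x")
```

On these inputs the ratio comes out to about 690x, which is the same order of magnitude as the 10^3 figure quoted.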
While today’s drug discovery platforms leverage computational methods for new target discovery, the sobering reality in biotech is that advancing an asset to an Investigational New Drug (IND) application requires very little of this. With the pendulum swing of today’s biotech market largely disbanding the notion of a ‘platform,’ an even more pessimistic stance is that current computational methods alone offer minimal value. Although these methods have the potential to dramatically expand the possible conformational space, this expansion today realistically does not yet help drug developers select the right biological target and overcome the complex realities of drug development.
While these limitations are real today, what we’ve seen from machine learning advances in other areas (such as natural language processing with chat-based AI agents, or robotics with Physical Intelligence) is that if we don’t incentivize R&D beyond immediate, near-term bottlenecks in the biotech drug development process (e.g., target selection for IND, clinical trial stratification, single-asset arbitrage), we risk forfeiting the upside of long-term breakthroughs in foundational method development. Once these computational methods become powerful enough, they have the potential to provide significantly greater value, something that became obvious when ChatGPT shocked over 300 million people with the powerful capabilities of language models.
There are three key directional arrows that give us confidence that this is taking place in biology today:
1. Rapid rise of AI architectures into the biological realm | Method development in biology has quickly become “too fast for scientists to keep track of which model to use”. In protein structure prediction and design alone, we’ve seen AlphaFold2 in 2020; RoseTTAFold in 2021; ESM-2 in 2022; Genie, Chroma, and RFDiffusion in 2023; and AlphaFold3, Chai-1, NeuralPLexer3 Beta, and ESM3 in 2024 (Fig 1). In single-cell analysis, our throughput is expanding too, with an uptick in the number of cells measured per physical assay over the years (Fig 2). This year, marked by the entrance of AI agents in realms like coding, finance workflow automation, spreadsheets, and more, we expect to see the early innings of entirely novel drug discovery and development workflows, built from scratch with automation and integrations in mind.


From Sakhet Choudhary (image sourcing credits to Gabe Dolsten)
2. Influx of software engineering talent | As model predictions approach experimental standards and reduce design costs, and as other fields like self-driving vehicles have already seen clear success from ML, there’s an influx of trained talent entering biology, realising that the tools they are building are beginning to work.
3. New biological applications | Increasingly powerful models open up vast new areas of biological applications. Biologists and computer scientists are actively building products and companies around new capabilities in combinatorial spaces that historically may have been too intractable for humans to navigate. For example, Lux portfolio company Dyno (similar to EvoCapsid) is leveraging advancements in biologics and sequence design to computationally guide the design of new gene therapy vectors.
We envision a world where developing life-saving treatments is as streamlined, rapid, and cost-effective as modern software development. Yet today there are very few places that incentivize the actual development of new computational methods for biology, and the engineers behind them. While these methods still require years before we can run thousands of complex biological experiments in silico, this future is not out of reach.
Through Evolved, we set out to equip engineers with the resources (datasets, base models, and compute infrastructure) unavailable to many outside large biotech platforms, and to play an early role in guiding engineers towards this future.
- Models, data: Early access to EvolutionaryScale’s ESM3 (98B-parameter protein model) and Enveda Biosciences’ proprietary MS/MS datasets + challenges to improve MS/MS predictions.
- Infrastructure: ~$300K in computing resources, prizes, and APIs from Modal, TogetherAI, LatchBio, AWS, NVIDIA, and others.
If this resonates and you want to learn more, or take part in this year’s hackathon with a team, join our Discord (with over 1,200 engineers).