Democratizing AI for Biology
An open-source project that can be used and improved by academics and companies alike.
Use the existing pre-trained weights to quickly get started fine-tuning your model.
A permissively licensed model that allows commercial and non-commercial use.
Provides the tools used to train the model under the same license.
Optimized performance for use on state-of-the-art and widely available GPUs.
A supercomputer-scale, distributed, PyTorch-based training framework.
OpenFold is a non-profit AI research and development consortium developing free and open-source software tools for biology and drug discovery. Our mission is to bring the most powerful software ever created -- AI systems with the ability to engineer the molecules of life -- to everyone. These tools can be used by academics, biotech and pharmaceutical companies, or students learning to create the medicines of tomorrow, to accelerate basic biological research, and bring new cures to market that would be impossible to discover without AI.
In biology, structure and function are inextricably linked. Understanding how biological systems work, and how to engineer or affect them, therefore requires knowing their structure. The consortium is creating state-of-the-art AI-based protein modeling tools that can predict molecular structures with atomic accuracy, making this level of precision accessible in open source for both research and commercial applications for the first time. Researchers around the world will be able to use, improve, and contribute to this "predictive molecular microscope."
This work aims to:
DeepMind's release of AlphaFold3 as a proprietary model has sent the scientific community racing to produce open-source alternatives. The OpenFold team has already begun coding an open-source version of AlphaFold3 that it hopes to complete this year. Building open-source versions of the software that underpins drug discovery will help keep it accessible to all scientists and promote human health.
OpenFold has been developed to enhance protein structure prediction using AI and supercomputing power. OpenFold offers faster, memory-efficient predictions compared to alternative software, potentially transforming drug discovery and our understanding of disease. This advancement is open-source and trainable, allowing researchers to customize models with their own data.
The OpenFold AI Research Consortium, a non-profit focused on developing open-source AI tools for drug discovery, has expanded by adding six new members: Astex, Biogen, Congruence, Polaris Quantum, Psivant, and SandboxAQ. These companies bring expertise in fields like quantum computing, small molecule therapeutics, and AI-driven drug design, enhancing OpenFold's mission to advance healthcare innovation. The consortium aims to accelerate the development of transformative medicines through collaborative efforts and cutting-edge technologies.
Pharma companies are investing in consortiums such as the Seattle-based OpenFold to promote development of open-source AI models for biology and drug discovery. Proprietary data may be a differentiator for companies tapping into an increasing number of open-source AI models, according to panelists at the 2024 Life Science Innovation Northwest meeting.
OpenFold, a non-profit artificial intelligence (AI) research consortium, today announced the release of two new tools: 1) SoloSeq, which integrates a new protein Large Language Model (LLM) with its OpenFold structure prediction software, and 2) OpenFold-Multimer software, which creates higher quality models of protein/protein complexes than OpenFold alone.
The OpenFold group, a non-profit artificial intelligence (AI) research consortium of biotech and tech firms whose goal is to develop free and open-source software tools for biology and drug discovery, is announcing the funding of new large-scale protein studies at Prof. Gabriel Rocklin’s laboratory at Northwestern University. Read the full story in Business Wire.
OpenFold, a non-profit artificial intelligence (AI) research consortium whose goal is to develop free and open-source software tools for biology and drug discovery, today announced the addition of three new industry members: UCB, NVIDIA, and Valence Labs (powered by Recursion). Read the full story in Business Wire.
OpenFold, a non-profit artificial intelligence (AI) research consortium whose goal is to develop free and open source software tools for biology and drug discovery, today announced the addition of four new industry members: Bayer, Dassault Systèmes, CHARM Therapeutics, and BaseCamp Research Ltd. Read the full story in Business Wire.
A set of leading academic and industry partners are announcing the formation of OpenFold, a non-profit artificial intelligence (AI) research consortium of organizations whose goal is to develop free and open source software tools for biology and drug discovery. OpenFold is a project of the Open Molecular Software Foundation (OMSF), a non-profit organization advancing molecular sciences by building communities for open source research software development. Read the full story in Business Wire.
Gustaf Ahdritz is a PhD student in the AlQuraishi lab at Columbia University and one of the lead developers of OpenFold. In this TechCrunch piece, he shares his thoughts on the value of open source and community in science, on AI models, and on why OpenFold came about when AlphaFold2 is already out there.
Our model, inspired by AlphaFold2, is available here with training documentation. Our inference code is available under a permissive license.
Our model, inspired by AlphaFold-Multimer, is available here. Development of next-generation model weights is currently underway.
Our OpenFold-SoloSeq inference code is available here. OpenFold-SoloSeq predicts protein structures equivalent to the OpenFold model with no MSA input required.
OpenFold-SmallMolecule is currently under development by our employees and contributors.
OpenStability will provide enhanced mega-scale protein stability data. This project is underway in Prof. Gabriel Rocklin's laboratory and includes an extension to larger protein domains.
All entities are welcome to join the consortium -- corporate, non-profit, and academic. Members contribute financially or technically and play a key role in choosing new directions and high-priority projects. Please note that joining the consortium as a member is primarily directed at organizations and requires committing resources through a participation agreement. If you are interested in becoming a member, please reach out to info-at-openfold.io and we will follow up with more details.
The Consortium also benefits from support provided by different non-member organizations and individuals with aligned missions -- if you have a research idea or seek other collaboration opportunities, please reach out to info-at-openfold.io.
OpenFold is an open source project and anyone can help contribute directly through a pull request or issue submission.
Our goal is to develop an open ecosystem of accelerated AI for Biology tools in order to catalyze innovation, starting with state-of-the-art and permissively licensed protein structure prediction training and inference pipelines and models.
The founding members of the consortium span academia, technology corporations, startups, and pharmaceutical companies. The launching group includes the AlQuraishi lab at Columbia University, Arzeda, Cyrus Biotechnology, and Outpace Bio, with many more to come, so stay tuned and get in touch! The OpenFold Consortium is hosted by the Open Molecular Software Foundation.
All are welcome! Member organizations are expected to commit resources to our shared goals and mission. Many types of contributions are welcome, including but not limited to in-kind, FTE, or monetary support. If this is you and your organization, please get in touch by reaching out to info-at-openfold.io.
Our OpenFold suite of protein structure prediction tools is available today on our GitHub! OpenFold is our core platform designed for high-accuracy protein folding predictions. Also available is OpenFold-SoloSeq, which extends OpenFold capabilities by eliminating the need to pre-compute Multiple Sequence Alignments. Additionally, OpenFold-Multimer is available and enables modeling of protein-protein interactions and multimeric complexes. At the OpenFold Consortium, development of OpenFold-Ligand is underway to support the prediction of protein-ligand interactions. Explore our comprehensive training and inference code to leverage these open source tools for your scientific interests. OpenFold training data is available in the Registry of Open Data on AWS (RODA) thanks to their Open Data Sponsorship Program.
Yes! Our developers provide extensive documentation including guides on setting up and training each of the tools in our OpenFold suite. Documentation can be found here.