OpenFold

Democratizing AI for Biology

Why OpenFold?

Open Source

An open-source project that can be used and improved by academics and companies alike.

Weights Available

Use the existing pre-trained weights to quickly get started fine-tuning your model.

Permissive License

A permissively licensed model that allows commercial and non-commercial use.

Training Pipeline

Provides the tools used to train the model under the same license.

Optimized for Performance

Optimized performance for use on state-of-the-art and widely available GPUs.

PyTorch-Based

A supercomputer-scale, PyTorch-based framework for distributed training.

What is OpenFold?

Mission Statement

OpenFold is a non-profit AI research and development consortium developing free and open-source software tools for biology and drug discovery. Our mission is to bring the most powerful software ever created -- AI systems with the ability to engineer the molecules of life -- to everyone. These tools can be used by academics, biotech and pharmaceutical companies, or students learning to create the medicines of tomorrow, to accelerate basic biological research, and bring new cures to market that would be impossible to discover without AI.

Structure Prediction

In biology, structure and function are inextricably linked. Understanding how biological systems work, how to engineer them, and how to affect them therefore requires knowing their structure. The consortium is creating state-of-the-art AI-based protein modeling tools that can predict molecular structures with atomic accuracy, making this level of precision accessible in open source for both research and commercial applications for the first time. Researchers around the world will be able to use, improve, and contribute to this "predictive molecular microscope."

Goals

This work aims to:

  • Develop a permissively licensed model competitive with the performance of state-of-the-art models.
  • Provide the entire training and inference stack and training datasets under the same permissive license.
  • Optimize the performance of this model for use on state-of-the-art and widely available GPUs.

News

Who will make AlphaFold3 open source? Scientists race to crack AI model

DeepMind's release of AlphaFold3 as a proprietary model has sent the scientific community racing to produce open-source alternatives. The OpenFold team has already begun coding an open-source version of AlphaFold3 that it hopes to complete this year. Building open-source infrastructure for software that is fundamental to drug discovery will help ensure accessibility for all scientists and promote human health.

AI, Computation, and the Folds of Life

OpenFold has been developed to enhance protein structure prediction using AI and supercomputing power. OpenFold offers faster, more memory-efficient predictions than alternative software, potentially transforming drug discovery and our understanding of disease. The software is open-source and trainable, allowing researchers to customize models with their own data.

OpenFold AI Research Consortium Welcomes Six New Members: Astex, Biogen, Congruence, Polaris Quantum, Psivant, and SandboxAQ

The OpenFold AI Research Consortium, a non-profit focused on developing open-source AI tools for drug discovery, has expanded by adding six new members: Astex, Biogen, Congruence, Polaris Quantum, Psivant, and SandboxAQ. These companies bring expertise in fields like quantum computing, small molecule therapeutics, and AI-driven drug design, enhancing OpenFold's mission to advance healthcare innovation. The consortium aims to accelerate the development of transformative medicines through collaborative efforts and cutting-edge technologies.

AI is boosting drug discovery and development — and sparking questions about proprietary data

Pharma companies are investing in consortiums such as the Seattle-based OpenFold to promote development of open-source AI models for biology and drug discovery. Proprietary data may be a differentiator for companies tapping into an increasing number of open-source AI models, according to panelists at the 2024 Life Science Innovation Northwest meeting.

OpenFold Biotech AI Research Consortium releases SoloSeq and Multimer, an integrated protein Large Language Model with 3D structure generation

OpenFold, a non-profit artificial intelligence (AI) research consortium, today announced the release of two new tools: 1) SoloSeq, which integrates a new protein Large Language Model (LLM) with its OpenFold structure prediction software, and 2) OpenFold-Multimer software, which creates higher quality models of protein/protein complexes than OpenFold alone.

OpenFold AI Research Consortium Announces Funding of Protein Data Collection at Prof. Gabriel Rocklin’s Laboratory at Northwestern University

The OpenFold group, a non-profit artificial intelligence (AI) research consortium of biotech and tech firms whose goal is to develop free and open-source software tools for biology and drug discovery, is announcing the funding of new large-scale protein studies at Prof. Gabriel Rocklin’s laboratory at Northwestern University. Read the full story in Business Wire.

OpenFold AI Research Consortium Welcomes Three New Members: UCB, NVIDIA and Valence Labs

OpenFold, a non-profit artificial intelligence (AI) research consortium whose goal is to develop free and open-source software tools for biology and drug discovery, today announced the addition of three new industry members: UCB, NVIDIA, and Valence Labs (powered by Recursion). Read the full story in Business Wire.

OpenFold Welcomes Bayer, Dassault, CHARM Therapeutics and BaseCamp Research

OpenFold, a non-profit artificial intelligence (AI) research consortium whose goal is to develop free and open source software tools for biology and drug discovery, today announced the addition of four new industry members: Bayer, Dassault Systèmes, CHARM Therapeutics, and BaseCamp Research Ltd. Read the full story in Business Wire.

Announcing OpenFold

A set of leading academic and industry partners are announcing the formation of OpenFold, a non-profit artificial intelligence (AI) research consortium of organizations whose goal is to develop free and open source software tools for biology and drug discovery. OpenFold is a project of the Open Molecular Software Foundation (OMSF), a non-profit organization advancing molecular sciences by building communities for open source research software development. Read the full story in Business Wire.

Interview with Gustaf Ahdritz in TechCrunch

Gustaf Ahdritz is a PhD student in the AlQuraishi lab at Columbia University and one of the lead developers of OpenFold. In this TechCrunch piece, he shares his thoughts on the value of open source and community in science, on AI models, and on why OpenFold came about when AlphaFold2 was already available.

Our Team

Academic Partners

Professor Mohammed AlQuraishi, PhD
Columbia University
Professor Gabriel Rocklin, PhD
Northwestern University

Employees and Contributors

Jennifer Wei, PhD
Senior Deep Learning Software Engineer
Mallory Tollefson, PhD
Business Development Manager
Zachary Baker
Community Manager
Nazim Bouatta, PhD
Senior Research Fellow
Lukas Jarosch
Graduate Student
Christina Floristean
Graduate Student
Gergo Nikolenyi
Graduate Student
Seohyun Kim, PhD
Postdoctoral Research Scholar
Itamar Shamir, PhD
Postdoctoral Research Scholar
Yeqing Lin
Graduate Student
Colin Kalicki
Graduate Student
Vinay Swamy
Graduate Student
Ted Litberg, PhD
Postdoctoral Research Scholar
Guojie Zhong
Graduate Student
Andrea Roncoli
Graduate Student

Executive Committee

Alexandre Zanghellini, PhD
Co-Founder and CEO at Arzeda Corp
Christina Taylor, PhD
Senior Science Fellow and Computational Molecular Design Lead at Bayer Crop Science
Lucas Nivon, PhD
CEO and Co-Founder at Cyrus Biotechnology
Brian Weitzner, PhD
Director of Computational and Structural Biology at Outpace Bio
Woody Sherman, PhD
Chief Innovation Officer at Psivant Therapeutics
Peter Clark, PhD
Vice President, Computational Drug Design at Novo Nordisk

Supported Projects

OpenFold

Our model, inspired by AlphaFold2, is available on our GitHub with training documentation. Our inference code is available under a permissive license.

OpenFold-Multimer

Our model, inspired by AlphaFold-Multimer, is available on our GitHub. Development of next-generation model weights is currently underway.

OpenFold-SoloSeq

Our OpenFold-SoloSeq inference code is available on our GitHub. OpenFold-SoloSeq predicts protein structures equivalent to those from the OpenFold model, with no MSA input required.

OpenFold-SmallMolecule

OpenFold-SmallMolecule is currently under development by our employees and contributors.

OpenStability

OpenStability will provide enhanced mega-scale protein stability data. This project is underway in Prof. Gabriel Rocklin's laboratory and includes an extension to larger protein domains.

How to Contribute

Join the Consortium

All entities are welcome to join the consortium -- corporate, non-profit, and academic. Members contribute financially or technically and play a key role in choosing new directions and high-priority projects. Please note that joining the consortium as a member is primarily directed at organizations and requires committing resources through a participation agreement. If you are interested in becoming a member, please reach out to info-at-openfold.io and we will follow up with more details.

Collaborate

The Consortium also benefits from support provided by different non-member organizations and individuals with aligned missions -- if you have a research idea or seek other collaboration opportunities, please reach out to info-at-openfold.io.

Contribute through Code

OpenFold is an open-source project, and anyone can contribute directly through a pull request or issue submission.

Visit OpenFold GitHub

FAQ

What is the mission of the OpenFold Consortium?

Our goal is to develop an open ecosystem of accelerated AI for Biology tools in order to catalyze innovation, starting with state-of-the-art and permissively licensed protein structure prediction training and inference pipelines and models.

Who are the members?

The founding members of the consortium span academia, technology corporations, startups, and pharmaceutical companies. The launching group includes the AlQuraishi lab at Columbia University, Arzeda, Cyrus Biotechnology, and Outpace Bio, with many more to come, so stay tuned and get in touch! The OpenFold Consortium is hosted by the Open Molecular Software Foundation.

Who can join and how can we join the mission?

All are welcome! Member organizations are expected to commit resources to our shared goals and mission. Many types of contributions are welcome, including but not limited to in-kind, FTE, or monetary support. If this is you and your organization, please get in touch by reaching out to info-at-openfold.io.

Where are code and models available?

Our OpenFold suite of protein structure prediction tools is available today on our GitHub! OpenFold is our core platform designed for high-accuracy protein folding predictions. Also available is OpenFold-SoloSeq, which extends OpenFold capabilities by eliminating the need to pre-compute Multiple Sequence Alignments. Additionally, OpenFold-Multimer is available and enables modeling of protein-protein interactions and multimeric complexes. At the OpenFold Consortium, development of OpenFold-Ligand is underway to support the prediction of protein-ligand interactions. Explore our comprehensive training and inference code to leverage these open source tools for your scientific interests. OpenFold training data is available in the Registry of Open Data on AWS (RODA) thanks to their Open Data Sponsorship Program.

Is documentation available for the OpenFold software suite?

Yes! Our developers provide extensive documentation including guides on setting up and training each of the tools in our OpenFold suite. Documentation can be found here.
