Summary
The earliest stages of drug discovery entail finding a “chemical probe” – a small molecule that can bind to a protein and change how that protein behaves. Chemical probes help scientists explore the roles that proteins play in the body in health and disease, and can eventually serve as starting points for developing new medicines.
Today, finding chemical probes usually involves testing large chemical libraries in the laboratory to identify “hits” – promising compounds that could potentially lead to a chemical probe. However, this approach is slow and costly, and has therefore addressed only a small number of the proteins that make up the human proteome. As a result, most proteins remain unexplored, limiting our ability to understand their functions and develop new therapies.
Enter LIGAND-AI, which aims to draw on artificial intelligence (AI) and machine learning (ML) to speed up the search for promising hits. Instead of relying solely on traditional experimental methods, the project will facilitate the development of computer models that can make accurate predictions, reducing the time and cost needed to explore proteins that currently have no chemical probes.
Creating AI/ML tools requires high-quality, accessible data, and lots of it. The project’s first task will therefore be to build an open, standardised dataset of protein-ligand interactions. To do this, the team will screen over 2,000 proteins against billions of molecules – an effort that will generate over 2 trillion datapoints, making it by far the biggest dataset of its kind. The proteins used for this dataset will represent a broad cross-section of the human proteome, and will be chosen with input from researchers, industry partners, patient groups and data specialists.
Using this extensive dataset, LIGAND-AI will develop and refine computational models that can predict new protein-ligand interactions. These models will be continually improved by comparing predictions to high-quality experimental results and by engaging the wider AI community to source and test the AI/ML models.
For its part, the project plans to use its models to identify new ligands for over 500 proteins that are implicated in a range of biological functions and disease areas, such as rare diseases, women’s health, neurological disorders, and cancer. These newly identified ligands will expand the number of proteins that scientists can study and may provide valuable starting points for future drug development.
Once the project outputs have undergone a thorough quality check, all project results – including datasets, protocols and computational tools – will be made available to the wider research community via existing repositories, platforms and providers.
By committing to open science, LIGAND-AI hopes to both speed up and democratise small molecule drug discovery. This will, in turn, advance health research and boost Europe’s competitiveness. The dataset and tools will empower universities, SMEs, large companies, and patient organisations to study diseases, run drug discovery programmes, and potentially launch spin-out companies. The project will also create a space where SMEs can test and validate their technologies within a trusted consortium of client companies from the pharmaceutical and medical device sectors.
Crucially, the project outputs will advance the exploration of the human proteome and support more efficient, data-driven approaches to discovering new small-molecule drugs and related research tools.
Participants
EFPIA including Vaccines Europe
- AstraZeneca AB, Sodertaelje, Sweden
- AstraZeneca UK Limited, Cambridge, United Kingdom
- Chemspace Limited Liability Company, Kyiv, Ukraine
- HitGen Inc, Chengdu, China (People's Republic of)
- Novo Nordisk A/S, Bagsvaerd, Denmark
- Nuvisan ICB GmbH, Berlin, Germany
- Pfizer Inc, New York City, United States
- Pfizer Pharma GmbH, Berlin, Germany
- Pfizer R&D UK Limited, Sandwich, United Kingdom
- Vernalis (R&D) Limited, Cambridge, United Kingdom
Universities, research organisations, public bodies, non-profit groups
- European Molecular Biology Laboratory, Heidelberg, Germany
- Fundacio Privada Institut D'Investigacio Oncologica De Vall-Hebron (Vhio), Barcelona, Spain
- Johann Wolfgang Goethe-Universitaet Frankfurt Am Main, Frankfurt am Main, Germany
- Structural Genomics Consortium Lbg, London, United Kingdom
- Universidade Estadual De Campinas, Campinas Sp, Brazil
- University College London, London, United Kingdom
- University Health Network, Toronto, Canada
MedTech Europe
- Abcam Limited, Cambridge, United Kingdom
- IBM Israel - Science And Technology LTD, Petach Tikva, Israel
- Thermo Fisher Scientific (Bremen) GmbH, Bremen, Germany
Contributing partners
- Enamine Germany GmbH, Frankfurt am Main, Germany (SME)
- The Hospital for Sick Children, Toronto, Canada
| Name | EU funding in € |
|---|---|
| European Molecular Biology Laboratory | 1 462 498 |
| Fundacio Privada Institut D'Investigacio Oncologica De Vall-Hebron (Vhio) | 238 125 |
| Johann Wolfgang Goethe-Universitaet Frankfurt Am Main | 11 350 000 |
| Structural Genomics Consortium Lbg | 2 677 741 |
| University College London | 3 700 000 |
| University Health Network | 10 469 546 |
| Total Cost | 29 897 910 |