The vast amounts of data generated during drug discovery are not always used optimally, meaning scientists miss out on opportunities to make new discoveries using old data. One challenge lies in the fact that much of this data is owned by companies, and making it readily available to others would affect their competitiveness.
The MELLODDY project aims to establish a machine learning platform that would make it possible to learn from multiple sets of proprietary data while respecting their highly confidential nature, as data and asset owners will retain control of their information throughout the project.
Through this innovative, blockchain-based solution, the pharmaceutical companies in the project aim to demonstrate the feasibility of this approach with an unprecedented volume of competitive data: more than a billion data points relevant to drug development, plus hundreds of terabytes of image data annotating the biological effects of more than 10 million small molecules. The platform would also take a federated machine learning approach, meaning that the learning effort is not centralised but spread across different, physically separated partners.
The hope is that this solution will deliver insights that will advance drug development by making it easier to identify which small molecules show the most promise for further research.
Achievements & News
Machine learning promises to make drug discovery faster, better and cheaper, but it requires access to vast datasets of molecules and their properties. While every pharma company can apply machine learning algorithms to its own data, the true power of this technology comes from combining the (usually ultra-confidential) datasets of several companies to fuel the algorithms.
The MELLODDY project is working on using machine learning to make the most of the combined power of these highly valuable datasets without sharing them, exposing them, or even moving them from where they’re housed.
Because companies’ data is too valuable to risk sharing, MELLODDY is applying a technique called federated learning, which allows each dataset to remain behind its owner’s firewall, stored independently of the others. With this method, algorithms go back and forth between subsets of each company’s data and the central server, which prevents anyone from knowing which company’s data contributed to the central model. This exposes the algorithm to a much wider range of data than any one company has in-house, while keeping sensitive data within each company’s own infrastructure.
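To make the idea concrete, the following is a minimal sketch of federated averaging, one common way to implement federated learning. It is not MELLODDY's actual platform or algorithm, which this page does not describe. The function names, the linear model, the learning rate and the synthetic partner datasets are all assumptions made for illustration. Each hypothetical partner trains on its own data, and only model weights, never raw data, reach the central server.

```python
import random

def local_update(weights, features, targets, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one partner's private data."""
    w = list(weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w

def federated_average(updates, sizes):
    """Server step: average partner weights, weighted by dataset size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[j] * s for u, s in zip(updates, sizes)) / total
            for j in range(dim)]

def make_partner_data(n, true_w, rng):
    """Synthetic stand-in for one company's confidential dataset."""
    xs = [[rng.uniform(-1, 1) for _ in true_w] for _ in range(n)]
    ys = [sum(w * x for w, x in zip(true_w, row)) for row in xs]
    return xs, ys

def run_federated_training(partners, dim, rounds=100):
    """Send the global model to every partner, then merge the returned weights."""
    global_w = [0.0] * dim
    sizes = [len(xs) for xs, _ in partners]
    for _ in range(rounds):
        # Raw data stays with each partner; only weights are returned.
        updates = [local_update(global_w, xs, ys) for xs, ys in partners]
        global_w = federated_average(updates, sizes)
    return global_w

rng = random.Random(0)
TRUE_W = [2.0, -1.0]  # hypothetical relationship hidden in the data
partners = [make_partner_data(n, TRUE_W, rng) for n in (30, 50, 40)]
model = run_federated_training(partners, dim=2)
print(model)
```

In this toy setting the shared model recovers the underlying relationship even though no partner ever sees another's data. The sketch leaves out the safeguards a real deployment would need, such as hiding which partner contributed which update.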
Earlier this year, MELLODDY announced that it had managed to carry out the first successful federated learning run using this new predictive modelling platform.
Participants
- Amgen Research (Munich) GmbH, Munich, Germany
- Astellas Pharma Europe BV, Leiden, Netherlands
- AstraZeneca AB, Södertälje, Sweden
- Bayer Aktiengesellschaft, Leverkusen, Germany
- Boehringer Ingelheim International GmbH, Ingelheim, Germany
- GlaxoSmithKline Research and Development Ltd., Brentford, Middlesex, United Kingdom
- Institut de Recherches Servier, Suresnes, France
- Janssen Pharmaceutica NV, Beerse, Belgium
- Merck Kommanditgesellschaft auf Aktien, Darmstadt, Germany
- Novartis Pharma AG, Basel, Switzerland
Universities, research organisations, public bodies, non-profit groups
- Budapesti Műszaki és Gazdaságtudományi Egyetem, Budapest, Hungary
- Katholieke Universiteit Leuven, Leuven, Belgium
Small and medium-sized enterprises (SMEs) and mid-sized companies (<€500 m turnover)
- Iktos, Paris, France
- Kubermatic GmbH, Hamburg, Germany
- Owkin France, Paris, France
- Substra, Nantes, France
Non-EFPIA companies
- Nvidia Switzerland AG, Zurich, Switzerland
| Name | EU funding in € |
|---|---|
| Budapesti Műszaki és Gazdaságtudományi Egyetem | 1 253 297 |
| Katholieke Universiteit Leuven | 1 145 006 |
| Kubermatic GmbH | 1 716 178 |
| Owkin France | 2 683 635 |
| Total Cost | 8 000 000 |