DP Technology DevDay 2024 Showcases Large Science Models and Announces Open Science Initiative

BEIJING, April 29, 2024 /PRNewswire/ -- In recent years, the rapid development of artificial intelligence has introduced new possibilities across numerous scientific disciplines. As an AI for Science pioneer, DP Technology continually collaborates with partners to explore the transformative impact AI can bring to science. During its DevDay, held in Beijing on April 12, DP Technology showcased a series of large science models, including the DPA large atomic model[1], Uni-Mol 3D molecular model[2], Uni-Fold protein folding model[3], Uni-RNA ribonucleic acid model[4], and Uni-SMART large language model for multimodal scientific literature[5], among others.

DPA

The rapid development of artificial intelligence (AI) is driving significant changes in the field of atomic modeling, simulation, and design. Inspired by recent advances in large language models, DP aspires to develop a similar foundation model for the atomic domain. Developed by DP and collaborators, DPA is a large pre-trained model for interatomic potentials with an attention mechanism. The recently released DPA-2 model addresses a limitation of other pre-trained atomistic models: their reliance on single-source DFT data. DPA-2 covers ~100 elements in the periodic table. In a perovskite study, Liu Shi’s team at Westlake University used the pre-trained DPA model to increase the efficiency of force field development by 100x.


DPA-2 is also used in drug discovery. The latest version of Uni-FEP (free energy perturbation) can now use the DPA-2 pre-trained interatomic potential to optimize classical force field parameters on the fly, providing enhanced free energy predictions. This results in improved R^2 values and reduced RMSE.
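For readers less familiar with these two metrics, the following is a minimal sketch of how R^2 and RMSE can be computed between predicted and experimental binding free energies. It assumes R^2 is taken as the squared Pearson correlation coefficient, which is one common convention. The function names and all numbers are hypothetical illustrations, not Uni-FEP code or results.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


def r_squared(predicted, experimental):
    """Squared Pearson correlation coefficient (assumed definition of R^2)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov ** 2 / (var_p * var_e)


# Made-up binding free energies in kcal/mol, for illustration only.
experimental = [-9.0, -8.0, -7.0, -6.0]
predicted = [-8.5, -8.5, -6.5, -6.5]

print(f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
print(f"R^2  = {r_squared(predicted, experimental):.2f}")
```

A higher R^2 means the predictions rank and track the experimental values more closely, while a lower RMSE means the absolute errors are smaller.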


Uni-Mol

Uni-Mol, a pre-trained 3D molecular representation learning model (ICLR ’23), now predicts ligand binding poses with improved accuracy: over 77% of ligands achieve an RMSD under 2.0 Å, and over 75% pass all quality checks. This marks a substantial leap from the 62% accuracy of the previous version and also eclipses other known methods. The model effectively tackles common challenges such as chirality inversions and steric clashes, ensuring that predictions are not just accurate but also chemically viable.
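As an illustration of the metric quoted above, the sketch below computes the RMSD between a predicted and a reference ligand pose and the fraction of ligands falling under a 2.0 Å threshold. It assumes matching atom order, no re-alignment, and no symmetry correction. The function names and coordinates are hypothetical and are not taken from Uni-Mol.

```python
import math


def rmsd(pose_a, pose_b):
    """RMSD in angstroms between two poses given as lists of (x, y, z) tuples.

    Assumes both poses list the same atoms in the same order and share one
    coordinate frame (no superposition, no symmetry correction).
    """
    n = len(pose_a)
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / n)


def success_rate(rmsds, threshold=2.0):
    """Fraction of ligands whose RMSD is strictly below the threshold."""
    return sum(1 for r in rmsds if r < threshold) / len(rmsds)


# Toy two-atom ligand: the predicted pose is shifted by 1 angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, predicted))

# Made-up per-ligand RMSD values for a small benchmark set.
print(success_rate([0.5, 1.9, 2.0, 3.5]))
```

Under this definition, a figure such as "over 77%" corresponds to `success_rate` returning a value above 0.77 across the benchmark ligands.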

Based on Uni-Mol, VD-Gen[6], developed by DP and collaborators, is capable of directly generating molecules with high binding affinity within the protein pocket. VD-Gen accurately predicts the element types and fine-grained atomic coordinates of the generated molecules without coarse-graining the coordinates onto a grid, offering higher precision than three-dimensional grid-based methods. Furthermore, VD-Gen efficiently generates all atom types and their coordinates simultaneously, so it outperforms autoregressive generation models and is unaffected by the order of generation.

Uni-QSAR[7], built on the Uni-Mol model, is an innovative tool for automated prediction of molecular properties. It can rapidly and cost-effectively assess ADMET properties during the early stages of drug development. This method uses the three-dimensional structural information of molecules, combined with computational chemistry and bioinformatics tools, to predict the behavior of drug molecules in the body. The benchmarks DP presented include 22 public ADMET datasets from the TDC Benchmark and 30 activity datasets from the MoleculeACE Benchmark, with Chemprop, DeepAutoQSAR, and DeepPurpose as baselines. Uni-QSAR achieved the best performance in 21 out of 22 tasks in the TDC ADMET Benchmark tests and in 26 out of 30 tasks in the MoleculeACE Benchmark tests.


Uni-RNA

Uni-RNA is pre-trained on approximately one billion high-quality RNA sequences, covering virtually the entire RNA sequence space. After fine-tuning across a broad range of downstream tasks, Uni-RNA achieved leading results in all three RNA domains: RNA structure prediction, mRNA sequence property prediction, and RNA function prediction.

In a study conducted by DP, each of 10 RNA sequences generated by Uni-RNA surpassed the performance of Moderna’s commercially available vaccine sequences, while being generally comparable to, and sometimes exceeding, BioNTech’s commercial mRNA vaccine sequence. This demonstrates that models like Uni-RNA not only hold immense value for academic research but also possess significant potential for industrial research and development applications.


Uni-Fold

Protein structure modeling is a prerequisite for structure-based drug development. Once a preliminary structure is obtained, further refining and optimizing key structural regions is crucial for ensuring the accuracy of subsequent research.

Uni-Fold is the first protein structure prediction tool to fully open-source its training and inference code. It supports structure prediction for multimeric protein systems and achieves industry-leading accuracy when trained on the same datasets.

Uni-SMART

Uni-SMART (Science Multimodal Analysis and Research Transformer) tackles the urgent need for new solutions that can fully understand and analyze multimodal content in scientific literature. Many LLMs can already ingest PDFs, but they often struggle to interpret the rich information in the charts, graphs, and molecular structures embedded in those documents.

In rigorous quantitative evaluations, Uni-SMART demonstrates significant performance gains over other leading tools, such as GPT-4 and Gemini, in interpreting and analyzing multimodal content in scientific documents, including tables, charts, molecular structures, and chemical reactions.

Industrial Software for Drug Discovery, Battery Development and beyond

To advance AI for Science, DP Technology has developed a suite of industry applications based on its large science models and advanced algorithms. This suite includes the Bohrium® Scientific Research Space, Hermite® Computational Drug Design Platform, RiDYMO® Dynamics Platform, and Piloteye® Battery Design Automation Platform. Together, these platforms form a robust foundation for industrial innovation within an open ecosystem for AI in science, fostering advancements in key areas such as drug discovery, energy, materials science, and information technology.

Open Science Initiative

At DevDay, DP Technology joined forces with industry leaders including CATL, Yunnan Baiyao, Alibaba Cloud, Tencent Cloud, Volcano Engine, and China Unicom to launch an open science ecosystem for AI for Science. This cross-industry collaboration aims to combine each party's strengths in artificial intelligence, cloud computing, and industry applications to propel innovation and to accelerate the open-source development of datasets, algorithms, code, and pre-trained models.

Sun Weijie, founder and CEO of DP Technology, stated, “The launch of large science models is our firm commitment to advancing scientific and industrial innovation. With this series of scientific large models, we are not only able to accelerate the process of scientific research and product development but also increase the success rate of R&D, bringing disruptive impacts to drug discovery, battery development and beyond.”

About DP Technology

DP Technology is a global leader in the “AI for Science” research paradigm, where AI learns scientific principles and data, then tackles key challenges in scientific research and industrial R&D.

DP’s commitment to interdisciplinary research has led to the creation of the “DP Particle Universe,” an array of pre-trained large science models designed to bridge foundational research with practical industrial applications. DP’s software suite includes the Bohrium® Scientific Research Space, Hermite® Computational Drug Design Platform, RiDYMO® Dynamics Platform, and Piloteye® Battery Design Automation Platform. Together, these platforms form a robust foundation for industrial innovation and an open ecosystem for AI in science, fostering advancements in key areas such as drug discovery, energy, materials science, and information technology.

More: https://www.dp.tech/en

Business: [email protected]

Media: [email protected]

References:

[1] https://arxiv.org/abs/2312.15492

[2] https://openreview.net/forum?id=6K2RM6wVqKu

[3] https://www.biorxiv.org/content/10.1101/2022.08.04.502811v3.full

[4] https://www.biorxiv.org/content/10.1101/2023.07.11.548588v1

[5] https://arxiv.org/abs/2403.10301

[6] https://arxiv.org/abs/2302.05847

[7] https://arxiv.org/abs/2304.12239