Residential College | false |
Status | Published |
GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47 | |
Shan, Wenying1,2; Chen, Lvqi1; Xu, Hao3,4; Zhong, Qinghao5; Xu, Yinqiu6; Yao, Hequan1 | |
2023 | |
Source Publication | Frontiers in Chemistry |
ISSN | 2296-2646 |
Volume | 11 |
Abstract | Identifying compound–protein interactions plays a vital role in drug discovery. Artificial intelligence (AI), especially machine learning (ML) and deep learning (DL) algorithms, is playing an increasingly important role in compound–protein interaction (CPI) prediction. However, ML relies on learning from large sample data, and the CPI data available for a specific target are often scarce. To overcome this dilemma, we propose a virtual screening model in which word2vec is used as an embedding tool to generate low-dimensional vectors of the SMILES of compounds and the amino acid sequences of proteins, and a modified gcForest, based on the multi-grained cascade forest, is used as the classifier. The proposed method is capable of constructing a model from raw data and adjusting model complexity according to the scale of the dataset, especially for small-scale datasets, and is robust with few hyper-parameters and without over-fitting. We found that the proposed model is superior to other CPI prediction models and performs well on the constructed challenging dataset. We finally predicted 2 new inhibitors for cluster of differentiation 47 (CD47), which has few known inhibitors. The IC50 values for enzyme activity of these 2 new small-molecule inhibitors targeting the CD47–SIRPα interaction are 3.57 and 4.79 μM, respectively. These results fully demonstrate the competence of this concise but efficient tool for CPI prediction. |
Keyword | Artificial intelligence; Compound-protein interaction prediction; gcForest; Small-molecule CD47 inhibitors; word2vec |
DOI | 10.3389/fchem.2023.1292869 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Chemistry |
WOS Subject | Chemistry, Multidisciplinary |
WOS ID | WOS:001092371800001 |
Publisher | FRONTIERS MEDIA SA, AVENUE DU TRIBUNAL FEDERAL 34, LAUSANNE CH-1015, SWITZERLAND |
Scopus ID | 2-s2.0-85175858746 |
Document Type | Journal article |
Collection | Faculty of Health Sciences |
Corresponding Author | Yao, Hequan; Lin, Kejiang; Li, Xuanyi |
Affiliation | 1. Department of Medicinal Chemistry, School of Pharmacy, China Pharmaceutical University, Nanjing, China; 2. Faculty of Health Sciences, University of Macau, Macao; 3. Institute of Chemical Industry of Forest Products, Chinese Academy of Forestry, Nanjing, China; 4. National Engineering Laboratory for Biomass Chemical Utilization, Nanjing, China; 5. School of Humanities and Social Sciences, The Chinese University of Hong Kong, Shenzhen, China; 6. Department of Pharmacy, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China |
First Author Affiliation | Faculty of Health Sciences |
Recommended Citation GB/T 7714 | Shan, Wenying, Chen, Lvqi, Xu, Hao, et al. GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47[J]. Frontiers in Chemistry, 2023, 11. |
APA | Shan, W., Chen, L., Xu, H., Zhong, Q., Xu, Y., Yao, H., Lin, K., & Li, X. (2023). GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47. Frontiers in Chemistry, 11. |
MLA | Shan, Wenying, et al. "GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47." Frontiers in Chemistry 11 (2023). |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.