Status | Published |
In silico prediction of toxic action mechanisms of phenols for imbalanced data with Random Forest learner | |
Jing Chen [1]; Yuan Yan Tang [1,2]; Bin Fang [1]; Chang Guo [1] | |
2012-05-01 | |
Source Publication | Journal of Molecular Graphics and Modelling |
ISSN | 1093-3263 |
Volume | 35, Pages: 21-27 |
Abstract | With an increasing need for rapid and effective safety assessment of compounds in industrial and civil-use products, in silico toxicity exploration techniques provide an economical route to environmental hazard assessment. Over the last decade, in silico research has produced many quantitative structure-activity relationship (QSAR) models for predicting toxicity mechanisms. Most of these methods rely on data analysis and machine learning techniques, whose performance depends heavily on the characteristics of the data sets. Tetrahymena pyriformis toxicity data sets pose a major technical challenge: data imbalance. A skewed class distribution can greatly degrade prediction performance on rare classes. Most previous work on predicting the mechanisms of toxic action of phenols did not consider this practical problem. In this work, we address it by accounting for the difference between the two types of misclassification. A Random Forest learner was employed in a cost-sensitive learning framework to construct prediction models based on selected molecular descriptors. In computational experiments, both the global and local models achieved appreciable overall prediction accuracy, and performance on rare classes in particular improved. Moreover, for practical use of these models, the balance between the two types of misclassification can be adjusted by using different cost matrices according to the application goals. |
Keyword | Cost-sensitive ; Phenols ; QSAR ; Random Forest ; Toxic Action Mechanisms |
DOI | 10.1016/j.jmgm.2012.01.002 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Biochemistry & Molecular Biology ; Computer Science ; Crystallography ; Mathematical & Computational Biology |
WOS Subject | Biochemical Research Methods ; Biochemistry & Molecular Biology ; Computer Science, Interdisciplinary Applications ; Crystallography ; Mathematical & Computational Biology |
WOS ID | WOS:000304513400003 |
Publisher | ELSEVIER SCIENCE INC, 360 PARK AVE SOUTH, NEW YORK, NY 10010-1710 USA |
Scopus ID | 2-s2.0-84859802343 |
Document Type | Journal article |
Collection | University of Macau |
Affiliation | 1. College of Computer Science, Chongqing University, Chongqing 400030, China; 2. Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Taipa, Macau, China |
Recommended Citation GB/T 7714 | Jing Chen, Yuan Yan Tang, Bin Fang, et al. In silico prediction of toxic action mechanisms of phenols for imbalanced data with Random Forest learner[J]. Journal of Molecular Graphics and Modelling, 2012, 35: 21-27. |
APA | Chen, J., Tang, Y. Y., Fang, B., & Guo, C. (2012). In silico prediction of toxic action mechanisms of phenols for imbalanced data with Random Forest learner. Journal of Molecular Graphics and Modelling, 35, 21-27. |
MLA | Jing Chen, et al. "In silico prediction of toxic action mechanisms of phenols for imbalanced data with Random Forest learner." Journal of Molecular Graphics and Modelling 35 (2012): 21-27. |
Files in This Item: | There are no files associated with this item. |
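
The abstract mentions adjusting the balance between the two types of misclassification through different cost matrices. The record does not say how the authors applied the cost matrix, so the following is only a minimal sketch of one common approach, the minimum-expected-cost decision rule, applied to class probabilities such as a forest's vote fractions. The function name, the probabilities, and the cost values are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def min_expected_cost_class(
    probs: Sequence[float], cost: Sequence[Sequence[float]]
) -> int:
    """Return the class index with the lowest expected misclassification cost.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    # Expected cost of predicting j, averaged over the possible true classes.
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


# Hypothetical vote fractions: 70% for the common class (0), 30% for the rare class (1).
probs = [0.7, 0.3]

# Assumed cost matrices, for illustration only.
uniform_cost = [[0, 1], [1, 0]]        # both kinds of error cost the same
rare_weighted_cost = [[0, 1], [5, 0]]  # missing a rare-class compound costs 5x

print(min_expected_cost_class(probs, uniform_cost))
print(min_expected_cost_class(probs, rare_weighted_cost))
```

With equal costs the rule reduces to picking the most probable class; raising the cost of missing the rare class shifts borderline predictions toward it, which is the kind of trade-off the abstract describes.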