Research Papers:
Prediction of the aquatic toxicity of aromatic compounds to Tetrahymena pyriformis through support vector regression
Qiang Su1, Wencong Lu2, Dongshu Du1,3, Fuxue Chen1, Bing Niu1,4 and Kuo-Chen Chou4,5,6
1College of Life Science, Shanghai University, Shanghai 200444, China
2Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China
3Department of Life Science, Heze University, Shandong 274500, China
4Gordon Life Science Institute, Boston, MA 02478, USA
5Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
6Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Correspondence to:
Bing Niu, email: bniu@gordonlifescience.org, bingniu@shu.edu.cn
Fuxue Chen, email: chemfuxue@staff.shu.edu.cn
Dongshu Du, email: sdhzdds@163.com
Keywords: aromatic compounds, Tetrahymena pyriformis, QSAR, genetic algorithm, mRMR
Received: March 10, 2017 Accepted: March 30, 2017 Published: April 13, 2017
ABSTRACT
Toxicity evaluation is an extremely important step in drug development. It usually begins with animal experiments, which are time-consuming and costly. To speed up this process, a quantitative structure-activity relationship (QSAR) study was performed to develop a computational model correlating the structures of 581 aromatic compounds with their aquatic toxicity to Tetrahymena pyriformis. A set of 68 molecular descriptors, derived solely from the structures of the aromatic compounds, was calculated using Gaussian 03, HyperChem 7.5, and TSAR V3.3. A combined feature selection method, minimum Redundancy Maximum Relevance (mRMR)-genetic algorithm (GA)-support vector regression (SVR), was applied to select the best descriptor subset for the QSAR analysis. SVR was then used to model toxic potency from a training set of 500 compounds, with five-fold cross-validation used to optimize the parameters of the SVR model. The resulting SVR model was tested on an independent dataset of 81 compounds. Both the internal consistency and the external predictive performance were high, indicating that the SVR model is a promising tool for rapid toxicity detection.
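
To make the descriptor selection and validation steps more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a greedy mRMR-style ranking in which the absolute Pearson correlation stands in for mutual information, and it only builds the five cross-validation folds for a 500-compound training set; the genetic algorithm and the SVR fit are omitted. The function names (pearson, mrmr_rank, five_fold_indices) and the toy descriptors are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_rank(columns, target, k):
    """Greedy mRMR-style selection of k descriptor indices.

    Assumption: |Pearson r| replaces mutual information for both
    relevance (descriptor vs. target) and redundancy (descriptor vs.
    already selected descriptors).
    """
    relevance = [abs(pearson(col, target)) for col in columns]
    selected = []
    remaining = set(range(len(columns)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def five_fold_indices(n, seed=0):
    """Shuffle range(n) and deal the indices into five disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::5] for i in range(5)]


# Toy data (hypothetical): d1 nearly duplicates d0, d2 carries independent signal.
d0 = [float(i) for i in range(20)]
d1 = [i + 2.0 * (i % 3) for i in range(20)]
d2 = [float((-1) ** i) for i in range(20)]
toxicity = [a + 3.0 * c for a, c in zip(d0, d2)]

selected = mrmr_rank([d0, d1, d2], toxicity, k=2)
folds = five_fold_indices(500)  # 500 matches the training-set size in the abstract
print(selected, [len(f) for f in folds])
```

In this toy case the redundant descriptor d1 is skipped in favor of d2, which illustrates why a redundancy penalty helps before a more expensive search such as GA-SVR. Each of the five folds would serve once as the validation set while the SVR parameters are tuned on the other four.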

PII: 17210