Oncotarget

Research Papers:

iPhos-PseEn: Identifying phosphorylation sites in proteins by fusing different pseudo components into an ensemble classifier

Wang-Ren Qiu, Xuan Xiao, Zhao-Chun Xu and Kuo-Chen Chou


Oncotarget. 2016; 7:51270-51283. https://doi.org/10.18632/oncotarget.9987



Abstract

Wang-Ren Qiu1,2, Xuan Xiao1,3, Zhao-Chun Xu1, Kuo-Chen Chou3,4,5

1Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China

2Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA

3Gordon Life Science Institute, Boston, MA, USA

4Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia

5Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China

Correspondence to:

Wang-Ren Qiu, email: [email protected]

Xuan Xiao, email: [email protected]

Kuo-Chen Chou, email: [email protected]

Keywords: protein phosphorylation, pseudo components, random forests, ensemble classifier

Received: April 05, 2016     Accepted: May 23, 2016     Published: June 13, 2016

ABSTRACT

Protein phosphorylation is a posttranslational modification (PTM or PTLM) in which a phosphoryl group is added to one or more residues of a protein molecule. The most commonly phosphorylated amino acids are serine (S), threonine (T), and tyrosine (Y). Protein phosphorylation plays a significant role in a wide range of cellular processes, and its dysregulation is implicated in many diseases. Therefore, from the standpoint of both basic research and drug development, we face a challenging problem: for an uncharacterized protein sequence containing many S, T, or Y residues, which ones can be phosphorylated and which cannot? To address this problem, we have developed a predictor called iPhos-PseEn by fusing four different pseudo component approaches (amino acids’ disorder scores, nearest neighbor scores, occurrence frequencies, and position weights) into an ensemble classifier via a voting system. Rigorous cross-validations indicated that the proposed predictor remarkably outperformed its existing counterparts. For the convenience of most experimental scientists, a user-friendly web-server for iPhos-PseEn has been established at http://www.jci-bioinfo.cn/iPhos-PseEn, by which users can easily obtain their desired results without needing to work through the complicated mathematical equations involved.
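To make the prediction task concrete, the following minimal Python sketch enumerates every S, T, or Y residue in a sequence as a candidate site and fuses the binary outputs of several classifiers by voting. It is an illustration only, not the iPhos-PseEn implementation: the window half-width, the padding character `X`, the simple-majority rule with ties resolved as "non-site", and the function names `candidate_windows`, `majority_vote`, and `predict_sites` are all assumptions introduced here, and the four toy classifiers are hypothetical stand-ins for models trained on the four pseudo component encodings.

```python
from typing import Callable, Dict, List, Tuple

PHOSPHO_RESIDUES = {"S", "T", "Y"}


def candidate_windows(
    sequence: str, half_width: int = 6, pad: str = "X"
) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every S, T, or Y in the sequence.

    The window is centered on the candidate residue; sequence ends are
    padded with a dummy character (an assumption of this sketch).
    """
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue in PHOSPHO_RESIDUES:
            windows.append((i + 1, padded[i : i + 2 * half_width + 1]))
    return windows


def majority_vote(votes: List[int]) -> int:
    """Fuse binary votes (1 = site, 0 = non-site).

    Ties are resolved as 0; this tie rule is an assumption of the sketch.
    """
    return 1 if 2 * sum(votes) > len(votes) else 0


def predict_sites(
    sequence: str,
    classifiers: List[Callable[[str], int]],
    half_width: int = 6,
) -> Dict[int, int]:
    """Map each candidate position to the fused ensemble decision."""
    return {
        pos: majority_vote([clf(window) for clf in classifiers])
        for pos, window in candidate_windows(sequence, half_width)
    }


# Hypothetical stand-ins for the four component classifiers.
toy_classifiers = [
    lambda w: 1,
    lambda w: 1,
    lambda w: int(w[len(w) // 2] == "S"),  # votes "site" only for serine
    lambda w: 0,
]

predictions = predict_sites("MASTY", toy_classifiers, half_width=2)
```

In practice each callable would wrap a trained model that first encodes the window with one of the four feature schemes; only the fusion step is sketched here.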


PII: 9987