Oncotarget

Research Papers:

iHyd-PseCp: Identify hydroxyproline and hydroxylysine in proteins by incorporating sequence-coupled effects into general PseAAC

Wang-Ren Qiu, Bi-Qian Sun, Xuan Xiao, Zhao-Chun Xu and Kuo-Chen Chou


Oncotarget. 2016; 7:44310-44321. https://doi.org/10.18632/oncotarget.10027



Abstract

Wang-Ren Qiu1,2, Bi-Qian Sun1, Xuan Xiao1,3, Zhao-Chun Xu1, Kuo-Chen Chou3,4,5

1Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China

2Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA

3Gordon Life Science Institute, Boston, MA, USA

4Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia

5Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China

Correspondence to:

Xuan Xiao, email: [email protected]

Keywords: PTMs, hydroxyproline, hydroxylysine, sequence-coupling model, general PseAAC

Received: April 20, 2016     Accepted: May 29, 2016     Published: June 14, 2016


Protein hydroxylation is a posttranslational modification (PTM) in which a CH group in a Pro (P) or Lys (K) residue is converted into a COH group; that is, a hydroxyl group (-OH) is introduced into the residue. Closely associated with cellular signaling activities, this type of PTM is also involved in some major diseases, such as stomach cancer and lung cancer. From the angles of both basic research and drug development, this poses a challenging problem: for an uncharacterized protein sequence containing many P or K residues, which ones can be hydroxylated and which cannot? With the explosive growth of protein sequences in the post-genomic age, the problem has become even more urgent. To address it, we have developed a predictor called iHyd-PseCp by incorporating sequence-coupled information into the general pseudo amino acid composition (PseAAC) and using the “Random Forest” algorithm to perform the prediction. Rigorous jackknife tests indicated that the new predictor remarkably outperformed the existing state-of-the-art prediction method for the same purpose. For the convenience of most experimental scientists, a user-friendly web server for iHyd-PseCp has been established at http://www.jci-bioinfo.cn/iHyd-PseCp, through which users can easily obtain their desired results without needing to work through the complicated mathematical equations involved.
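The question posed in the abstract (which P or K residues in a sequence are candidate hydroxylation sites) starts from enumerating those residues together with their sequence context. The following is a minimal Python sketch of that enumeration step only, not the iHyd-PseCp feature construction or classifier. The function name `candidate_windows`, the flank length of 6 residues, and the padding character `X` are illustrative assumptions that are not stated in the abstract.

```python
from typing import List, Tuple


def candidate_windows(
    sequence: str,
    target: str = "P",
    flank: int = 6,      # assumed flank length, not taken from the abstract
    pad: str = "X",      # assumed dummy residue for positions past the termini
) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every `target` residue.

    Each window holds the target residue at its center with `flank`
    residues on either side, padded with `pad` near the sequence ends.
    """
    if len(target) != 1 or len(pad) != 1:
        raise ValueError("target and pad must be single characters")
    if flank < 0:
        raise ValueError("flank must be non-negative")

    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == target:
            # Residue i sits at index i + flank in `padded`,
            # so its window starts at index i.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the output shape.
    for position, window in candidate_windows("MKPLK", target="K", flank=2):
        print(position, window)
```

Each returned window could then be encoded as a feature vector and passed to a classifier such as a random forest; how iHyd-PseCp performs that encoding is described in the full paper rather than in this abstract.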


Creative Commons License All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 10027