Oncotarget

Research Papers:

Ovarian cancer variant rs2072590 is associated with HOXD1 and HOXD3 gene expression


Oncotarget. 2017; 8:103410-103414. https://doi.org/10.18632/oncotarget.21902


Liyuan Guo, Yan Peng, Lei Sun, Xia Han, Juan Xu and Dongwei Mao


Liyuan Guo1,*, Yan Peng2,*, Lei Sun3,*, Xia Han4, Juan Xu4 and Dongwei Mao4

1Department of Gynecological Oncology, Cancer Hospital of Harbin Medical University, Harbin, China

2Disease Prevention Center, First Affiliated Hospital, Heilongjiang University of Chinese Medicine, Harbin, China

3Department of Gynecology and Obstetrics, The Fourth Hospital of Harbin Medical University, Harbin, China

4Shenzhen Hospital of Guangzhou University of Chinese Medicine, Shenzhen, China

*These authors contributed equally to this work

Correspondence to:

Dongwei Mao, email: [email protected]

Keywords: ovarian cancer, genome-wide association study, gene expression

Received: July 17, 2017     Accepted: September 21, 2017     Published: October 13, 2017

ABSTRACT

Ovarian cancer (OC) is a common cancer in women and the leading cause of death from gynaecological malignancies worldwide. In addition to candidate gene approaches for identifying OC susceptibility genes, genome-wide association studies (GWAS) have reported new variants associated with OC risk. The minor allele of rs2072590 at 2q31 is associated with an increased OC risk, primarily for the serous subtype. This risk-associated SNP lies in non-coding DNA downstream of HOXD3 and upstream of HOXD1, and it tags SNPs in the HOXD3 3′ UTR. We hypothesized that the non-coding rs2072590 variant may contribute to OC susceptibility by regulating the expression of HOXD1 and HOXD3. To investigate this association, we performed a bioinformatics analysis consisting of a functional annotation of the rs2072590 variant using RegulomeDB (version 1.1), HaploReg (version 4.1), and PhenoScanner (version 1.1). Using HaploReg, we identified 19 genetic variants tagged by rs2072590 with r² ≥ 0.8. Using RegulomeDB, we found that three of these variants are likely to affect TF binding + any motif + DNase Footprint + DNase peak, and that most of the remaining variants are likely to affect TF binding + DNase peak. Using PhenoScanner (version 1.1), we found that these 19 genetic variants were significantly associated with the expression of nearby genes, especially HOXD1 and HOXD3 in human ovary tissue.


INTRODUCTION

Ovarian cancer (OC) is a common cancer in women and the leading cause of death from gynaecological malignancies worldwide [1]. Like other complex human diseases, OC is caused by a combination of genetic variants and environmental factors; the genetic component includes familial BRCA1 and BRCA2 mutations and common genetic variants of lower penetrance [1]. In addition to candidate gene approaches for identifying OC susceptibility genes, genome-wide association studies (GWAS) have also reported new variants associated with OC risk [1].

However, the exact genetic mechanisms underlying these OC susceptibility variants are still unclear [2]. It has been reported that associations between gene expression and OC risk alleles may connect risk variants to their putative target genes/transcripts and biological pathways [2]. The minor allele of rs2072590 at 2q31 was associated with an increased OC risk (OR = 1.16, 95% CI 1.12–1.21, p = 4.5 × 10⁻¹⁴), and the association was primarily significant for the serous subtype (OR = 1.20, 95% CI 1.14–1.25, p = 3.8 × 10⁻¹⁴) [3]. The 2q31 locus contains a family of homeobox (HOX) genes involved in regulating embryogenesis and organogenesis [3]. Altered expression of HOX genes has been reported in many cancers [3]. The OC risk-associated SNP rs2072590 lies in non-coding DNA downstream of HOXD3 and upstream of HOXD1, and it tags SNPs in the HOXD3 3′ UTR [3].
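As a rough consistency check on the figures quoted above, the standard error of the log odds ratio can be back-calculated from the 95% confidence interval and converted into an approximate Wald z-statistic and two-sided p-value. The sketch below assumes a normal approximation on the log scale; the function name is illustrative and is not taken from [3]. Because the published OR and CI are rounded, the result only approximates the reported p-value.

```python
import math

def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE(log OR), Wald z and two-sided p from an OR and its 95% CI."""
    # On the log scale the interval is log(OR) +/- z_crit * SE, so its width gives SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided p-value under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Overall association of rs2072590 with OC risk, as quoted from [3].
se, z, p = wald_from_or_ci(1.16, 1.12, 1.21)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.1e}")
```

With these rounded inputs the p-value comes out on the order of 10⁻¹⁴, in line with the reported value.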

We hypothesized that the non-coding rs2072590 variant may contribute to OC susceptibility by regulating the expression of HOXD1 and HOXD3. To investigate this association, we conducted a functional annotation of the rs2072590 variant using RegulomeDB (version 1.1) [4], HaploReg (version 4.1) [5], and PhenoScanner (version 1.1) [6].

RESULTS

LD analysis using HaploReg

Using the LD information from the 1000 Genomes Project (EUR), we identified 19 genetic variants tagged by the rs2072590 variant with r² ≥ 0.8. These 19 genetic variants are located in or near HOXD4, HOXD3, AC009336.24 and HOXD-AS1. Detailed information about these variants, including the LD values, is given in Table 1.

Table 1: rs2072590 and variants with r² ≥ 0.8

SNP         chromosome  pos (hg38)  LD (r²)  LD (D′)  Ref  Alt  Gene         Functional annotation
rs4972504   2           176153998   0.89     0.98     T    C    HOXD4
rs2551802   2           176157430   0.85     0.93     C    G    HOXD3
rs2252895   2           176159192   0.96     0.98     A    G    HOXD3
rs2252894   2           176159194   0.88     0.96     G    C    HOXD3
rs2857538   2           176159533   0.98     1        C    T    HOXD3
rs2857540   2           176161970   0.98     0.99     G    T    HOXD3
rs2113559   2           176166371   0.97     0.99     A    G    HOXD3        intronic
rs717852    2           176166895   0.98     1        C    T    HOXD3        intronic
rs2249131   2           176167367   0.98     1        C    T    HOXD3        intronic
rs2857532   2           176168555   0.98     1        A    G    HOXD3        intronic
rs1051929   2           176172026   1        1        T    C    HOXD3        synonymous
rs711830    2           176172583   1        1        A    G    HOXD3        3′-UTR
rs1318778   2           176173103   1        1        C    G    HOXD3
rs1549334   2           176174469   1        1        G    A    HOXD3
rs6433571   2           176174850   0.98     1        G    T    HOXD3
rs2072590   2           176177905   1        1        A    C    AC009336.24  intronic
rs6755766   2           176178477   0.96     1        T    C    AC009336.24  intronic
rs6755777   2           176178498   0.99     1        T    G    AC009336.24  intronic
rs1562315   2           176180754   0.98     1        T    A    HOXD-AS1     intronic

AFR, African samples; AMR, Admixed American samples; ASN, East Asian samples; EUR, European samples; LD, linkage disequilibrium; SNP, single nucleotide polymorphism; Ref, reference allele; Alt, alternate allele.

Functional annotation using RegulomeDB

RegulomeDB was used to annotate these 19 genetic variants with known and predicted regulatory elements. Three variants, rs1562315, rs2551802 and rs6433571, received a score of 2b and are therefore likely to affect TF binding + any motif + DNase Footprint + DNase peak. Of the remaining variants, 12 received a score of 4 (TF binding + DNase peak) and four received a score of 5 (TF binding or DNase peak). Detailed results are given in Table 2.

Table 2: Functional annotation results using RegulomeDB

SNP         chromosome  pos (hg38)  Ref  Alt  RegulomeDB score
rs1562315   2           176180754   T    A    2b
rs2551802   2           176157430   C    G    2b
rs6433571   2           176174850   G    T    2b
rs1051929   2           176172026   T    C    4
rs1318778   2           176173103   C    G    4
rs1549334   2           176174469   G    A    4
rs2072590   2           176177905   A    C    4
rs2249131   2           176167367   C    T    4
rs2252894   2           176159194   G    C    4
rs2252895   2           176159192   A    G    4
rs2857538   2           176159533   C    T    4
rs2857540   2           176161970   G    T    4
rs6755766   2           176178477   T    C    4
rs6755777   2           176178498   T    G    4
rs711830    2           176172583   A    G    4
rs2113559   2           176166371   A    G    5
rs2857532   2           176168555   A    G    5
rs4972504   2           176153998   T    C    5
rs717852    2           176166895   C    T    5

1a, eQTL + TF binding + matched TF motif + matched DNase Footprint + DNase peak; 1b, eQTL + TF binding + any motif + DNase Footprint + DNase peak; 1c, eQTL + TF binding + matched TF motif + DNase peak; 1d, eQTL + TF binding + any motif + DNase peak; 1e, eQTL + TF binding + matched TF motif; 1f, eQTL + TF binding / DNase peak; 2a, TF binding + matched TF motif + matched DNase Footprint + DNase peak; 2b, TF binding + any motif + DNase Footprint + DNase peak; 2c, TF binding + matched TF motif + DNase peak; 3a, TF binding + any motif + DNase peak; 3b, TF binding + matched TF motif; 4, TF binding + DNase peak; 5, TF binding or DNase peak; 6, other.
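The score legend above can be read as a simple lookup from score to supporting evidence. The following minimal sketch tallies the scores listed in Table 2 and prints the evidence category for each; the dictionary and function names are illustrative only and do not correspond to any RegulomeDB interface.

```python
from collections import Counter

# Evidence descriptions copied from the Table 2 legend (only the scores that occur).
SCORE_EVIDENCE = {
    "2b": "TF binding + any motif + DNase Footprint + DNase peak",
    "4": "TF binding + DNase peak",
    "5": "TF binding or DNase peak",
}

# RegulomeDB scores of the 19 variants, transcribed from Table 2.
VARIANT_SCORES = {
    "rs1562315": "2b", "rs2551802": "2b", "rs6433571": "2b",
    "rs1051929": "4", "rs1318778": "4", "rs1549334": "4", "rs2072590": "4",
    "rs2249131": "4", "rs2252894": "4", "rs2252895": "4", "rs2857538": "4",
    "rs2857540": "4", "rs6755766": "4", "rs6755777": "4", "rs711830": "4",
    "rs2113559": "5", "rs2857532": "5", "rs4972504": "5", "rs717852": "5",
}

def tally_scores(variant_scores):
    """Count how many variants fall into each RegulomeDB score category."""
    return Counter(variant_scores.values())

counts = tally_scores(VARIANT_SCORES)
for score in sorted(counts):
    print(f"score {score}: {counts[score]} variants ({SCORE_EVIDENCE[score]})")
```

Lower scores indicate stronger combined evidence, so the three 2b variants are the most strongly supported regulatory candidates among the 19.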

Functional annotation using PhenoScanner

Using PhenoScanner (version 1.1), we found that these 19 genetic variants were significantly associated with the expression of nearby genes, including HOXD-AS1, HOXD3, HOXD1, HOXD4, ATP5G3, HOXD9, HOXD11, KIAA1715, MTX2, LINC01116, HOXD-AS2, HOXD8, and HOXD10, in 32 human tissues. These tissues are Adipose subcutaneous, Adipose visceral omentum, Artery tibial, Brain cerebellar hemisphere, Brain hippocampus, Brain nucleus accumbens basal ganglia, Brain putamen basal ganglia, Breast mammary tissue, Cells transformed fibroblasts, Colon sigmoid, Colon transverse, Esophagus gastroesophageal junction, Esophagus mucosa, Esophagus muscularis, Heart atrial appendage, Lung, Lymphoblastoid cell lines, Muscle skeletal, Nerve tibial, Ovary, Pancreas, Peripheral blood, Skin, Skin not sun exposed suprapubic, Skin sun exposed lower leg, Small intestine terminal ileum, Spleen, Stomach, Testis, Thyroid, Uterus and Whole blood. Notably, these genetic variants were significantly associated with the expression of HOXD1 and HOXD3 in human ovary tissue, as described in Table 3. Detailed results for all 32 human tissues are given in Supplementary Table 1.

Table 3: 19 genetic variants and gene expression in human ovary tissue

DISCUSSION

GWAS have reported new variants that are associated with OC risk [1]. However, the exact genetic mechanisms underlying these OC susceptibility variants are still unclear [2]. Evidence shows that associations between gene expression and OC risk alleles may connect risk variants to their putative target genes/transcripts and biological pathways [2]. Zhao et al. selected seven OC risk variants: rs3814113 on 9p22, rs2072590 on 2q31, rs2665390 on 3q25, rs10088218, rs1516982 and rs10098821 on 8q24, and rs2363956 on 19p13 [2]. They evaluated the associations between gene expression and OC risk alleles using whole-genome mRNA expression data from 121 lymphoblastoid cell lines, derived from 74 unrelated familial ovarian cancer patients and 47 unrelated non-cancer family controls [2]. They identified two cis-associations: one between rs10098821 and c-Myc, and one between rs2072590 and HS.565379 [2].

The OC risk-associated SNP rs2072590 lies in non-coding DNA downstream of HOXD3 and upstream of HOXD1, and it tags SNPs in the HOXD3 3′ UTR [3]. However, Zhao et al. did not report any significant association between rs2072590 and HOXD1 or HOXD3. We hypothesized that the non-coding rs2072590 variant may contribute to OC susceptibility by regulating the expression of HOXD1 and HOXD3. We therefore conducted a functional annotation of the rs2072590 variant using RegulomeDB (version 1.1) [4], HaploReg (version 4.1) [5], and PhenoScanner (version 1.1) [6].

Using HaploReg, we identified 19 genetic variants tagged by the rs2072590 variant with r² ≥ 0.8. Using RegulomeDB, we found that three of these variants are likely to affect TF binding + any motif + DNase Footprint + DNase peak, and that most of the remaining variants are likely to affect TF binding + DNase peak. Using PhenoScanner (version 1.1), we found that these 19 genetic variants were significantly associated with the expression of nearby genes, especially HOXD1 and HOXD3 in human ovary tissue.

In addition to OC, comprehensive functional annotations have been conducted for other complex human diseases, including colorectal cancer [7, 8], prostate cancer [9–11], breast cancer [12], multiple sclerosis [13], and Alzheimer’s disease [14]. Collectively, we think that our results provide further insight into the genetic architecture of inherited susceptibility to OC, as previous studies did for those diseases [7–14].

MATERIALS AND METHODS

LD analysis using HaploReg

HaploReg is a tool for exploring annotations of the noncoding genome at variants on haplotype blocks [5]. HaploReg includes LD information from the 1000 Genomes Project, chromatin state and protein binding annotation from the Roadmap Epigenomics and the Encyclopedia of DNA Elements (ENCODE) projects, sequence conservation across mammals, the effect of SNPs on regulatory motifs, and the effect of SNPs on gene expression from eQTL studies [5]. We used HaploReg (version 4.1) to identify the variants tagged by rs2072590, using the LD information from the 1000 Genomes Project (EUR) with r² ≥ 0.8 [5].
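For readers unfamiliar with the two LD measures reported in Table 1, the sketch below computes D, D′ and r² for a pair of biallelic loci from haplotype and allele frequencies and applies the r² ≥ 0.8 tagging threshold. It uses the standard textbook definitions and is not HaploReg's own code; the function name and the example frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical frequencies: allele A (30%) always occurs on haplotypes carrying B (32%).
d, d_prime, r2 = ld_stats(p_ab=0.30, p_a=0.30, p_b=0.32)
is_tagged = r2 >= 0.8  # threshold used in this study
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}, tagged = {is_tagged}")
```

The example shows why several variants in Table 1 have D′ = 1 but r² < 1: D′ reaches 1 whenever one of the four possible haplotypes is absent, whereas r² reaches 1 only when the allele frequencies at the two loci also match.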

Functional annotation using RegulomeDB

RegulomeDB (version 1.1) is a database that annotates SNPs with known and predicted regulatory elements in the intergenic regions of the human genome [4]. Known and predicted regulatory DNA elements include regions of DNase hypersensitivity, binding sites of transcription factors, and promoter regions that have been biochemically characterized to regulate transcription [4]. RegulomeDB (version 1.1) includes public datasets from the Gene Expression Omnibus (GEO), the ENCODE project, and the published literature [4].

Functional annotation using PhenoScanner

PhenoScanner (version 1.1) is a curated database holding publicly available results from large-scale GWAS [6]. The motivation for creating this tool was to facilitate “phenome scans”, the cross-referencing of genetic variants with a broad range of phenotypes, to aid the understanding of disease pathways and biology [6]. The catalogue currently contains nearly 3 billion associations and over 10 million unique SNPs [6]. The results are aligned across traits to the same effect and non-effect alleles for each SNP [6].
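Aligning results to the same effect and non-effect alleles means that an effect estimate reported for the opposite allele has its sign flipped before results are compared. The sketch below illustrates this idea under simple assumptions; it is not PhenoScanner's implementation, the function name is hypothetical, and the beta values in the example are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def harmonize_beta(beta, effect_allele, other_allele, ref_effect, ref_other):
    """Align an effect estimate to a reference (effect, non-effect) allele pair.

    Returns the aligned beta, or None if the alleles cannot be matched.
    Note: palindromic pairs (A/T, C/G) are strand-ambiguous and are not
    resolved by this sketch.
    """
    pair = (effect_allele.upper(), other_allele.upper())
    ref = (ref_effect.upper(), ref_other.upper())
    swapped_ref = (ref[1], ref[0])
    if pair == ref:
        return beta        # already aligned
    if pair == swapped_ref:
        return -beta       # reported for the other allele: flip the sign
    # Try the opposite strand by complementing both alleles.
    comp = tuple(COMPLEMENT.get(a) for a in pair)
    if comp == ref:
        return beta
    if comp == swapped_ref:
        return -beta
    return None

# rs2072590 has Ref = A and Alt = C (Table 1); align everything to effect allele C.
print(harmonize_beta(0.15, "C", "A", "C", "A"))   # unchanged
print(harmonize_beta(-0.12, "A", "C", "C", "A"))  # sign flipped
```

After alignment, a positive estimate always refers to the same allele, so directions of effect can be compared across tissues and traits.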

CONFLICTS OF INTEREST

The authors declare no competing financial interests.

REFERENCES

1. Chen K, Ma H, Li L, Zang R, Wang C, Song F, Shi T, Yu D, Yang M, Xue W, Dai J, Li S, Zheng H, et al. Genome-wide association study identifies new susceptibility loci for epithelial ovarian cancer in Han Chinese women. Nat Commun. 2014; 5:4682.

2. Zhao H, Shen J, Wang D, Guo Y, Gregory S, Medico L, Hu Q, Yan L, Odunsi K, Lele S, Liu S. Associations between gene expression variations and ovarian cancer risk alleles identified from genome wide association studies. PLoS One. 2012; 7:e47962.

3. Goode EL, Chenevix-Trench G, Song H, Ramus SJ, Notaridou M, Lawrenson K, Widschwendter M, Vierkant RA, Larson MC, Kjaer SK, Birrer MJ, Berchuck A, Schildkraut J, et al. A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24. Nat Genet. 2010; 42:874–9.

4. Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, Karczewski KJ, Park J, Hitz BC, Weng S, Cherry JM, Snyder M. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012; 22:1790–7.

5. Ward LD, Kellis M. HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res. 2016; 44:D877–81.

6. Staley JR, Blackshaw J, Kamat MA, Ellis S, Surendran P, Sun BB, Paul DS, Freitag D, Burgess S, Danesh J, Young R, Butterworth AS. PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics. 2016; 32:3207–9.

7. Lu X, Cao M, Han S, Yang Y, Zhou J. Colorectal cancer risk genes are functionally enriched in regulatory pathways. Sci Rep. 2016; 6:25347.

8. Yao L, Tak YG, Berman BP, Farnham PJ. Functional annotation of colon cancer risk SNPs. Nat Commun. 2014; 5:5114.

9. Jiang J, Jia P, Shen B, Zhao Z. Top associated SNPs in prostate cancer are significantly enriched in cis-expression quantitative trait loci and at transcription factor binding sites. Oncotarget. 2014; 5:6168–77. https://doi.org/10.18632/oncotarget.2179.

10. Hazelett DJ, Rhie SK, Gaddis M, Yan C, Lakeland DL, Coetzee SG, Henderson BE, Noushmehr H, Cozen W, Kote-Jarai Z, Eeles RA, Easton DF, Haiman CA, et al. Comprehensive functional annotation of 77 prostate cancer risk loci. PLoS Genet. 2014; 10:e1004102.

11. Lu Y, Zhang Z, Yu H, Zheng SL, Isaacs WB, Xu J, Sun J. Functional annotation of risk loci identified through genome-wide association studies for prostate cancer. Prostate. 2011; 71:955–63.

12. Rhie SK, Coetzee SG, Noushmehr H, Yan C, Kim JM, Haiman CA, Coetzee GA. Comprehensive functional annotation of seventy-one breast cancer risk Loci. PLoS One. 2013; 8:e63925.

13. Disanto G, Kjetil Sandve G, Ricigliano VA, Pakpoor J, Berlanga-Taylor AJ, Handel AE, Kuhle J, Holden L, Watson CT, Giovannoni G, Handunnetthi L, Ramagopalan SV. DNase hypersensitive sites and association with multiple sclerosis. Hum Mol Genet. 2014; 23:942–8.

14. Jiang Q, Jin S, Jiang Y, Liao M, Feng R, Zhang L, Liu G, Hao J. Alzheimer’s Disease Variants with the Genome-Wide Significance are Significantly Enriched in Immune Pathways and Active in Immune Cells. Mol Neurobiol. 2017; 54:594–600.


All site content, except where otherwise noted, is licensed under a Creative Commons Attribution 4.0 License.
PII: 21902