Identification of the Bona fide Differentially Methylated Gene Markers among Cancers  

Hongbo Liu1 , Zhe Li2 , Jing Ding3 , Jie Liu3 , Yan Zhang1
1 College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
2 College of Pharmacy, Harbin Medical University, Harbin, 150081, China
3 The Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, China
Computational Molecular Biology, 2013, Vol. 3, No. 2   doi: 10.5376/cmb.2013.03.0002
Received: 02 Aug., 2013    Accepted: 19 Aug., 2013    Published: 28 Oct., 2013
© 2013 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:

Liu et al., 2013, Identification of the Bona fide Differentially Methylated Gene Markers among Cancers, Computational Molecular Biology, Vol.3, No.2 6-15 (doi: 10.5376/cmb.2013.03.0002)


DNA methylation plays important roles in the development of cancers. Previous studies have identified differentially methylated sites (DMSs) between cancer and normal controls. However, the methylation variations across multiple cancers have not been characterized. In this study, we identified DMSs among six human cancers (C-DMSs) and DMSs among five normal control tissues (T-DMSs). The C-DMSs overlap extensively with the T-DMSs. By excluding the T-DMSs from the C-DMSs, 4159 bona fide C-DMSs were selected as methylation variations across multiple cancers. Further analysis confirmed the roles of bona fide C-DMSs in the regulation of cancer-related gene expression differences. Moreover, the genes related to these bona fide C-DMSs were enriched in functional categories such as cell membrane components, cell adhesion, cell migration, immune response and cell proliferation, as well as in the pathways in cancer and bladder cancer. In addition, twenty-eight of these genes are targeted by hsa-miR-323, which participates in tumorigenesis. Finally, we identified potential cancer-related genes by extracting a protein interaction sub-network. This study provides a new framework for mining potential cancer-specific methylation markers and oncogenes.

DNA methylation; Bona fide C-DMSs; Methylation variation; Cancer-specific methylation markers

DNA methylation plays an important role in the development of cancers (Esteller, 2008). Cancer is a complex collection of diseases that differ according to the tissue of origin. Most cancer deaths are due to the metastasis of cancer cells from their original site to another area of the body (Rodenhiser, 2009; Bhatia et al., 2012). Besides genetic contributors to metastasis, epigenetic alterations are also involved. Promoter DNA methylation of some genes takes part in a wide variety of essential molecular pathways related to metastasis (Heng et al., 2010). A recent study by Fang et al. characterized the methylomes of breast cancers with diverse metastatic behavior (Zhang et al., 2006). However, the cancer-specific alterations and their effects on carcinogenesis and metastasis remain obscure.

The investigation of cancer-specific alterations in DNA methylation enables the mining of the hallmarks of human malignancies. Previous studies have identified differentially methylated regions (DMRs) between cancer and normal controls using bioinformatics tools such as MethMarker (Schuffler et al., 2009). For instance, Costello et al. identified aberrantly methylated CpG islands in tumors and tumor-type-specific methylation patterns (Costello et al., 2000). In addition, an analysis of colon cancer by Irizarry et al. demonstrated methylation alterations in CpG island shores (Irizarry et al., 2009). They also found that the cancer-specific DMRs between colon cancer and matched normal mucosa overlap significantly with DMRs among three normal tissues (brain, liver and spleen). Furthermore, Hansen et al. identified colon cancer-specific differentially methylated regions that may contribute to tumor heterogeneity (Hansen et al., 2011). Identifying more cancer-specific abnormal methylation markers should benefit the mining of therapeutic and diagnostic indicators, as DNA methylation is somatically heritable and reversible.

High-throughput methylation profiling technologies make it possible to explore the methylation variations among multiple cancers. The Illumina HumanMethylation27 BeadChip allows researchers to interrogate the methylation status of more than 27,000 highly informative CpG sites spanning 14,475 genes, including 1,126 cancer-related genes (He et al., 2007). This high-density panel lets researchers profile up to 12 samples in parallel, which makes it adequate for case-control studies. Thus, this technology has been widely used to profile the methylation patterns of cancers and their normal control tissues (Calin and Croce, 2006; Wang et al., 2007; Yoon and De Micheli, 2005; Weber et al., 2005).

However, the location and function of DMRs among different cancers (C-DMRs) are not yet comprehensively understood. In this study, we therefore addressed the following two questions by analyzing the methylation states of more than 27,000 CpG sites located in gene promoters in six different cancers and five corresponding normal controls. First, where are the methylation variations among multiple cancers? Taking into account the DMRs among normal tissues (T-DMRs), which may play a role in cellular identity and the regulation of tissue-specific genome function (Rakyan et al., 2008), we analyzed the relationships between C-DMRs and T-DMRs and identified the bona fide C-DMRs. Second, what are the functional roles of these methylation variations among multiple cancers? To this end, we carried out a comprehensive study of the regulatory mechanisms, functional annotation and protein interactions of the genes related to bona fide C-DMRs.

1 Results
1.1 DNA methylation discriminates human tissue types
To analyze the methylation patterns in different human cancers and their corresponding normal tissues, we obtained the methylation states of 27543 CpGs in 297 samples from six cancers and five matched normal control tissues (Materials and Methods). To view the methylation patterns in different cancers and tissues, we performed hierarchical clustering using Euclidean distance. The hierarchical clustering of all 297 samples shows similar methylation patterns among the samples representing the same tissue or cancer. The hierarchical clustering based on the mean methylation levels among all the replicate samples per tissue/cancer also perfectly discriminated among different tissue types, regardless of normal or disease status (Figure 1A).



Figure 1 Methylation pattern of 27543 CpGs in various cancers and tissues


For example, there are three main methylation clusters: the first encompasses normal plasma, multiple myeloma cancer and plasma cell leukemia; the second encompasses normal brain and glioblastoma cancer; and the third encompasses normal prostate and prostate cancer. As an exception, colorectal cancer clustered with breast cancer, and normal colorectal tissue clustered with normal breast tissue. A possible interpretation of this observation is the previous finding that colorectal cancer and breast cancer share common susceptibility genes (Garcia-Patino et al., 1998) and aberrant methylation of common suppressor genes (Agrawal et al., 2007). Hierarchical clustering using Pearson correlation gives exactly the same observations as those mentioned above. This indicates that the methylation patterns among different states of the same tissue are more similar than those among different tissues.

1.2 The similarity of methylation pattern between cancer and corresponding normal control
We explored the global similarity of the methylation patterns of CpG sites between each cancer and its corresponding normal control. Interestingly, multiple myeloma cancer and plasma cell leukemia showed obviously lower methylation levels than the other cancers, and their corresponding normal tissue, plasma, also showed lower methylation levels than the other normal tissues (Figure 1B). This reveals a similar distribution of methylation levels between each cancer and its corresponding normal tissue.

Further analysis of the methylation of CpGs inside and outside CpG islands showed the same result. The methylation levels of CpGs in CpG islands are lower than those of CpGs outside CpG islands. For the CpGs in CpG islands, the methylation levels in cancers were slightly higher than those in normal tissues (Figure 1C), which is consistent with previous reports of hypermethylation of CpG islands in promoter regions (Koga et al., 2009). We then mapped the methylation levels upstream of the transcription start site (TSS). The methylation level increased gradually with increasing distance upstream of the TSS in all cancers/tissues (Figure 1D). All these results reveal that cancers have methylation levels similar to those of their corresponding normal tissues. Thus, it is necessary to take the methylation differences among tissues into account when studying the methylation differences among different cancers.

1.3 Identification of differentially methylated sites among multiple cancers
To mine cancer-specific methylation markers, we used QDMR to identify the DMSs among multiple cancers (C-DMSs) and the DMSs among multiple normal tissues (T-DMSs). QDMR thus yields two entropy values for each CpG site. The entropy representing the methylation difference across the six cancers ranges from 0.187 to 19.057, while the other, representing the methylation difference across the five normal tissues, ranges from 0.194 to 17.673 (Figure 2A and B). The lower the entropy, the greater the methylation difference across the compared cancers or tissues.
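The inverse relationship between entropy and methylation difference can be illustrated with a minimal sketch. It uses plain Shannon entropy over the normalized methylation levels of a single CpG, which is an assumption made purely for illustration: QDMR applies its own adjusted entropy (Zhang et al., 2011), so the values computed here are not comparable with the ranges reported above. The function name and the example methylation levels are hypothetical.

```python
import math

def shannon_entropy(levels):
    """Shannon entropy (in bits) of methylation levels normalized to sum to 1.

    Illustration only: this is NOT the adjusted entropy used by QDMR.
    """
    total = sum(levels)
    if total == 0:
        # No methylation in any sample: treat the site as perfectly uniform.
        return math.log2(len(levels))
    probs = [x / total for x in levels]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical methylation levels of one CpG across six cancers.
uniform_site = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]         # same level in every cancer
specific_site = [0.9, 0.02, 0.02, 0.02, 0.02, 0.02]   # methylated mainly in one cancer

h_uniform = shannon_entropy(uniform_site)
h_specific = shannon_entropy(specific_site)
print(f"uniform site:  {h_uniform:.3f} bits")
print(f"specific site: {h_specific:.3f} bits")
```

In this toy case the uniformly methylated site reaches the maximum entropy for six samples, while the site methylated mainly in one cancer scores lower, matching the direction of the rule stated above.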


Figure 2 Methylation patterns of C-DMSs, T-DMSs, Cs-UMSs and T-UMSs

Based on the quantitative methylation difference, all CpGs were classified as 9645 C-DMSs and 17898 Cs-UMSs using the threshold for six samples given in QDMR (Figure 2A). Using the threshold for five normal tissues, all CpGs were classified as 8480 T-DMSs and 19063 T-UMSs (Figure 2B). The number of C-DMSs exceeds that of T-DMSs, which indicates that more CpGs are differentially methylated across multiple cancers than across normal tissues. Most C-DMSs show lower methylation levels in multiple myeloma cancer and plasma cell leukemia than in the other types of cancer (Figure 2C). Likewise, most T-DMSs show lower methylation levels in plasma than in the other normal tissues (Figure 2D). This suggests that C-DMSs and T-DMSs have similar methylation patterns among different cancers/tissues. Moreover, both Cs-UMSs and T-UMSs show hypomethylation in all cancers/tissues (Figure 2E and F).

1.4 Selection of bona fide C-DMSs
Further analysis revealed that 57% (5486/9645) of C-DMSs are also identified as T-DMSs, compared to only 31% (8480/27543) expected by chance (P<0.0001, Figure 3A). Thus, T-DMSs should be taken into account when identifying the bona fide C-DMSs. Here, the bona fide C-DMSs were defined as the CpG sites identified as C-DMSs across cancers but as T-UMSs across normal tissues. Using these criteria, we selected 4159 bona fide C-DMSs among six cancers. These CpGs are differentially methylated among cancers but not among normal tissues, and may therefore be cancer-specific methylation markers. The function of the genes related to these bona fide C-DMSs may be helpful for understanding the roles of DNA methylation in cancers.
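The counts above can be checked with a short worked example. The text does not state which statistical test produced the reported P value, so the one-sided hypergeometric test below is an assumption chosen only to illustrate how such an overlap could be assessed; the helper function names are hypothetical. The tail probability is computed in log space because it is far too small to represent as an ordinary floating-point number.

```python
import math

TOTAL_CPGS = 27543   # CpGs analyzed
T_DMS = 8480         # DMSs among normal tissues
C_DMS = 9645         # DMSs among cancers
OVERLAP = 5486       # C-DMSs that are also T-DMSs

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log10_hypergeom_tail(total, marked, drawn, observed):
    """log10 of P(X >= observed) when `drawn` items are sampled without
    replacement from `total` items, `marked` of which are of interest."""
    log_denominator = log_choose(total, drawn)
    log_terms = [
        log_choose(marked, k) + log_choose(total - marked, drawn - k) - log_denominator
        for k in range(observed, min(marked, drawn) + 1)
    ]
    peak = max(log_terms)  # log-sum-exp trick to avoid underflow
    log_p = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return log_p / math.log(10)

observed_fraction = OVERLAP / C_DMS        # share of C-DMSs that are T-DMSs
expected_fraction = T_DMS / TOTAL_CPGS     # share expected by chance
bona_fide = C_DMS - OVERLAP                # C-DMSs that are T-UMSs
log10_p = log10_hypergeom_tail(TOTAL_CPGS, T_DMS, C_DMS, OVERLAP)

print(f"observed {observed_fraction:.1%}, expected {expected_fraction:.1%}")
print(f"bona fide C-DMSs: {bona_fide}")
print(f"log10(P) under the assumed hypergeometric test: {log10_p:.0f}")
```

Subtracting the overlap from the C-DMS count reproduces the 4159 bona fide C-DMSs, and under the assumed test the enrichment is consistent with the reported P<0.0001.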


Figure 3 Overlap of T-DMRs and Cs-DMRs

1.5 The function of genes with differential methylation sites
To explore the function of the genes with differential methylation sites, we carried out functional enrichment analysis using DAVID for the genes related to the 4159 bona fide C-DMSs among six cancers. The genes related to bona fide differentially methylated sites are enriched in cancer-related functions such as cell membrane components, cell adhesion, cell migration, immune response and cell proliferation (Table 1). These genes are also enriched in some important cancer signaling pathways. Twenty-eight genes are targeted by hsa-miR-323, which participates in tumorigenesis (Plaisier et al., 2012). This suggests that miRNAs may be potential regulators of dynamic DNA methylation and may serve as epigenetic marks for multiple cancers. These results reveal the potential roles of DNA methylation in cancer through the regulation of cancer genes.


Table 1 Functional enrichment analysis for genes related with bona fide c-DMSs

1.6 Identification of potential cancer-related genes by protein interaction sub-network
Furthermore, we obtained a sub-network from the protein interaction network by selecting the proteins encoded by the genes with bona fide C-DMSs and their nearest neighbor proteins (Figure 4A). The proteins encoded by the genes with bona fide C-DMSs tend to interact with other proteins. In this network, ACSM3 interacts with the largest number of proteins, and this gene has been reported to be associated with liver, colon and breast cancer (Chen et al., 2002). In addition, functional enrichment analysis of the proteins in this network reveals that they are potential cancer-related genes (Table 2).
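The sub-network extraction described above can be sketched as follows. This is a minimal illustration on a made-up edge list, not the authors' pipeline: the protein names, the function names, and the choice to keep only interactions that touch at least one seed protein (rather than also keeping interactions between two neighbors) are all assumptions.

```python
from collections import defaultdict

def nearest_neighbor_subnetwork(edges, seeds):
    """Keep every interaction that involves at least one seed protein,
    i.e. the seeds plus their direct (nearest) neighbors."""
    seeds = set(seeds)
    return [(a, b) for a, b in edges if a in seeds or b in seeds]

def degrees(edges):
    """Number of interaction partners of each protein in an edge list."""
    counts = defaultdict(int)
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)

# Toy interaction network with hypothetical protein names.
ppi_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P5"), ("P5", "P6"), ("P6", "P7"),
]
# Hypothetical proteins encoded by genes with bona fide C-DMSs.
seed_proteins = {"P1", "P5"}

sub_edges = nearest_neighbor_subnetwork(ppi_edges, seed_proteins)
sub_degrees = degrees(sub_edges)
hub = max(sub_degrees, key=sub_degrees.get)  # most connected protein
print(sub_edges)
print(hub, sub_degrees[hub])
```

Ranking proteins by degree in the resulting sub-network is one simple way to find the most connected protein, which is the role ACSM3 plays in the network reported here.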


Figure 4 Protein interaction sub-network based on human protein interaction network



Table 2 Functional enrichment analysis for Proteins/genes in protein interaction sub-network

2 Discussion
In this study, we focused mainly on differentially methylated CpG sites among cancers. Through a series of bioinformatic analyses, including cluster analysis, identification of differential sites, network building and functional enrichment analysis, we explored the characteristics of differentially methylated CpG sites and argued that the bona fide differentially methylated sites among six cancers may be the real functional elements related to DNA methylation in cancers. Our study proposes a new strategy for identifying cancer-specific methylation markers, which may be useful for cancer-specific diagnosis, treatment and prognosis.

3 Materials and Methods
3.1 DNA Methylation Data
The DNA methylation data were downloaded from the Gene Expression Omnibus (GEO) repository under accession numbers "GSE17648", "GSE21304", "GSE22867", "GSE26319" and "GSE26990" (Barrett et al., 2009). All these data were profiled with the Illumina HumanMethylation27 BeadChip (Human Methylation 27_270596_v.1.2), which allows researchers to interrogate 27,578 highly informative CpG sites located within the proximal promoter regions of the transcription start sites of 14,475 consensus coding sequences in the NCBI Database (Genome Build 36). In this study, we used the 27,543 CpGs whose methylation levels were detected in all 297 samples from six cancers (colorectal cancer, multiple myeloma cancer, plasma cell leukemia, glioblastoma cancer, prostate cancer, and breast cancer) and five matched normal control tissues (colorectal, plasma, brain, prostate and breast). For each CpG site, the methylation level in a cancer/tissue is the mean of the methylation levels in all the replicate samples of that cancer/tissue.
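The per-group averaging described above can be sketched as follows. The data layout (nested dictionaries keyed by sample and CpG identifiers), the function name, and all sample names and values are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict
from statistics import mean

def group_means(sample_levels, sample_groups):
    """Average each CpG's methylation level over the replicates of each group.

    sample_levels: {sample_id: {cpg_id: methylation level}}
    sample_groups: {sample_id: cancer or tissue name}
    """
    collected = defaultdict(lambda: defaultdict(list))
    for sample_id, levels in sample_levels.items():
        group = sample_groups[sample_id]
        for cpg_id, level in levels.items():
            collected[group][cpg_id].append(level)
    return {
        group: {cpg_id: mean(values) for cpg_id, values in cpgs.items()}
        for group, cpgs in collected.items()
    }

# Hypothetical replicate samples and methylation levels.
sample_levels = {
    "sample_1": {"cpg_a": 0.2, "cpg_b": 0.8},
    "sample_2": {"cpg_a": 0.4, "cpg_b": 0.6},
    "sample_3": {"cpg_a": 0.1, "cpg_b": 0.9},
}
sample_groups = {
    "sample_1": "example_cancer",
    "sample_2": "example_cancer",
    "sample_3": "example_normal",
}

means = group_means(sample_levels, sample_groups)
print(means)
```

Each cancer or tissue is thereby reduced to a single methylation profile, which is the input assumed by the clustering and entropy steps that follow.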

3.2 Hierarchical clustering
Both the hierarchical clustering of all CpGs in all 297 samples and the hierarchical clustering in six cancers and five normal tissues were performed with GenePattern (http://genepattern. (Reich et al., 2006). Euclidean distance was used as the distance measure for both column and row clustering. To avoid bias arising from the choice of distance measure, we also repeated the hierarchical clustering in six cancers and five normal tissues using Pearson correlation. All other parameters were left at the GenePattern defaults.
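The two distance measures named above can be compared on a toy example. The sketch below does not reproduce GenePattern's clustering; it only computes Euclidean distance and a Pearson-based distance (taken here as one minus the correlation, which is an assumption about how correlation is turned into a distance) and reports the closest pair of profiles, which would be the first merge in an agglomerative clustering. Profile names and values are hypothetical.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two methylation profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """1 - Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / math.sqrt(var_x * var_y)

def closest_pair(profiles, distance):
    """Names of the two profiles with the smallest pairwise distance."""
    names = sorted(profiles)
    pairs = [
        (distance(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    _, first, second = min(pairs)
    return first, second

# Hypothetical mean methylation profiles over four CpGs.
profiles = {
    "tissue_A_normal": [0.1, 0.8, 0.3, 0.9],
    "tissue_A_cancer": [0.2, 0.9, 0.3, 0.8],
    "tissue_B_normal": [0.9, 0.1, 0.7, 0.2],
}

print(closest_pair(profiles, euclidean_distance))
print(closest_pair(profiles, pearson_distance))
```

In this toy case both measures pair the cancer profile with the normal profile of the same tissue, which mirrors the agreement between the two clusterings reported in the Results.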

3.3 Identification of C-DMSs and T-DMSs
The C-DMSs and T-DMSs used in this paper were identified with QDMR, which we developed in a previous study (Zhang et al., 2011). For each CpG site, the methylation differences among six cancers were quantified by QDMR. The CpG sites with entropy lower than the DMR threshold for six samples given by QDMR (3.259) were identified as C-DMSs. In the same way, we quantified the methylation differences of each CpG site among five normal control tissues and identified as T-DMSs the sites with entropy lower than the threshold for five samples (2.701).
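The thresholding step, combined with the definition of bona fide C-DMSs given in the Results, can be sketched as follows. The two thresholds are the values quoted above; the function name, the dictionary layout, and the entropy values are hypothetical, and the sketch assumes that both dictionaries cover the same CpG sites. Computing the entropies themselves is left to QDMR and is not reproduced here.

```python
C_DMS_THRESHOLD = 3.259  # QDMR threshold for six samples (from the text)
T_DMS_THRESHOLD = 2.701  # QDMR threshold for five samples (from the text)

def select_bona_fide(cancer_entropy, tissue_entropy):
    """Classify CpGs by entropy and return (C-DMSs, T-DMSs, bona fide C-DMSs).

    A bona fide C-DMS is differentially methylated among cancers (C-DMS)
    but uniformly methylated among normal tissues (T-UMS, i.e. not a T-DMS).
    """
    c_dms = {cpg for cpg, h in cancer_entropy.items() if h < C_DMS_THRESHOLD}
    t_dms = {cpg for cpg, h in tissue_entropy.items() if h < T_DMS_THRESHOLD}
    return c_dms, t_dms, c_dms - t_dms

# Hypothetical entropy values for four CpG sites.
cancer_entropy = {"cpg_1": 1.2, "cpg_2": 2.5, "cpg_3": 8.0, "cpg_4": 9.1}
tissue_entropy = {"cpg_1": 1.0, "cpg_2": 6.4, "cpg_3": 7.5, "cpg_4": 2.0}

c_dms, t_dms, bona_fide = select_bona_fide(cancer_entropy, tissue_entropy)
print(sorted(c_dms), sorted(t_dms), sorted(bona_fide))
```

Only the site that varies among cancers while staying uniform among normal tissues survives the filter, which is the property that distinguishes bona fide C-DMSs from tissue-driven differences.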

The authors thank Scientific Research Fund of Heilongjiang Provincial Education Department for funding. This work is funded by the Scientific Research Fund of Heilongjiang Provincial Education Department [12521270].

Agrawal A., Murphy R.F., and Agrawal D.K., 2007, DNA methylation in breast and colorectal cancers, Mod Pathol, 20: 711-721

Barrett T., Troup D.B., Wilhite S.E., Ledoux P., Rudnev D., Evangelista C., Kim I.F., Soboleva A., Tomashevsky M., Marshall K.A., Phillippy K.H., Sherman P.M., Muertter R.N., and Edgar R., 2009, NCBI GEO: archive for high-throughput functional genomic data, Nucleic Acids Res, 37: D885-890
PMid:18940857 PMCid:PMC2686538

Bhatia S., Frangioni J.V., Hoffman R.M., Iafrate A.J., and Polyak K., 2012, The challenges posed by cancer heterogeneity, Nat Biotechnol, 30: 604-610

Calin G.A., and Croce C.M., 2006, MicroRNA signatures in human cancers, Nat Rev Cancer, 6: 857-866

Chen X., Cheung S.T., So S., Fan S.T., Barry C., Higgins J., Lai K.M., Ji J., Dudoit S., Ng I.O., Van De Rijn M., Botstein D., and Brown P.O., 2002, Gene expression patterns in human liver cancers, Mol Biol Cell, 13: 1929-1939
PMid:12058060 PMCid:PMC117615

Costello J.F., Fruhwald M.C., Smiraglia D.J., Rush L.J., Robertson G.P., Gao X., Wright F.A., Feramisco J.D., Peltomaki P., Lang J.C., Schuller D.E., Yu L., Bloomfield C.D., Caligiuri M.A., Yates A., Nishikawa R., Su Huang H., Petrelli N.J., Zhang X., O'dorisio M.S., Held W.A., Cavenee W.K., and Plass C., 2000, Aberrant CpG-island methylation has non-random and tumour-type-specific patterns, Nat Genet, 24: 132-138

Esteller M., 2008, Epigenetics in cancer, N Engl J Med, 358: 1148-1159

Garcia-Patino E., Gomendio B., Lleonart M., Silva J.M., Garcia J.M., Provencio M., Cubedo R., Espana P., Ramon Y Cajal S., and Bonilla F., 1998, Loss of heterozygosity in the region including the BRCA1 gene on 17q in colon cancer, Cancer Genet Cytogenet, 104: 119-123

Hansen K.D., Timp W., Bravo H.C., Sabunciyan S., Langmead B., Mcdonald O.G., Wen B., Wu H., Liu Y., Diep D., Briem E., Zhang K., Irizarry R.A., and Feinberg A.P., 2011, Increased methylation variation in epigenetic domains across cancer types, Nat Genet, 43: 768-775
PMid:21706001 PMCid:PMC3145050

He L., He X., Lim L.P., De Stanchina E., Xuan Z., Liang Y., Xue W., Zender L., Magnus J., Ridzon D., Jackson A.L., Linsley P.S., Chen C., Lowe S.W., Cleary M.A., and Hannon G.J., 2007, A microRNA component of the p53 tumour suppressor network, Nature, 447: 1130-1134

Heng H.H., Liu G., Stevens J.B., Bremer S.W., Ye K.J., and Ye C.J., 2010, Genetic and epigenetic heterogeneity in cancer: the ultimate challenge for drug therapy, Curr Drug Targets, 11(10): 1304-1316

Irizarry R.A., Ladd-Acosta C., Wen B., Wu Z., Montano C., Onyango P., Cui H., Gabo K., Rongione M., Webster M., Ji H., Potash J.B., Sabunciyan S., and Feinberg A.P., 2009, The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores, Nat Genet, 41: 178-186
PMid:19151715 PMCid:PMC2729128

Koga Y., Pelizzola M., Cheng E., Krauthammer M., Sznol M., Ariyan S., Narayan D., Molinaro A.M., Halaban R., and Weissman S.M., 2009, Genome-wide screen of promoter methylation identifies novel markers in melanoma, Genome Res, 19: 1462-1470
PMid:19491193 PMCid:PMC2720187

Plaisier C.L., Pan M., and Baliga N.S., 2012, A miRNA-regulatory network explains how dysregulated miRNAs perturb oncogenic processes across diverse cancers, Genome Res, 22: 2302-2314
PMid:22745231 PMCid:PMC3483559

Rakyan V.K., Down T.A., Thorne N.P., Flicek P., Kulesha E., Graf S., Tomazou E.M., Backdahl L., Johnson N., Herberth M., Howe K.L., Jackson D.K., Miretti M.M., Fiegler H., Marioni J.C., Birney E., Hubbard T.J., Carter N.P., Tavare S., and Beck S., 2008, An integrated resource for genome-wide identification and analysis of human tissue-specific differentially methylated regions (tDMRs), Genome Res, 18: 1518-1529
PMid:18577705 PMCid:PMC2527707

Reich M., Liefeld T., Gould J., Lerner J., Tamayo P., and Mesirov J.P., 2006, GenePattern 2.0, Nat Genet, 38: 500-501

Rodenhiser D.I., 2009, Epigenetic contributions to cancer metastasis, Clin Exp Metastasis, 26: 5-18

Schuffler P., Mikeska T., Waha A., Lengauer T., and Bock C., 2009, MethMarker: user-friendly design and optimization of gene-specific DNA methylation assays, Genome Biol, 10: R105
PMid:19804638 PMCid:PMC2784320

Wang E., Lenferink A., and O'connor-Mccourt M., 2007, Cancer systems biology: exploring cancer-associated genes on cellular networks, Cell Mol Life Sci, 64: 1752-1762

Weber M., Davies J.J., Wittig D., Oakeley E.J., Haase M., Lam W.L., and Schubeler D., 2005, Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells, Nat Genet, 37: 853-862

Yoon S., and De Micheli G., 2005, Prediction of regulatory modules comprising microRNAs and target genes, Bioinformatics, 21 Suppl 2: ii93-100

Zhang Y., Kobayashi K., Kitazawa K., Imai K., and Kobayashi M., 2006, Contribution of cooperativity and the Bohr effect to efficient oxygen transport by hemoglobins from five mammalian species, Zoolog Sci, 23: 49-55

Zhang Y., Liu H., Lv J., Xiao X., Zhu J., Liu X., Su J., Li X., Wu Q., Wang F., and Cui Y., 2011, QDMR: a quantitative method for identification of differentially methylated regions by entropy, Nucleic Acids Res, 39(9):e58
PMid:21306990 PMCid:PMC3089487
