QBioDiff: a web-based tool for quantification and interpretation of biological differences among multiple samples

Liu Hongbo1, Liu Xiaojuan2, Shang Shipeng1, Zhang Yan1
1 College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
2 Department of Rehabilitation, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
Cancer Genetics and Epigenetics, 2015, Vol. 3, No. 4   doi: 10.5376/cge.2015.03.0004
Received: 01 May, 2015    Accepted: 21 May, 2015    Published: 04 Jun., 2015
© 2015 BioPublisher Publishing Platform
This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:

Hongbo Liu, Xiaojuan Liu, Shipeng Shang, and Yan Zhang, 2015, QBioDiff: a Web-based Tool for Quantification and Interpretation of Biological Differences Among Multiple Samples, Cancer Genetics and Epigenetics, Vol. 3, No. 4, 1-4


QBioDiff is a web-based tool that uses an adapted Shannon entropy to quantify and interpret biological differences in regions of interest (ROIs) among multiple samples. The QBioDiff web server provides quantitative difference analysis tools, including difference quantification, identification of differential ROIs, result visualization, and functional analysis, for omics data profiled by high-throughput technologies such as microarrays and next-generation sequencing. In addition, standalone software is provided for preprocessing next-generation sequencing data. The platform-free and species-free nature of QBioDiff makes it applicable to various omics data. This approach is effective for the high-throughput identification of functional regions involved in various biological processes. The tools for integrated analysis and result visualization facilitate interpretation of the biological mechanisms hidden in high-throughput data.
Availability and implementation: The web server QBioDiff is available at the web site http://bioinfo.hrbmu.edu.cn/QBioDiff/

Keywords: QBioDiff; Omics Data; Biological Differences; Regions of Interest

1 Introduction
High-throughput experimental techniques using microarrays and next-generation sequencing are providing omics data on an unprecedented scale. Microarrays have already been used for quantitative monitoring of biological characteristics such as gene expression (Schena et al., 1995) and DNA methylation (Schumacher et al., 2006). More recently, next-generation sequencing has rapidly become an indispensable and widely used tool in biological research, including whole genome sequencing (Abecasis et al., 2012), transcriptome profiling (Wang et al., 2009), DNA methylome profiling (Cokus et al., 2008), mapping of DNA-binding proteins (Johnson et al., 2007), and profiling of histone modifications (Barski et al., 2007). The unprecedented scale and precision of omics data, together with effective computational tools, have enabled quantitative analysis of the dynamic biological status of the regions of interest (ROIs) in various biological processes such as development, aging and disease. In recent years, considerable effort has been devoted to identifying biological differences between case and control samples from high-throughput omics data (Bateman and Quackenbush, 2009). For example, various statistics-based methods have been developed for discovering differentially expressed genes (Pan, 2002). Shannon entropy, a quantitative measure of difference and uncertainty in a data set, has also been widely applied to identify differentially expressed genes among multiple tissues (Schug et al., 2005) and tissue-specific regulatory elements (Shen et al., 2012). For multiple samples, entropy-based methods have the advantage of quantifying biological difference and yielding precise differential ROIs. In our previous work, we developed an entropy-based method, QDMR, for quantifying DNA methylation difference and identifying differentially methylated regions (Zhang et al., 2011). The correlation among different kinds of biological difference is important for understanding the interplay of different biological elements in the regulation of life processes. A quantitative computational tool for identifying and interpreting various kinds of biological difference among multiple samples is still needed for studying the interaction and function of biological components. Here, we extend QDMR into a web server for quantifying various biological differences, identifying differential ROIs with high confidence, and analyzing the correlation of biological dynamics. The tabular and graphical outputs are intended to aid interpretation of biological interactions in various biological processes.

2 Features
QBioDiff aims to automate the analysis of biological differences among multiple samples from omics data profiled by high-throughput technologies, particularly microarrays and next-generation sequencing. Before being fed into QBioDiff, the raw data are assumed to have been preprocessed according to standard procedures. For each biological element, the preprocessed data should be combined into a single file with the following columns: unique ROI ID, ROI description, and the omics data in the different samples, such as cells or tissues (Figure 1A). QBioDiff takes the combined files as input and performs quantitative and comparative analysis automatically.
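The column layout described above can be illustrated with a minimal Python sketch. The tab delimiter, the header row, the column names, and the function name read_roi_table are assumptions made for illustration; the exact file format accepted by the web server is documented in its online Tutorial and is not specified here.

```python
import csv
import io


def read_roi_table(handle):
    """Parse a table with columns: ROI ID, ROI description, one value per sample.

    Assumes (hypothetically) a tab-delimited file whose first row is a header.
    Returns the sample names and a dict mapping ROI ID -> (description, values).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[2:]
    if not samples:
        raise ValueError("expected at least one sample column after ID and description")
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        roi_id, description, values = row[0], row[1], row[2:]
        if roi_id in table:
            raise ValueError(f"duplicate ROI ID: {roi_id}")  # IDs must be unique
        if len(values) != len(samples):
            raise ValueError(f"{roi_id}: expected {len(samples)} values, got {len(values)}")
        table[roi_id] = (description, [float(v) for v in values])
    return samples, table


# Made-up example data, for illustration only.
example = io.StringIO(
    "ROI_ID\tDescription\tLiver\tBrain\tHeart\n"
    "roi_1\tpromoter of gene A\t0.90\t0.10\t0.15\n"
    "roi_2\tpromoter of gene B\t0.50\t0.48\t0.52\n"
)
samples, table = read_roi_table(example)
print(samples)
print(table["roi_1"])
```

Checking for unique ROI IDs and a consistent number of sample values per row mirrors the two requirements stated in the text: one unique identifier per ROI and one measurement per sample.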

Figure 1 Workflow of QBioDiff. (A) The input data of QBioDiff are formatted omics data produced by high-throughput technologies such as microarrays and next-generation sequencing. (B) The workflow of QBioDiff for analysis of biological difference among multiple samples. (C) The tabular and graphical outputs of QBioDiff, including: (1) the biological difference quantified by entropy, (2) the result table containing the differential ROIs identified by the threshold, (3) the result table of sample specificity of differential ROIs, (4) the genome categories of differential ROIs classified according to their locations relative to RefSeq genes, (5) the histogram describing the distribution of biological difference, (6) the boxplot showing difference patterns of various biological elements, (7) the scatter plot showing the correlation among differences of various biological elements, and (8) the overlap of various differential ROIs.

In particular, a standalone version of QBioDiff is provided to help users preprocess data from next-generation sequencing technologies such as ChIP-Seq and RNA-Seq, because the scale of such data precludes uploading. The standalone QBioDiff calculates and normalizes the read number in the ROIs, and writes result files in the appropriate format for import into the QBioDiff web server for further quantitative analysis.
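The idea of counting and normalizing reads within ROIs can be sketched as follows. This is not the standalone QBioDiff code: the text does not state which normalization it applies, so reads per kilobase per million mapped reads (RPKM) is used here purely as a common, assumed choice, and the names Roi, count_reads, and normalize_rpkm are hypothetical.

```python
from collections import namedtuple

# Hypothetical ROI record; coordinates are half-open: [start, end).
Roi = namedtuple("Roi", ["roi_id", "chrom", "start", "end"])


def count_reads(rois, read_positions):
    """Count reads whose (chrom, position) falls inside each ROI.

    A simple linear scan, chosen for clarity rather than speed.
    """
    counts = {roi.roi_id: 0 for roi in rois}
    for chrom, pos in read_positions:
        for roi in rois:
            if roi.chrom == chrom and roi.start <= pos < roi.end:
                counts[roi.roi_id] += 1
    return counts


def normalize_rpkm(counts, rois, total_reads):
    """Scale raw counts by ROI length (kb) and library size (millions of reads).

    RPKM is an assumption here, not the documented QBioDiff normalization.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    lengths = {roi.roi_id: roi.end - roi.start for roi in rois}
    return {rid: c * 1e9 / (lengths[rid] * total_reads) for rid, c in counts.items()}


# Made-up example data, for illustration only.
rois = [Roi("roi_1", "chr1", 100, 1100), Roi("roi_2", "chr1", 5000, 5500)]
reads = [("chr1", 150), ("chr1", 900), ("chr1", 5100), ("chr2", 150)]
counts = count_reads(rois, reads)
normalized = normalize_rpkm(counts, rois, total_reads=len(reads))
print(counts)
print(normalized)
```

Normalizing by both ROI length and library size makes values comparable across ROIs of different sizes and across samples sequenced to different depths, which is what a downstream comparison among samples requires.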

QBioDiff implements a data processing pipeline that follows the user's choices (Figure 1B). The pipeline first uses entropy to quantify the difference among multiple samples for each biological element. Based on these entropy values, ROIs that are differential among the samples are identified using a threshold determined from the probability model in QBioDiff. The specificity in each sample is then quantified for each differential ROI. Further analysis and visualization results for the biological difference and differential ROIs are returned to users. All of these analyses can be performed conveniently by mouse clicks on the web interface, which provides biologists with a practical and reliable way to analyze and visualize epigenetic differences.
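The three pipeline steps can be illustrated with a minimal Python sketch. It uses the plain Shannon entropy rather than the adapted entropy of QBioDiff, a fixed illustrative threshold rather than one derived from the QBioDiff probability model, and the categorical specificity of Schug et al. (2005), Q = H - log2(p), as a stand-in for the sample specificity measure. All function names and data are hypothetical, and values are assumed to be non-negative.

```python
import math


def shannon_entropy(values):
    """Plain Shannon entropy (bits) of one ROI's values across samples.

    Low entropy means the signal is concentrated in few samples,
    i.e. a large difference among samples.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    probs = [v / total for v in values]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def differential_rois(table, threshold):
    """Return IDs of ROIs whose entropy is below the threshold.

    The threshold is supplied by the caller here; it is not derived
    from a probability model as in QBioDiff.
    """
    return sorted(rid for rid, vals in table.items() if shannon_entropy(vals) < threshold)


def sample_specificity(values):
    """Categorical specificity Q = H - log2(p) per sample (Schug et al., 2005).

    Smaller Q means the ROI is more specific to that sample.
    """
    total = sum(values)
    h = shannon_entropy(values)
    return [h - math.log2(v / total) if v > 0 else math.inf for v in values]


# Made-up example data, for illustration only.
table = {
    "roi_uniform": [1, 1, 1, 1],   # same in every sample
    "roi_specific": [8, 0, 0, 0],  # present in one sample only
    "roi_mixed": [4, 2, 1, 1],
}
entropies = {rid: shannon_entropy(vals) for rid, vals in table.items()}
selected = differential_rois(table, threshold=1.8)
specificity = sample_specificity(table["roi_mixed"])
print(entropies)
print(selected)
print(specificity)
```

With four samples the entropy ranges from 0 bits (signal in a single sample) to 2 bits (identical signal in all samples), so the uniform ROI is excluded at this threshold while the other two are retained.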

QBioDiff produces tabular and graphical outputs to help users interpret the biological mechanisms behind quantitative data (Figure 1C). All the tabular results, including quantified biological difference, differential ROIs, sample specificity and genome categories of differential ROIs, can be downloaded for further local analysis. The graphical outputs help users explore the distribution of biological difference, the discrepancy between the distributions of various biological differences, the correlation among various biological differences, and the overlap of various differential ROIs.
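Because the tabular results can be downloaded for local analysis, the correlation and overlap summaries can also be reproduced offline. The sketch below assumes Pearson correlation and the Jaccard index as the summary statistics; the text does not state which measures QBioDiff itself reports, and the function names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def overlap(ids_a, ids_b):
    """Return the shared ROI IDs and the Jaccard index of two ROI sets."""
    a, b = set(ids_a), set(ids_b)
    shared, union = a & b, a | b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Made-up entropy values for two biological elements on the same ROIs.
element_a = {"roi_1": 0.5, "roi_2": 1.0, "roi_3": 1.5, "roi_4": 2.0}
element_b = {"roi_1": 1.0, "roi_2": 2.0, "roi_3": 3.0, "roi_4": 4.0}
common = sorted(set(element_a) & set(element_b))  # compare only ROIs present in both
r = pearson([element_a[k] for k in common], [element_b[k] for k in common])

shared, jaccard = overlap(["roi_1", "roi_2"], ["roi_2", "roi_3"])
print(r)
print(shared, jaccard)
```

Restricting the correlation to ROIs present in both tables keeps the two value lists aligned by ROI ID, which matters when the downloaded tables for different biological elements do not cover exactly the same regions.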

The platform-free and species-free nature of QBioDiff makes it potentially applicable to a wide variety of omics data. The graphical user interface provides biologists with a convenient and reliable way to analyze and visualize biological difference. QBioDiff is expected to provide effective tools for the high-throughput identification of functional regions involved in gene regulation.

3 Implementation
The online implementation of the web server, QBioDiff, is freely available at http://bioinfo.hrbmu.edu.cn/QBioDiff/. The tool was built on modern web interfaces that use fewer page transitions and load more data dynamically. The back-end processing programs were written in Java. Browser-based interfaces were built in JavaServer Pages (JSP) using the Struts Action Framework. Batik, a pure-Java library, is used to render, generate, and manipulate SVG graphics dynamically. More details about the usage of QBioDiff are given in the online Tutorial.

Funding: National Natural Science Foundation of China (61403112, 31371334).

Abecasis G.R., Auton A., Brooks L.D., Depristo M.A., Durbin R.M., Handsaker R.E., Kang H.M., Marth G.T., and Mcvean G.A., 2012, An integrated map of genetic variation from 1,092 human genomes, Nature, 491: 56-65

Barski A., Cuddapah S., Cui K., Roh T.Y., Schones D.E., Wang Z., Wei G., Chepelev I., and Zhao K., 2007, High-resolution profiling of histone methylations in the human genome, Cell, 129: 823-837

Bateman A., and Quackenbush J., 2009, Bioinformatics for next generation sequencing, Bioinformatics, 25: 429

Cokus S.J., Feng S., Zhang X., Chen Z., Merriman B., Haudenschild C.D., Pradhan S., Nelson S.F., Pellegrini M., and Jacobsen S.E., 2008, Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning, Nature, 452: 215-219

Johnson D.S., Mortazavi A., Myers R.M., and Wold B., 2007, Genome-wide mapping of in vivo protein-DNA interactions, Science, 316: 1497-1502

Pan W., 2002, A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments, Bioinformatics, 18: 546-554

Schena M., Shalon D., Davis R.W., and Brown P.O., 1995, Quantitative monitoring of gene expression patterns with a complementary DNA microarray, Science, 270: 467-470

Schug J., Schuller W.P., Kappen C., Salbaum J.M., Bucan M., and Stoeckert C.J., Jr., 2005, Promoter features related to tissue specificity as measured by Shannon entropy, Genome Biol, 6: R33

Schumacher A., Kapranov P., Kaminsky Z., Flanagan J., Assadzadeh A., Yau P., Virtanen C., Winegarden N., Cheng J., Gingeras T., and Petronis A., 2006, Microarray-based DNA methylation profiling: technology and applications, Nucleic Acids Res, 34: 528-542

Shen Y., Yue F., Mccleary D.F., Ye Z., Edsall L., Kuan S., Wagner U., Dixon J., Lee L., Lobanenkov V.V., and Ren B., 2012, A map of the cis-regulatory sequences in the mouse genome, Nature, 488: 116-120

Wang Z., Gerstein M., and Snyder M., 2009, RNA-Seq: a revolutionary tool for transcriptomics, Nat Rev Genet, 10: 57-63

Zhang Y., Liu H., Lv J., Xiao X., Zhu J., Liu X., Su J., Li X., Wu Q., Wang F., and Cui Y., 2011, QDMR: a quantitative method for identification of differentially methylated regions by entropy, Nucleic Acids Res, 39: e58
