GBSA: A Comprehensive Software for Analysing Whole Genome Bisulfite Sequencing Data (Nucleic Acids Res, Dec 2012)

DNA methylation is an epigenetic event essential for the regulation of gene transcription and is generally accepted to be associated with gene repression. Aberrant DNA methylation profiles have been observed in cancers and other human diseases, highlighting the value of understanding its role in the regulation of gene expression, as well as in a wider range of biological and cellular processes, such as chromatin reorganization. In the past few years, DNA methylation profiling techniques have undergone a veritable revolution with the emergence of high-throughput sequencing technologies, which enable investigation of the entire genome. However, even though these technologies are available, challenges remain in making sense of the huge amount of data they produce. Although several software programs handle pre-processing, the resulting information is not readily interpretable and often requires further bioinformatic steps before meaningful analysis is possible. Current post-processing programs generally focus on the gene-specific level, which is restrictive when non-coding regions, such as enhancers and intergenic non-coding RNAs, need to be analysed.

Dr. Touati Benoukraf, a Special Fellow at the Cancer Science Institute of Singapore, studies epigenetic aberrations in cancers, particularly those affecting gene regulatory elements, with the aim of bringing to light new mechanisms of tumorigenesis by leveraging advances in sequencing technology. To reach this goal, he is developing novel approaches and software for the analysis of high-throughput datasets.

In this study, he has developed, with the team led by A/Prof Richie Soong, the Genome Bisulfite Sequencing Analyser (GBSA), free, open-source software capable of analysing whole-genome bisulfite sequencing data with either a gene-centric or a gene-independent focus. Through analysis of the largest published datasets to date, they demonstrate GBSA’s features in providing sequencing quality assessment, methylation scoring, functional data management and visualization of genomic methylation at nucleotide resolution. Additionally, they show that GBSA’s output can be easily integrated with other high-throughput sequencing data, such as transcription data or other epigenetic information, to elucidate the role of methylated intergenic regions in gene regulation. In essence, GBSA allows an investigator to explore not only known loci but all genomic regions, where methylation studies could lead to the discovery of new regulatory mechanisms.
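To make "methylation scoring at nucleotide resolution" concrete, the sketch below uses the convention common in bisulfite sequencing analysis: at each cytosine, the score is the fraction of covering reads that still show a C (methylated) rather than a T (converted, unmethylated). This is only an illustration under stated assumptions and is not taken from GBSA's source code; the `CytosineCounts` record, the function names and the `min_coverage` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CytosineCounts:
    """Read counts at one cytosine position (hypothetical record layout)."""
    chrom: str
    pos: int
    methylated: int    # reads still showing C after bisulfite conversion
    unmethylated: int  # reads showing T, i.e. converted (unmethylated) cytosines


def methylation_score(site: CytosineCounts, min_coverage: int = 1) -> Optional[float]:
    """Fraction of reads supporting methylation, or None if coverage is too low."""
    coverage = site.methylated + site.unmethylated
    if coverage < min_coverage:
        return None
    return site.methylated / coverage


def region_score(sites: Iterable[CytosineCounts], min_coverage: int = 1) -> Optional[float]:
    """Mean of per-site scores over sites with enough coverage; None if none qualify."""
    scores = [
        score
        for score in (methylation_score(site, min_coverage) for site in sites)
        if score is not None
    ]
    return sum(scores) / len(scores) if scores else None
```

Because the score is computed per position and regions are just arbitrary collections of positions, the same calculation applies equally to a promoter, an enhancer or an unannotated intergenic window, which is the kind of gene-independent analysis described above.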


Benoukraf T*1, Wongphayak S1, Hadi LH1, Wu M1, Soong R*1,2. Nucleic Acids Res. 2012 Dec 24. *Corresponding authors

1Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
2Department of Pathology, National University of Singapore, Singapore 117599, Singapore
