About


OmicsSIMLA is a simulation tool for generating multi-omics data with disease status.
OmicsSIMLA currently has four main modules: SeqSIMLA, pWGBSSimla, RNA-Seq, and RPPA. SeqSIMLA simulates sequence data under different disease models, either in families with multiple affected and unaffected siblings or in unrelated case-control samples. pWGBSSimla is a profile-based whole-genome bisulphite sequencing data simulator. It can simulate whole-genome bisulphite sequencing (WGBS), reduced representation bisulphite sequencing (RRBS), and oxidative bisulphite sequencing (oxBS-seq) data while modeling methylation quantitative trait loci, allele-specific methylation, and differentially methylated regions. RNA-Seq uses a negative binomial distribution to simulate NGS read counts for gene expression. Finally, RPPA uses a mass-action kinetic model to simulate protein expression data.
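To illustrate the idea behind the RNA-Seq module, here is a minimal Python sketch of drawing read counts from a negative binomial distribution. It is not OmicsSIMLA's code or interface: the function names, the mean/dispersion parameterization (variance = mean + dispersion × mean²), and the example values are all assumptions chosen for illustration. The counts are drawn as a gamma-Poisson mixture, which is one standard way to sample from a negative binomial.

```python
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count by counting exponential arrivals in unit time."""
    count = 0
    elapsed = rng.expovariate(lam)
    while elapsed < 1.0:
        count += 1
        elapsed += rng.expovariate(lam)
    return count


def sample_nb_count(mean, dispersion, rng):
    """Draw one negative binomial count with the given mean and
    variance mean + dispersion * mean**2 (hypothetical parameterization)."""
    if mean <= 0:
        return 0
    # Gamma-distributed rate with expectation `mean` ...
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    if rate <= 0:
        return 0
    # ... followed by a Poisson draw gives a negative binomial count.
    return sample_poisson(rate, rng)


def simulate_counts(gene_means, dispersion, n_samples, seed=0):
    """Return an n_samples x n_genes table (list of lists) of read counts."""
    rng = random.Random(seed)
    return [
        [sample_nb_count(mean, dispersion, rng) for mean in gene_means]
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    # Illustrative values only: three genes with different expected counts.
    for row in simulate_counts([20.0, 100.0, 5.0], dispersion=0.1, n_samples=4):
        print(row)
```

In a sketch like this, a differentially expressed gene could be represented by giving cases and controls different values in `gene_means`. How OmicsSIMLA itself specifies expression differences is described in its own documentation.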


Citations:

  • If you are using OmicsSIMLA for publication, please cite:
    Chung RH, Kang CY. 2019. A multi-omics data simulator for complex disease studies and its application to evaluate multi-omics data integration methods for disease classification. GigaScience. 8(5).

  • If you are using SeqSIMLA2 for publication, please cite:
    Chung RH, Tsai WY, Hsieh CH, Hung KY, Hsiung CA, Hauser ER. 2014. SeqSIMLA2: Simulating Correlated Quantitative Traits Accounting for Shared Environmental Effects in User-Specified Pedigree Structure. Genetic Epidemiology. 39(1):20-4.

  • If you are using pWGBSSimla for publication, please cite:
    Chung RH, Kang CY. pWGBSSimla: a profile-based whole-genome bisulphite sequencing data simulator. bioRxiv doi:10.1101/390633.

  • If you are using the exact option in SeqSIMLA2 for publication, please cite:
    Yao PJ, Chung RH. 2016. SeqSIMLA2_exact: simulate multiple disease sites in large pedigrees with given disease status for diseases with low prevalence. Bioinformatics. 32(4):557-62.