Computational Techniques and Tools for Omics Data Analysis: State-of-the-Art, Challenges, and Future Directions

  • Original Paper
Archives of Computational Methods in Engineering

Abstract

The heterogeneous and high-dimensional nature of omics data presents various challenges in gaining insights during analysis. In the era of big data, omics data is available as genome, proteome, transcriptome, and metabolome data. Apart from single omics data types, integrative omics (known as multi-omics) and omics imaging data (known as radiomics), both approaching big-data scale, are being used for predictive analysis. Various computational approaches, such as data mining, machine learning, deep learning, statistical methods, and metaheuristic techniques, have gained attention for processing, normalizing, integrating, and analysing omics data. This paper presents a critical review of state-of-the-art techniques for omics, multi-omics, and radiomics data analysis aimed at disease prediction, disease recurrence, survival analysis, and biomarker discovery. The paper investigates, compares, and categorizes existing tools and technologies for the integration and analysis of omics data based on their common characteristics. In addition, significant research challenges and directions for future omics research are discussed. This survey should guide researchers in understanding the use of computationally intelligent approaches for efficient omics data analysis.

Funding

The authors have no funding to report.

Author information

Corresponding author

Correspondence to Parampreet Kaur.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethical Standards

The author declares that this article complies with ethical standards.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kaur, P., Singh, A. & Chana, I. Computational Techniques and Tools for Omics Data Analysis: State-of-the-Art, Challenges, and Future Directions. Arch Computat Methods Eng 28, 4595–4631 (2021). https://doi.org/10.1007/s11831-021-09547-0

  • DOI: https://doi.org/10.1007/s11831-021-09547-0
