Mixed Distribution Models Based on Single-Cell RNA Sequencing Data

  • Original research article
  • Interdisciplinary Sciences: Computational Life Sciences

Abstract

Progress in single-cell RNA sequencing (scRNA-seq) has yielded a large amount of valuable data. Analysis of these data can provide a new perspective for studying intratumoral heterogeneity and identifying gene markers. In this paper, scRNA-seq data of colorectal cancer (CRC) are analyzed, and the gene expression difference (GED) data are found to follow a regular distributional shape. To study this regularity, a mixed stable-normal distribution (MSND) model and a mixed stable-exponential distribution (MSED) model are constructed to fit the GED data, and their estimated parameters are used to describe characteristics of the distribution. Comparison of the root mean square error and the chi-squared goodness-of-fit test shows that both MSND and MSED fit the data better than the stable distribution and the Cauchy distribution. Given quantile thresholds, MSND and MSED can be used to identify tumor-related genes, and functional analysis indicates that the selected genes are highly correlated with CRC. In addition, the parameters of MSND and MSED exhibit a trend with the development of CRC. To explore this association, gene-set enrichment analysis (GSEA) is performed, and its results reveal that the trend characterizes the intratumoral heterogeneity of CRC well. Furthermore, applying the MSED model to hepatocellular carcinoma shows that the model can be used to analyze other cancers. Overall, the MSND and MSED models fit the GED data well in different disease stages, their parameters can characterize the heterogeneity of CRC tumor cells, and both models can be used to identify genes highly correlated with tumors.
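The workflow summarized in the abstract (a mixture density, an RMSE comparison against simpler distributions, and quantile-based gene selection) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' method: the mixture is taken to be a simple weighted sum of a stable and a normal density, the parameter values are hand-picked instead of estimated, the GED values are synthetic, and the helper names `msnd_pdf`, `rmse`, and `select_by_quantile` are hypothetical.

```python
import numpy as np
from scipy import stats


def msnd_pdf(x, w, alpha, beta, loc_s, scale_s, mu, sigma):
    """Density of a weighted stable-normal mixture (assumed form).

    w is the weight of the stable component, with 0 <= w <= 1.
    """
    stable = stats.levy_stable.pdf(x, alpha, beta, loc=loc_s, scale=scale_s)
    normal = stats.norm.pdf(x, loc=mu, scale=sigma)
    return w * stable + (1.0 - w) * normal


def rmse(observed, fitted):
    """Root mean square error between two equally sized arrays."""
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return float(np.sqrt(np.mean((observed - fitted) ** 2)))


def select_by_quantile(ged, genes, q=0.05):
    """Return genes whose GED is below the q or above the 1 - q empirical quantile."""
    ged = np.asarray(ged, dtype=float)
    lo, hi = np.quantile(ged, [q, 1.0 - q])
    mask = (ged < lo) | (ged > hi)
    return [g for g, keep in zip(genes, mask) if keep]


# Synthetic stand-in for GED values; these are NOT data from the paper.
rng = np.random.default_rng(0)
ged = np.concatenate([
    stats.levy_stable.rvs(1.5, 0.0, loc=0.0, scale=0.5, size=300, random_state=rng),
    rng.normal(0.0, 1.0, 700),
])
genes = [f"gene_{i}" for i in range(ged.size)]

# Empirical density on a coarse grid, normalized by the full sample size
# so that heavy-tail samples outside the grid are still accounted for.
edges = np.linspace(-6.0, 6.0, 25)
centers = 0.5 * (edges[:-1] + edges[1:])
counts, _ = np.histogram(ged, bins=edges)
empirical = counts / (ged.size * np.diff(edges))

# Hand-picked mixture parameters (assumption); a real analysis would estimate them.
mixture = msnd_pdf(centers, w=0.3, alpha=1.5, beta=0.0,
                   loc_s=0.0, scale_s=0.5, mu=0.0, sigma=1.0)

# Baseline: a single Cauchy distribution fitted by maximum likelihood.
loc_c, scale_c = stats.cauchy.fit(ged)
cauchy = stats.cauchy.pdf(centers, loc=loc_c, scale=scale_c)

print("RMSE mixture:", rmse(empirical, mixture))
print("RMSE Cauchy: ", rmse(empirical, cauchy))

# Genes in the two 5% tails of the GED values.
selected = select_by_quantile(ged, genes, q=0.05)
print("selected genes:", len(selected))
```

A smaller RMSE indicates that a candidate density tracks the binned data more closely. The quantile rule here uses empirical quantiles for simplicity; thresholds could instead be taken from the fitted model, which the abstract does not specify.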

Acknowledgements

This research was supported by the Key Project of National Natural Science Foundation of China (Grant no. 11831015), the Major Research Plan of National Natural Science Foundation of China (Grant no. 91730301), Postgraduate Research & Practice Innovation Program of Jiangnan University (Grant no. JNKY19_051) and Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant no. KYCX18_1864).

Author information

Corresponding author

Correspondence to Jie Gao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (XLSX 93 kb)

Supplementary file2 (DOC 516 kb)

About this article

Cite this article

Wu, M., Xu, J., Ding, T. et al. Mixed Distribution Models Based on Single-Cell RNA Sequencing Data. Interdiscip Sci Comput Life Sci 13, 362–370 (2021). https://doi.org/10.1007/s12539-021-00427-6

  • DOI: https://doi.org/10.1007/s12539-021-00427-6
