Abstract
Nonparametric estimation of the cumulative distribution function and the probability density function of continuous random variables is a basic and central problem in probability theory and statistics. Although many methods, such as kernel density estimation, have been presented, it remains a challenging problem for researchers. In this paper, we propose a new method of spline regression in which the spline function can consist of entirely different types of functions on each segment, with results from Monte Carlo simulation. Based on this spline regression, we provide a new method for estimating the distribution and density functions, which showed significant advantages over existing methods in numerical experiments. Finally, we discuss density function estimation for high-dimensional random variables, which shows the potential of the method for application in classification and regression models.
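For orientation, the kind of existing estimator the abstract compares against can be sketched in a few lines. The example below is only an illustration of the empirical distribution function and a Gaussian kernel density estimate; it is not the spline-regression method proposed in the paper, whose details are not given in this abstract. The function names, the sample size, the seed, and the normal-reference bandwidth rule are all assumptions chosen for the illustration.

```python
import math
import random


def empirical_cdf(sample, x):
    """Fraction of observations less than or equal to x."""
    return sum(1 for s in sample if s <= x) / len(sample)


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((s - mean) ** 2 for s in sample) / (n - 1)
    return 1.06 * math.sqrt(var) * n ** (-0.2)


def gaussian_kde(sample, x, h):
    """Kernel density estimate at x: average of Gaussian kernels of width h."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)


# Hypothetical test data: 2000 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
h = rule_of_thumb_bandwidth(sample)

cdf_at_zero = empirical_cdf(sample, 0.0)    # estimate of F(0)
pdf_at_zero = gaussian_kde(sample, 0.0, h)  # estimate of f(0)
print(f"h = {h:.3f}, F(0) ~ {cdf_at_zero:.3f}, f(0) ~ {pdf_at_zero:.3f}")
```

For a standard normal sample the two estimates should land near F(0) = 0.5 and f(0) = 1/sqrt(2*pi). The kernel estimate depends noticeably on the bandwidth h, which is one reason density estimation remains difficult in practice.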
Acknowledgements
This work was supported by grants from the Key Research Area Grant 2016YFA0501703 of the Ministry of Science and Technology of China, the National High-Tech R&D Program of China (863 Program, Contract no. 2012AA020307), the National Basic Research Program of China (973 Program, Contract no. 2012CB721000), the Ph.D. Programs Foundation of the Ministry of Education of China (Contract no. 20120073110057), and the Young Scholars (Grant no. 31400704) of the Natural Science Foundation of China, as well as by computing resources provided by the Center for High Performance Computing, Shanghai Jiao Tong University.
Cite this article
Dai, H., Wang, W., Xu, Q. et al. Estimation of Probability Distribution and Its Application in Bayesian Classification and Maximum Likelihood Regression. Interdiscip Sci Comput Life Sci 11, 559–574 (2019). https://doi.org/10.1007/s12539-019-00343-w
DOI: https://doi.org/10.1007/s12539-019-00343-w