A machine learning approach to big data regression analysis of real estate prices for inferential and predictive purposes
Journal of Property Research ( IF 2.1 ) Pub Date : 2019-01-02 , DOI: 10.1080/09599916.2019.1587489
Jorge Iván Pérez-Rave 1 , Juan Carlos Correa-Morales 2 , Favián González-Echavarría 3

ABSTRACT Hedonic price regressions have mainly been used for inference. In contrast, machine learning applied to big data has great potential for prediction. To help integrate these two strategies, this article proposes a machine learning approach to the regression analysis of big data, specifically real estate prices, for both inferential and predictive purposes. The methodology incorporates a new variable selection procedure called ‘incremental sample with resampling’ (MINREM). The methodology is tested on two cases. The first uses data from web advertisements for used homes on sale in Colombia (61,826 observations). The second uses data (58,888 observations) from a sample of the Metropolitan American Housing Survey 2011 obtained and prepared by a reference study. The methodology consists of two stages. The first selects the important variables using MINREM; the second follows the traditional training and validation procedure for machine learning, adding three activities. In both test cases, the methodology yields highly parsimonious models that are stable across different sample sizes, and the resulting regression functions can be used for both inference and prediction. This paper contributes an original methodology for big data regression analysis.

Updated: 2019-01-02