An empirical comparison of predictive models for web page performance
Information and Software Technology (IF 3.9), Pub Date: 2020-03-12, DOI: 10.1016/j.infsof.2020.106307
Raghu Ramakrishnan, Arvinder Kaur

Context

The quality of user experience is the cornerstone of any organization’s successful digital transformation journey. Web pages are the main touchpoint through which users access digital services, and web page performance is a key determinant of the quality of user experience. The negative impact of poor web page performance on an organization’s productivity, profits, and brand value is well recognized. Using realistic models to predict page load time at the early stages of development can help minimize the effort and cost of fixing performance defects late in the lifecycle.

Objective

We present a comprehensive evaluation of models built with 18 widely used machine learning techniques, comparing their ability to predict page load times. The models use only metrics that relate to the form and structure of a page, because such metrics are easy to ascertain during the early stages with minimal effort.

Method

The models are trained on more than 8,700 pages from HTTP Archive data, a database of web performance information widely used in web performance research. The trained models are then validated using 10-fold cross-validation. Three accuracy measures are reported: the Pearson correlation coefficient (r), Root Mean Square Error (RMSE), and Normalized Root Mean Square Error (NRMSE).
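The validation procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the helper names (`k_fold_indices`, `cross_validate`, `fit_line`, `predict_line`) are hypothetical, the one-feature linear fit is only a stand-in for a real learner, and scoring the pooled out-of-fold predictions (rather than averaging per-fold scores) is an assumption. NRMSE is normalized by the range of the observed values, in line with the interpretation given in the Results.

```python
import math
import random


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root Mean Square Error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nrmse(observed, predicted):
    """RMSE divided by the range (max - min) of the observed values."""
    return rmse(observed, predicted) / (max(observed) - min(observed))


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for fold in range(k):
        test = indices[fold::k]  # every k-th shuffled index forms one fold
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def cross_validate(features, targets, fit, predict, k=10):
    """Collect out-of-fold predictions, then score them over all samples."""
    predictions = [0.0] * len(targets)
    for train, test in k_fold_indices(len(targets), k):
        model = fit([features[i] for i in train], [targets[i] for i in train])
        for i in test:
            predictions[i] = predict(model, features[i])
    return {
        "r": pearson_r(targets, predictions),
        "rmse": rmse(targets, predictions),
        "nrmse": nrmse(targets, predictions),
    }


def fit_line(xs, ys):
    """Hypothetical stand-in learner: least-squares line on one feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    """Predict with the (slope, intercept) pair returned by fit_line."""
    slope, intercept = model
    return slope * x + intercept
```

Calling `cross_validate(xs, ys, fit_line, predict_line)` with one structural metric per page in `xs` and the measured load times in `ys` returns the three measures in a dictionary. Any learner that exposes the same `fit`/`predict` pair can be swapped in.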

Results

Radial Basis Function regression and Random Forest outperform all other techniques. The value of r ranges from 0.69 to 0.92, indicating a high correlation between the observed and predicted values. The NRMSE varies between 0.11 and 0.16, implying that the RMSE is at most 16% of the range of the actual values. The RMSE improves by 41% to 54% compared to the best baseline prediction model.
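The abstract does not state how the RMSE improvement over the baseline is computed. A common reading, assumed here, is the relative reduction in RMSE. The function name `rmse_improvement` is hypothetical, and the numbers in the comment are illustrative only, not values from the paper.

```python
def rmse_improvement(baseline_rmse, model_rmse):
    """Fractional reduction in RMSE relative to a baseline model.

    Assumed definition: (baseline - model) / baseline.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return (baseline_rmse - model_rmse) / baseline_rmse


# Illustrative values only: a baseline RMSE of 2.0 s and a model RMSE
# of 1.0 s would correspond to a 50% improvement under this definition.
example = rmse_improvement(2.0, 1.0)
```

Under this definition, a value of 0 means the model is no better than the baseline, and a negative value means it is worse.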

Conclusion

It is possible to build realistic prediction models using machine learning techniques that can be used by practitioners during the early stages of development with minimal effort.
