Online random forests regression with memories
Introduction
Random Forests (RFs) [1] are ensembles of randomized decision trees that use bagging for classification or regression tasks. RFs are widely used in machine learning applications, as well as in tasks demanding high real-time performance, such as computer vision [2], [3]. Their main advantages are computational efficiency and strong predictive performance.
RFs are typically off-line methods [4]. Nevertheless, many applications require real-time processing of massive streaming data, such as predicting network workload, traffic conditions during peak hours, and users’ evaluations of products. Such extensive data are difficult to store in their entirety. Furthermore, the distribution of test samples may differ from that of training samples, so evaluating future test instances directly with models trained off-line may incur large prediction bias.
To address these issues, some efforts have been dedicated to online training [5], [6], [7]. However, to the best of our knowledge, most existing approaches suffer from frequent model updating, which generally leads to inefficient computation. Moreover, they often ignore the sequential dependencies between samples in the data stream.
In this paper, we propose a novel approach that updates the leaf-level weights of off-line-trained base learners in an online manner and endows RFs with long-term memory, avoiding the need for frequent model updating. More precisely, as testing data arrive sequentially, our method simultaneously updates the leaf-level weights of the RFs while keeping the structure of the trained RF model unchanged. In this way, the proposed approach achieves better prediction through online weight learning while retaining the superior efficiency of off-line training. Moreover, our method avoids the low efficiency and poor accuracy that a purely online growth strategy suffers during its early stages. Our contribution is three-fold:
- (1)
We propose an online weight learning approach that endows RFs with memory to improve regression prediction when the distribution of the streaming data changes.
- (2)
We propose an adaptive learning rate for stochastic gradient descent (SGD) based on current and historical prediction bias ratios.
- (3)
We validate the proposed method with extensive experiments and show the convergence and stability of our method.
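The core idea behind contributions (1) and (2) can be sketched in code. The sketch below is illustrative only: the class and method names are hypothetical, and the exact form of the bias-ratio learning rate is an assumption rather than the paper's actual rule, which is given in Section 3.

```python
import numpy as np

class OnlineWeightedForest:
    """Sketch of leaf-level online weight learning over a fixed,
    pre-trained forest. leaf_values[m][l] is the prediction stored at
    leaf l of tree m; tree structures never change, only leaf weights."""

    def __init__(self, leaf_values, base_lr=0.05):
        self.leaf_values = leaf_values                      # per-tree: leaf id -> value
        self.weights = [{l: 1.0 for l in lv} for lv in leaf_values]
        self.base_lr = base_lr
        self.hist_bias = 1e-8                               # running mean of |error|

    def predict(self, leaf_ids):
        """leaf_ids[m] = leaf reached in tree m for the current sample."""
        preds = [self.weights[m][l] * self.leaf_values[m][l]
                 for m, l in enumerate(leaf_ids)]
        return sum(preds) / len(preds)

    def update(self, leaf_ids, y_true):
        """SGD step on the active leaves; the step size is scaled by the
        ratio of the current bias to the historical bias (assumed form)."""
        err = self.predict(leaf_ids) - y_true
        lr = self.base_lr * abs(err) / (self.hist_bias + abs(err))
        for m, l in enumerate(leaf_ids):
            # d(prediction)/d(weight) = leaf value / number of trees
            grad = err * self.leaf_values[m][l] / len(leaf_ids)
            self.weights[m][l] -= lr * grad
        # exponential moving average acts as long-term memory of the bias
        self.hist_bias = 0.9 * self.hist_bias + 0.1 * abs(err)
```

Only the leaves actually visited by the current sample are updated, so each round costs O(number of trees) regardless of forest size, which is what allows the structure to stay frozen while the weights adapt to a drifting distribution.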
The rest of the paper is organized as follows. In Section 2 we discuss previous work in related areas. In Section 3 we introduce the main principles of our method and present the online weight learning algorithm for RFs. Section 4 details a series of experiments with our approach on regression tasks. Finally, the conclusion and future work are given in Section 5.
Section snippets
Related work
The conventional RFs are an ensemble of unpruned (classification or regression) decision trees, such as C4.5 [8] and CART [9], and are a type of extension of bagging [10]. Conventional RFs use random feature selection in tree induction procedures, through bootstrap sampling of training data. RFs have many advantages, such as low cost of computation, ease of implementation, and powerful performance. RFs’ predictions are generated through aggregating predictions of trees, in the manner of either
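The two ingredients named above, bootstrap sampling of the training data and aggregation of per-tree predictions, can be sketched as follows. This is a generic illustration of the conventional RF recipe, not code from the paper; function names are our own.

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw a bootstrap replicate: sample n rows with replacement,
    as bagging does before inducing each tree."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

def aggregate(tree_predictions, task="regression"):
    """Combine per-tree predictions: averaging for regression,
    majority vote for classification."""
    P = np.asarray(tree_predictions)        # shape (n_trees, n_samples)
    if task == "regression":
        return P.mean(axis=0)
    # majority vote down the tree axis for each sample
    return np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), 0, P.astype(int))
```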
Preliminaries
Online learning belongs to the class of online optimization problems, which attempts to overcome the shortcomings of statistical learning theory when addressing the dynamic aspects of large datasets. Online regression learning can be described formally as follows [38]. At every round t = 1, 2, …, T:
- •
receive a question x_t ∈ X, where X ⊆ R^d corresponds to a set of features;
- •
predict ŷ_t by h ∈ H, where H is the hypothesis set, H = {h : X → Y}, where Y is
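The protocol above is a predict-then-update loop: the learner commits to a prediction before the true label is revealed, suffers a loss, and may then revise its hypothesis. A minimal sketch, where the squared loss and the SGD update are illustrative choices and `stream`, `model`, and `update` are hypothetical names:

```python
def online_regression(stream, model, update):
    """Online regression protocol: at each round t, see x_t, commit to a
    prediction, then observe y_t, suffer the loss, and update."""
    cumulative_loss = 0.0
    for x_t, y_t in stream:
        y_hat = model(x_t)                # predict before seeing the label
        loss = (y_hat - y_t) ** 2         # squared loss for regression
        cumulative_loss += loss
        update(x_t, y_t, y_hat)           # learner adjusts its hypothesis
    return cumulative_loss
```

The quantity of interest in this setting is the cumulative loss over all rounds (or the regret relative to the best fixed hypothesis in H), rather than generalization error on a held-out set.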
Benchmark datasets
The algorithms proposed in this paper are designed for scenarios in which data are delivered through continuous streaming. For comparative evaluation, we selected eight benchmark datasets of different sizes from the UCI Machine Learning Repository [39] and the Kaggle data science community. These datasets are Boston, California Housing (California), Concrete Compressive Strength (Concrete) [40], Airfoil Self-Noise (Airfoil), Parkinsons Telemonitoring (Parkinsons) [41], Wine Quality (WineQuality) [42],
Conclusion
We introduce a novel online weight learning algorithm (OWL-RFR), which builds on the more typical off-line training of RFs for regression prediction. We endow the RFs with memory through the weights used at prediction time. We analyze the difference between each leaf weight’s long-term memory and the tree weight’s immediate memory in streaming data, to demonstrate the effectiveness of the former. Experiments on commonly used machine learning datasets show that OWL-RFR significantly outperforms
CRediT authorship contribution statement
Yuan Zhong: Conceptualization, Methodology, Software, Writing - review & editing. Hongyu Yang: Supervision. Yanci Zhang: Validation, Investigation. Ping Li: Formal analysis, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors acknowledge the support from the National Natural Science Foundation of China (Nos. 61472261 and 61873218), the Applied Basic Research Foundation of the Sichuan Provincial Science and Technology Department, China (No. 18YYJC1147), the Nanchong Science and Technology Department, China (No. 18SXHZ0009), and the Southwest Petroleum University innovation base, China (No. 642).
References (43)
- et al., Protein–protein interaction sites prediction by ensembling SVM and sample-weighted random forests, Neurocomputing (2016)
- et al., Random feature weights for decision tree ensemble construction, Inform. Fusion (2012)
- et al., Forestexter: An efficient random forest algorithm for imbalanced text categorization, Knowl.-Based Syst. (2014)
- et al., A weight-adjusted voting algorithm for ensembles of classifiers, J. Korean Statist. Soc. (2011)
- et al., Online adaptive decision trees based on concentration inequalities, Knowl.-Based Syst. (2016)
- et al., An on-line weighted ensemble of regressor models to handle concept drifts, Eng. Appl. Artif. Intell. (2015)
- Modeling of strength of high-performance concrete using artificial neural networks, Cement Concr. Res. (1998)
- et al., Modeling wine preferences by data mining from physicochemical properties, Decis. Support Syst. (2009)
- Random forests, Mach. Learn. (2001)
- et al., Image classification using random forests and ferns
- Semi-supervised random forests
- On-line random forests
- New ensemble methods for evolving data streams
- Learning model trees from evolving data streams, Data Mining Knowl. Discov.
- C4.5: Programs for Machine Learning
- Classification and regression trees (CART), Encycl. Ecol.
- Bagging predictors, Mach. Learn.
- When Networks Disagree: Ensemble Methods for Hybrid Neural Networks, Tech. Rep.
- Using random forest to learn imbalanced data, Univ. Calif. Publ. Bot.
- Enriched random forests, Bioinformatics