Abstract—Methods for generating predictions online and in the form of probability distributions of future outcomes are considered. The difference between the probabilistic forecast (probability distribution) and the numerical outcome is measured using the loss function (scoring rule). In practical statistics, the continuous ranked probability score (CRPS) is often used to estimate the discrepancy between probabilistic forecasts and (quantitative) outcomes. The paper considers the case when several competing methods (experts) give their online predictions as distribution functions. An algorithm is proposed for online aggregation of these distribution functions. The performance bounds of the proposed algorithm are obtained in the form of a comparison of the cumulative loss of the algorithm and the loss of expert hypotheses. Unlike existing estimates, the proposed estimates do not depend on time. The results of numerical experiments illustrating the proposed methods are presented.
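To make the loss measure concrete, here is a minimal numerical sketch of the CRPS: given a forecast distribution function F on [a, b] and an outcome y, it integrates the squared difference between F and the step function of the outcome. The function name and the midpoint-rule discretization are illustrative choices, not the paper's implementation.

```python
import math

def crps(F, y, a=0.0, b=1.0, n=2000):
    """Continuous ranked probability score: the integral over [a, b] of
    (F(u) - 1_{u >= y})**2 du, approximated by the midpoint rule.
    F is the forecast distribution function, y the observed outcome."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        u = a + (k + 0.5) * h
        indicator = 1.0 if u >= y else 0.0
        total += (F(u) - indicator) ** 2 * h
    return total

# Uniform forecast on [0, 1] against the outcome y = 0.5: exact CRPS is 1/12.
print(crps(lambda u: u, 0.5))  # ≈ 0.08333
```

A degenerate forecast concentrated exactly at the outcome scores zero, which is the sense in which CRPS rewards sharp, well-calibrated distributions.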
Notes
Here \({{\mathbb{1}}_{\{ u \geqslant y\} }}\) = 1 if u ≥ y, and 0 otherwise.
Exact definitions can be found in Section 2.
Rules (5) or (6) can be employed (see below).
A distribution function is a nondecreasing function F defined on the interval [a, b] such that F(a) = 0 and F(b) = 1.
Rule (6) can be similarly employed.
A regret bound of O(ln(TN)) can be obtained for the corresponding algorithm by using a time-varying parameter α.
Note that the task of the aggregation algorithm is to adapt as quickly as possible to changes and to increase the weight of the currently leading model.
Funding
This work was supported by the Russian Science Foundation, project no. 20-01-00203.
Translated by A. Chikishev
SUBSTITUTION FUNCTION
For an arbitrary loss function \(\lambda (\gamma ,\omega )\) (γ ∈ [0, 1] and ω ∈ {0, 1}), we consider the parametric curve on the plane

$$\left( x(\gamma ),y(\gamma ) \right) = \left( e^{ - \eta \lambda (\gamma ,0)},\;e^{ - \eta \lambda (\gamma ,1)} \right),\quad \gamma \in [0,1].\quad\quad ({\text{A1}})$$
We consider the scenario in which this curve is concave. The concavity condition is written as

$$y''(\gamma )\,x'(\gamma ) - x''(\gamma )\,y'(\gamma ) \geqslant 0\quad\quad ({\text{A2}})$$
for all γ, where x(γ) = \({{e}^{{ - \eta \lambda (\gamma ,0)}}}\) and y(γ) = \({{e}^{{ - \eta \lambda (\gamma ,1)}}}\).
In particular, for quadratic loss function \(\lambda (\gamma ,\omega )\) = \({{(\gamma - \omega )}^{2}}\), we have x(γ) = \({{e}^{{ - \eta {{\gamma }^{2}}}}}\) and y(γ) = \({{e}^{{ - \eta {{{\left( {\gamma - 1} \right)}}^{2}}}}}\).
After simple transformations, inequality (A2) is equivalent to the inequality

$$2\eta \gamma (1 - \gamma ) \leqslant 1.$$
The quantity \(\gamma (1 - \gamma )\) attains its maximum value of 1/4 on 0 ≤ γ ≤ 1, so the concavity condition is satisfied for all γ whenever 0 < η ≤ 2.
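As a sanity check, this concavity bound can be verified numerically. The sketch below (illustrative code, not from the paper) evaluates the sign of the parametric concavity expression \(y''(\gamma )x'(\gamma ) - x''(\gamma )y'(\gamma )\) for the square-loss curve x(γ) = e^{−ηγ²}, y(γ) = e^{−η(γ−1)²}: it is nonnegative everywhere for η = 2 and goes negative for η = 3, in line with the bound 0 < η ≤ 2.

```python
import math

def concavity_term(gamma, eta):
    """y''(g) * x'(g) - x''(g) * y'(g) for the square-loss curve
    x(g) = exp(-eta * g**2), y(g) = exp(-eta * (g - 1)**2).
    Since x'(g) < 0 on (0, 1], the curve is concave iff this is >= 0."""
    x = math.exp(-eta * gamma ** 2)
    y = math.exp(-eta * (gamma - 1) ** 2)
    xp, yp = -2 * eta * gamma * x, -2 * eta * (gamma - 1) * y
    xpp = (4 * eta ** 2 * gamma ** 2 - 2 * eta) * x
    ypp = (4 * eta ** 2 * (gamma - 1) ** 2 - 2 * eta) * y
    return ypp * xp - xpp * yp  # simplifies to 4*eta**2 * (1 - 2*eta*g*(1-g)) * x * y

grid = [k / 100 for k in range(101)]
assert all(concavity_term(g, 2.0) >= -1e-12 for g in grid)  # eta = 2: concave
assert any(concavity_term(g, 3.0) < 0 for g in grid)        # eta = 3: concavity breaks
```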
The η-mixability condition means that, for any distribution w = \(({{w}_{1}},...,{{w}_{N}})\) on a set of N experts and any forecasts f = \(({{f}_{1}},...,{{f}_{N}})\), we can find γ* for which the inequalities

$$e^{ - \eta \lambda (\gamma ^{ *},\omega )} \geqslant \sum\limits_{i = 1}^N w_i e^{ - \eta \lambda (f_i,\omega )}\quad\quad ({\text{A3}})$$
are satisfied at ω = 0, 1. The points \(\left( {{{e}^{{ - \eta \lambda ({{f}_{i}},0)}}},{{e}^{{ - \eta \lambda ({{f}_{i}},1)}}}} \right)\), i = 1, …, N, lie on curve (A1), and their convex combination (point M) lies inside the convex region bounded by this curve. Condition (A3) means that the abscissa and the ordinate of the point N = \(({{e}^{{ - \eta \lambda (\gamma ^{ *},0)}}},{{e}^{{ - \eta \lambda (\gamma ^{ *},1)}}})\) are no less than the abscissa and the ordinate of point M. To find point N, the ray drawn from the origin through point M is intersected with curve (A1) (see Fig. 1). The forecast γ* is then calculated from the condition

$$\frac{e^{ - \eta \lambda (\gamma ^{ *},1)}}{e^{ - \eta \lambda (\gamma ^{ *},0)}} = \frac{\sum\nolimits_{i = 1}^N w_i e^{ - \eta \lambda (f_i,1)}}{\sum\nolimits_{i = 1}^N w_i e^{ - \eta \lambda (f_i,0)}}.\quad\quad ({\text{A4}})$$
For a quadratic loss function, equality (A4) yields the following expression for γ*:

$$\gamma ^{ *} = \frac{1}{2} + \frac{1}{2\eta }\ln \frac{\sum\nolimits_{i = 1}^N w_i e^{ - \eta (f_i - 1)^2}}{\sum\nolimits_{i = 1}^N w_i e^{ - \eta f_i^2}}.$$
The mixability condition for the quadratic loss function makes it possible to use η = 2.
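The resulting aggregation step is straightforward to implement. The sketch below (hypothetical function name, η = 2 as the mixability condition allows) computes γ* for the quadratic loss from the weighted exponential mixtures at ω = 1 and ω = 0, and checks that the loss of γ* never exceeds the η-mixture loss for either outcome.

```python
import math

def substitution(weights, forecasts, eta=2.0):
    """Aggregated forecast gamma* for the quadratic loss:
    gamma* = 1/2 + (1 / (2*eta)) * ln(num / den), where num and den are the
    weighted exponential mixtures of the losses at omega = 1 and omega = 0."""
    num = sum(w * math.exp(-eta * (f - 1) ** 2) for w, f in zip(weights, forecasts))
    den = sum(w * math.exp(-eta * f ** 2) for w, f in zip(weights, forecasts))
    return 0.5 + math.log(num / den) / (2 * eta)

w, f, eta = [0.9, 0.1], [0.1, 0.9], 2.0
g = substitution(w, f, eta)
# Mixability check: the loss of gamma* does not exceed the eta-mixture loss.
for omega in (0.0, 1.0):
    mix = -math.log(sum(wi * math.exp(-eta * (fi - omega) ** 2)
                        for wi, fi in zip(w, f))) / eta
    assert (g - omega) ** 2 <= mix + 1e-12
```

Note that when all weight sits on one expert, γ* reduces to that expert's forecast, as it should.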
It can be easily shown that, for any ω ∈ [0, 1], function f(γ) = \({{e}^{{ - \eta {{{(\gamma - \omega )}}^{2}}}}}\) is concave with respect to γ ∈ [0, 1] for 0 < η < 1/2. In this case, quantity γ* = \(\sum\nolimits_{i = 1}^N {{{w}_{i}}{{f}_{i}}} \) satisfies inequality (A3) at any 0 < η < 1/2 in accordance with the definition of concavity.
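For this small-η regime, the sketch below (illustrative weights and forecasts, not from the paper) confirms numerically that the plain weighted mean of the forecasts satisfies the mixability inequality at η = 0.4 < 1/2.

```python
import math

# With eta = 0.4 (< 1/2), the plain weighted mean of the expert forecasts
# already satisfies the mixability inequality, by concavity of
# exp(-eta * (gamma - omega)**2) in gamma (Jensen's inequality).
eta, w, f = 0.4, [0.9, 0.1], [0.1, 0.9]
gamma = sum(wi * fi for wi, fi in zip(w, f))  # weighted mean forecast
for omega in (0.0, 1.0):
    mix = -math.log(sum(wi * math.exp(-eta * (fi - omega) ** 2)
                        for wi, fi in zip(w, f))) / eta
    assert (gamma - omega) ** 2 <= mix + 1e-12
```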
V’yugin, V.V., Trunov, V.G. Online Aggregation of Probabilistic Forecasts Based on the Continuous Ranked Probability Score. J. Commun. Technol. Electron. 65, 662–676 (2020). https://doi.org/10.1134/S1064226920060285