Gaussian Two-Armed Bandit: Limiting Description

Kolnogorov, A. V.

doi:10.1134/S0032946020030059

Automata Theory
Published: 19 October 2020

Problems of Information Transmission, Volume 56, pages 278–301 (2020)

Abstract

For a Gaussian two-armed bandit, which arises in the analysis of batch data processing, we investigate the limiting behavior of the minimax risk as the control horizon N grows without bound. The minimax risk is sought as the Bayesian risk computed with respect to the worst-case prior distribution. We show that the highest requirements are imposed on the control in the domain of "close" distributions, where the mathematical expectations of incomes differ by a quantity of order N^(−1/2). In this domain, we obtain a recursive integro-difference equation for the Bayesian risk with respect to the worst-case prior distribution, in an invariant form with control horizon equal to one, as well as a second-order partial differential equation in the limiting case. The results allow us to estimate the performance of batch processing. For example, the minimax risk for batch processing of data partitioned into 50 batches can be only 2% greater than its limiting value as the number of batches grows without bound. In the case of a Bernoulli two-armed bandit, we show that optimal one-by-one data processing is not more efficient than batch processing as N grows without bound.
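
To make the scaling in the abstract concrete, the following is a minimal Monte Carlo sketch, not the strategy or the equations from the paper. It assumes unit-variance Gaussian incomes, a gap between the two expected incomes of gap_scale · N^(−1/2), and a deliberately simple illustrative rule: split the first batch evenly between the arms, then give each later batch entirely to the arm with the higher sample mean so far. The function name `batch_regret` and all of its parameters are hypothetical. The returned value is the expected loss divided by N^(1/2), the normalization under which the abstract's "close" distributions are compared.

```python
import math
import random


def batch_regret(horizon, batches, gap_scale, runs=2000, seed=0):
    """Estimate the regret, divided by sqrt(horizon), of a simple batch rule.

    Assumptions (illustrative only): unit-variance Gaussian incomes,
    expected incomes +gap/2 and -gap/2 with gap = gap_scale / sqrt(horizon),
    and `horizon` divisible by `batches` with at least two pulls per batch.
    """
    rng = random.Random(seed)
    batch_size = horizon // batches
    gap = gap_scale / math.sqrt(horizon)
    means = (gap / 2, -gap / 2)  # arm 0 is the better arm when gap > 0
    total_loss = 0.0
    for _ in range(runs):
        pulls = [0, 0]
        sums = [0.0, 0.0]
        loss = 0.0
        for b in range(batches):
            if b == 0:
                # First batch: split evenly to get an estimate for both arms.
                half = batch_size // 2
                alloc = (half, batch_size - half)
            else:
                # Later batches: whole batch to the arm with higher sample mean.
                best = 0 if sums[0] / pulls[0] >= sums[1] / pulls[1] else 1
                alloc = (batch_size, 0) if best == 0 else (0, batch_size)
            for arm, n in enumerate(alloc):
                if n == 0:
                    continue
                # The sum of n incomes N(m, 1) is distributed as N(n*m, n).
                sums[arm] += rng.gauss(n * means[arm], math.sqrt(n))
                pulls[arm] += n
                # Expected loss from applying this arm n times.
                loss += n * (means[0] - means[arm])
        total_loss += loss
    return total_loss / runs / math.sqrt(horizon)


if __name__ == "__main__":
    for n in (1000, 4000, 16000):
        print(n, batch_regret(n, batches=50, gap_scale=3.0))
```

Because each batch is summarized by a single Gaussian draw, the cost of a run depends on the number of batches rather than on N. With the gap scaled as N^(−1/2), the printed values should be roughly comparable across horizons, which is the invariance the abstract refers to; the specific numbers depend on the illustrative rule chosen here and say nothing about the minimax strategy.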

Acknowledgements

The author is grateful to a reviewer for the attention to the paper and valuable remarks.

Funding

Supported in part by the Russian Foundation for Basic Research, project no. 20-01-00062.

Author information

Authors and Affiliations

Department of Applied Mathematics and Information Science, Yaroslav-the-Wise Novgorod State University, Novgorod, Russia
A. V. Kolnogorov

Corresponding author

Correspondence to A. V. Kolnogorov.

About this article

Cite this article

Kolnogorov, A. Gaussian Two-Armed Bandit: Limiting Description. Probl Inf Transm 56, 278–301 (2020). https://doi.org/10.1134/S0032946020030059

Received: 06 April 2020
Revised: 02 June 2020
Accepted: 02 June 2020
Published: 19 October 2020
Issue Date: July 2020
DOI: https://doi.org/10.1134/S0032946020030059
