Abstract
For a Gaussian two-armed bandit, which arises when batch data processing is analyzed, the minimax risk limiting behavior is investigated as the control horizon N grows infinitely. The minimax risk is searched for as the Bayesian one computed with respect to the worst-case prior distribution. We show that the highest requirements are imposed on the control in the domain of "close” distributions where mathematical expectations of incomes differ by a quantity of the order of N−1/2. In the domain of "close” distributions, we obtain a recursive integro-difference equation for finding the Bayesian risk with respect to the worst-case prior distribution, in invariant form with control horizon one, and also a second-order partial differential equation in the limiting case. The results allow us to estimate the performance of batch processing. For example, the minimax risk corresponding to batch processing of data partitioned into 50 batches can be only 2% greater than its limiting value when the number of batches grows infinitely. In the case of a Bernoulli two-armed bandit, we show that optimal one-by-one data processing is not more efficient than batch processing as N grows infinitely.
Similar content being viewed by others
References
Kolnogorov, A.V., Gaussian Two-Armed Bandit and Optimization of Batch Data Processing, Probl. Peredachi Inf., 2018, vol. 54, no. 1, pp. 93–111 [Probl. Inf. Transm. (Engl. Transl.), 2018, vol. 54, no. 1, pp. 84–100].
Perchet, V., Rigollet, P., Chassang, S., and Snowberg, E., Batched Bandit Problems, Ann. Statist., 2016, vol. 44, no. 2, pp. 660–681.
Kolnogorov, A.V., Two-Armed Bandit Problem for Parallel Data Processing Systems, Probl. Peredachi Inf., 2012, vol. 48, no. 1, pp. 83–95 [Probl. Inf. Transm. (Engl. Transl.), 2012, vol. 48, no. 1, pp. 73–84].
Prokhorov, Yu.V. and Rozanov, Yu.A., Teoriya veroyatnostei: osnovnye poniatiya, predel’nye teoremy, sluchainye protsessy, Moscow: Nauka, 1987, 3rd ed. First edition translated under the title Probability Theory: Basic Concepts, Limit Theorems, Random Processes, Berlin: Springer, 1969.
Vogel, W., A Sequential Design for the Two-Armed Bandit, Ann. Math. Statist., 1960, vol. 31, no. 2, pp. 430–443.
Vogel, W., An Asymptotic Minimax Theorem for the Two-Armed Bandit Problem, Ann. Math. Statist., 1960, vol. 31, no. 2, pp. 444–451.
Lai, T.L., Levin, B., Robbins, H., and Siegmund, D., Sequential Medical Trials, Proc. Natl. Acad. Sci. USA, vol. 77, no. 6, pp. 3135–3138.
Lai, T.L. and Robbins, H., Asymptotically Efficient Adaptive Allocation Rules, Adv. in Appl. Math., 1985, vol. 6, no. 1, pp. 4–22.
Kaufmann, E., On Bayesian Index Policies for Sequential Resource Allocation, Ann. Statist., 2018, vol. 46, no. 2, pp. 842–865.
Kolnogorov, A.V., Parallel Design of Robust Control in the Stochastic Environment (The Two-Armed Bandit Problem), Avtomat. i Telemekh., 2012, vol. 73, no. 4, pp. 114–130 [Autom. Remote Control (Engl. Transl.), 2012, vol. 73, no. 4, pp. 689–701].
Kolnogorov, A.V., On a Limiting Description of Robust Parallel Control in a Random Environment, Avtomat. i Telemekh., 2015, vol. 767, no. 7, pp. 111–126 [Autom. Remote Control (Engl. Transl.), 2015, vol. 76, no. 7, pp. 1229–1241].
Samarskii, A.A., Teoriya raznostnykh skhem, Moscow: Nauka, 1989. Translated under the title The Theory of Difference Schemes, New York: Marcel Dekker, 2001.
Bather, J.A., The Minimax Risk for the Two-Armed Bandit Problem, Mathematical Learning Models—Theory and Algorithms, Herkenrath, U., Kalin, D., and Vogel, W., Eds., Lect. Notes Statist., vol. 20, New York: Springer-Verlag, 1983, pp. 1–11.
Acknowledgements
The author is grateful to a reviewer for the attention to the paper and valuable remarks.
Funding
Supported in part by the Russian Foundation for Basic Research, project no. 20-01-00062.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Kolnogorov, A. Gaussian Two-Armed Bandit: Limiting Description. Probl Inf Transm 56, 278–301 (2020). https://doi.org/10.1134/S0032946020030059
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1134/S0032946020030059