A Markovian Stackelberg game approach for computing an optimal dynamic mechanism

Clempner, Julio B.

doi:10.1007/s40314-021-01578-4

A Markovian Stackelberg game approach for computing an optimal dynamic mechanism

Published: 14 July 2021

Volume 40, article number 186, (2021)
Cite this article

Computational and Applied Mathematics Aims and scope Submit manuscript

Julio B. Clempner ORCID: orcid.org/0000-0002-5918-4671¹

209 Accesses
5 Citations
Explore all metrics

Abstract

This paper presents a dynamic Bayesian–Stackelberg incentive-compatible mechanism, in which multiple agents observe private information and learn their behavior through a sequence of interactions in a repeated game, for a class of controllable homogeneous Markov games. We assume that the leaders can ex ante commit to their disclosure strategy and mechanism, and affect followers’ actions. Along the paper, leaders possess and benefit from some commitment leadership, which describes the distinctive nature of a Stackelberg game. In this dynamics, leaders and followers together are in a Stackelberg game where actions are taken in a sequential way in the two layers of the hierarchy, but independently leaders and followers are involved non-cooperativelyin two (Nash) games where actions are taken simultaneously. This game considers an ex-ante incentive-compatible mechanism, which in equilibrium maximizes the reward while the agents are learning their actions over a countable number of periods. The formulation of the problem considers a Bayesian–Stackelberg equilibrium in the context of Reinforcement Learning. We propose an algorithm supported by the extraproximal method and show that it converges. The Tikhonov’s regularization technique is employed for ensuring the existence and uniqueness of the Bayesian–Stackelberg equilibrium. We show and guarantee the convergence of the method to a single incentive-compatible mechanism. We derive the analytical expressions for computing the mechanism in a Stackelberg game, which is one of the main results of this work. We demonstrate the efficiency of the method by an experiment drawn from an electric power problem represented by an oligopolistic market structure dominated by a small number of large sellers (oligopolists).

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

Stackelberg risk preference design

Article 02 April 2024

Shutian Liu & Quanyan Zhu

A strategic approach to bankruptcy problems based on the TAL family of rules

Article Open access 11 March 2024

Dirck Bouwhuis, Peter Borm & Ruud Hendrickx

Notes

Note that Eq. (16) is the regularized version of Eq. (15), which is a result of applying the Lagrange method over Eq. (14).

References

Antipin AS (2005) An extraproximal method for solving equilibrium programming problems and games. Computational Mathematics and Mathematical Physics 45(11), 1893–1914
MathSciNet Google Scholar
Asiain E, Clempner JB, Poznyak AS (2019) Controller exploitation-exploration: A reinforcement learning architecture. Soft Comput 23(11):3591–3604
Article Google Scholar
Athey S, Segal I (2013) An efficient dynamic mechanism. Econometrica 81(6), 2463–2485
Article MathSciNet Google Scholar
Baron D, Besanko D (1984) Regulation and information in a continuing relationship. Information Economics and Policy 1:447–470
Article Google Scholar
Battaglini M (2005) Long-term contracting with Markovian consumers. Am Econ Rev 95(3):637–658
Article Google Scholar
Bergemann D, Said M (2011) Wiley Encyclopedia Of Operations Research And Management Science. Wiley, Hoboken, pp 1511–1522
Google Scholar
Bergemann D, Välimäki J (2008) New Palgrave Dictionary of Economics. Palgrave Macmillan, chap Bandit Problems, New York, pp 336–340
Book Google Scholar
Bergemann D, Välimäki J (2010) The dynamic pivot mechanism. Econometrica 78(2), 771–789
Article MathSciNet Google Scholar
Berry CA, Hobbs BF, Meroney WA, O’Neill RP, Stewart WR Jr (1999) Understanding how market power can arise in network competition: a game theoretic approach. Utili Policy 8:139–158
Besanko D (1985) Multiperiod contracts between principal and agent with adverse selection. Economics Letters 17:33–37
Article Google Scholar
Board S (2007) Selling options”, journal of economic theory. Journal of Economic Theory 136:324–340.
Article MathSciNet Google Scholar
Board S, Skrzypacz A (2016) Revenue management with forward-looking buyers forward-looking buyers. Journal of Political Economy. 124(4):1046–1087
Article Google Scholar
Clempner JB, Poznyak AS (2016a) Analyzing an optimistic attitude for the leader firm in duopoly models: A strong stackelberg equilibrium based on a lyapunov game theory approach. Economic Computation & Economic Cybernetics Studies & Research 50(4):41–60
Google Scholar
Clempner JB, Poznyak AS (2016b) Convergence analysis for pure stationary strategies in repeated potential games: nash, lyapunov and correlated equilibria. Expert Syst Appl 46:474–484
Article Google Scholar
Clempner JB, Poznyak AS (2018a) A tikhonov regularization parameter approach for solving lagrange constrained optimization problems. Eng Optim 50(11):1996–2012
Article MathSciNet Google Scholar
Clempner JB, Poznyak AS (2018b) A tikhonov regularized penalty function approach for solving polylinear programming problems. J Comput Appl Math 328:267–286
Article MathSciNet Google Scholar
Clempner JB, Poznyak AS (2019a) Observer and control design in partially observable finite Markov chains. Automatica (To be publushed)
Clempner JB, Poznyak AS (2019b) Observer and control design in partially observable finite markov chains. Automatica 110:110
Article MathSciNet Google Scholar
Clempner JB, Poznyak AS (2020) A nucleus for bayesian partially observable markov games: Joint observer and mechanism design. Engineering Applications of Artificial Intelligence 95:103876
Article Google Scholar
Clempner JB, Poznyak AS (2021) Analytical method for mechanism design in partiallyobservable markov games. Mathematics (To be published)
Courty P, Li H (2000) Sequential screening. Review of Economic Studies 67:697–717
Article MathSciNet Google Scholar
Esö P, Szentes B (2007) Optimal information disclosure in auctions and the handicap auction. Review of Economic Studies 74(3), 705–731
Article MathSciNet Google Scholar
Garg D, Narahari Y (2008) Mechanism design for single leader stackelberg problems and application to procurement auction design. IEEE Transactions On Automation Science And Engineering 5(3):377–393
Article Google Scholar
Gershkov A, Moldovanu B (2009) Dynamic revenue maximization with heterogenous objects: A mechanism design approach. American Economic Journal: Microeconomics 1(2):168–198
Article Google Scholar
Golosov M, Skreta V, Tsyvinski A, Wilson A (2014) Dynamic strategic information transmission. Journal of Economic Theory 151:304–341
Article MathSciNet Google Scholar
Hartline JD, Lucier B (2015) Non-optimal mechanism design. American Economic Review 105(10):3102–3124
Article Google Scholar
Hobbs B, Metzler C, Pang J (2000) Strategic gaming analysis for electric power networks: an mpec approach. IEEE Trans Power Syst 15:638–645
Article Google Scholar
Hu M, Fukushima M (2011) Variational inequality formulation of a class of multi-leader-follower games. Journal of Optimization Theory and Applications 151:455–473
Article MathSciNet Google Scholar
Hurwicz L (1960) Optimality and informational efficiency in resource allocation processes. In: Arrow KJ, Karlin S, Suppes P (eds) Mathematical methods in the social sciences. Stanford University Press, California, pp 27–46
Google Scholar
Kakade S, Lobel I, Nazerzadeh H (2013) Optimal dynamic mechanism design and the virtual-pivot mechanism. Operations Research 61(4), 837–854
Article MathSciNet Google Scholar
Myerson RB (1983) Mechanism design by an informed principal. Econometrica 51(6), 1767–1797
Article MathSciNet Google Scholar
Myerson RB (1989) Allocation, information and markets, The New Palgrave. Palgrave Macmillan, London, pp 191–206
Book Google Scholar
Pang J, Fukushima M (2005) Quasi-variational inequalities, generalized nash equilibria, and multileader-follower games. Computational Management Science 2:21–56
Article Google Scholar
Pavan A, Segal I, Toikka J (2014) Dynamic mechanism design: A myersonian approach. Econometrica 82(2):601–653
Article MathSciNet Google Scholar
Saari DG (1988) On the types of information and mechanism design. Journal of Computational and Applied Mathematics 22(2–3), 231–242
Article MathSciNet Google Scholar
Solis C, Clempner JB, Poznyak AS (2019) Robust extremum seeking for a second order uncertain plant using a sliding mode controller. International Journal of Applied Mathematics and Computer Science 29(4), 703–712
Article MathSciNet Google Scholar
Trejo KK, Clempner JB, Poznyak AS (2015a) Computing the stackelberg/nash equilibria using the extraproximal method: Convergence analysis and implementation details for markov chains games. Int J Appl Math Comput Sci 25(2):337–351
Article MathSciNet Google Scholar
Trejo KK, Clempner JB, Poznyak AS (2015b) A stackelberg security game with random strategies based on the extraproximal theoretic approach. Eng Appl Artif Intell 37:145–153
Article Google Scholar
Wang X, Chin KS, Yin H (2011) Design of optimal double auction mechanism with multi-objectives. Expert Systems with Applications 38:13749–13756
Google Scholar

Download references

Author information

Authors and Affiliations

Escuela Superior de Física y Matemáticas (School of Physics and Mathematics), Instituto Politécnico Nacional (National Polytechnic Institute), Building 9, Av. Instituto Politécnico Nacional San Pedro Zacatenco, Gustavo A. Madero, 07738, Mexico City, Mexico
Julio B. Clempner

Authors

Julio B. Clempner
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Julio B. Clempner.

Additional information

Communicated by Jorge Zubelli.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Proof of Theorem 2

.

Following (Antipin 2005; Trejo et al. 2015a), let us consider $\eta =\gamma , \text { } z={\tilde{w}},\text { }x= {\tilde{v}}_{n}$ and $z^{*} =\hat{v}_{n}$

$$\begin{aligned} \begin{array}{c} r(z)={\mathcal {L}}_{\delta }({\tilde{w}},{\tilde{v}}_{n}), \text { } r(z^{*})={\mathcal {L}}_{\delta }(\hat{v}_{n} ,{\tilde{v}}_{n}) \end{array} \end{aligned}$$

we obtain

$$\begin{aligned} \begin{array}{c} \dfrac{1}{2}\Vert \hat{v}_{n} -{\widetilde{v}}_{n} \Vert ^{2} +\gamma {\mathcal {L}}_{\delta } (\hat{v}_{n},{\tilde{v}}_{n} )\le \dfrac{1}{2} \Vert {\tilde{w}}-{\tilde{v}}_{n}\Vert ^{2} + \gamma {\mathcal {L}}_{\delta }({\tilde{w}},{\tilde{v}}_{n} )-\dfrac{1}{2} \Vert {\tilde{w}}-\hat{v}_{n}\Vert ^{2} \end{array} \end{aligned}$$

(34)

Now, considering $\eta =\gamma , \small z={\tilde{w}},\text { }x={\tilde{v}}_{n}$ and $ z^{*}={\tilde{v}}_{n+1}$

$$\begin{aligned} \begin{array}{c} r(z)={\mathcal {L}}_{\delta }({\tilde{w}},\hat{v}_{n} ),\text { } r(z^{*})={\mathcal {L}}_{\delta } ({\tilde{v}}_{n+1},\hat{v}_{n}) \end{array} \end{aligned}$$

we obtain

$$\begin{aligned} \begin{array}{c} \dfrac{1}{2}\Vert {\tilde{v}}_{n+1} -{\tilde{v}}_{n}\Vert ^{2}+\gamma {\mathcal {L}}_{\delta } ({\tilde{v}}_{n+1},\hat{v}_{n} )\le \dfrac{1}{2} \Vert {\tilde{w}}-{\tilde{v}}_{n} \Vert ^{2} +\gamma {\mathcal {L}}_{\delta } ({\tilde{w}},\hat{v}_{n} )-\dfrac{1}{2} \Vert {\tilde{w}}-{\tilde{v}}_{n+1} \Vert ^{2} \end{array} \quad \end{aligned}$$

(35)

Choosing ${{\tilde{w}}}={\tilde{v}}_{n+1}$ in (34) and ${{\tilde{w}} }= \hat{v}_{n}$ in (35) we obtain

$$\begin{aligned}&\begin{array}{c} \dfrac{1}{2}\Vert \hat{v}_{n}-{\tilde{v}}_{n} \Vert ^{2} +\gamma {\mathcal {L}}_{\delta } (\hat{v}_{n},{\tilde{v}}_{n})\le \dfrac{1}{2} \Vert {\tilde{v}}_{n+1}-{\tilde{v}}_{n} \Vert ^{2}+ \gamma {\mathcal {L}}_{\delta }({\tilde{v}}_{n+1},{\tilde{v}}_{n} )-\dfrac{1}{2} \Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert ^{2} \end{array} \nonumber \\\end{aligned}$$

(36)

$$\begin{aligned}&\begin{array}{c} \dfrac{1}{2}\Vert {\tilde{v}}_{n+1} -{\tilde{v}}_{n}\Vert ^{2}+ \gamma {\mathcal {L}}_{\delta } ({\tilde{v}}_{n+1},\hat{v}_{n}) \le \dfrac{1}{2} \Vert \hat{v}_{n}-{\tilde{v}}_{n} \Vert ^{2} +\gamma {\mathcal {L}}_{\delta } (\hat{v}_{n},\hat{v}_{n}) -\dfrac{1}{2} \Vert \hat{v}_{n} -{\tilde{v}}_{n+1} \Vert ^{2} \end{array} \nonumber \\ \end{aligned}$$

(37)

Adding (36) with (37) and considering $ {\tilde{w}}+h={\tilde{v}}_{n+1}\text { }{\tilde{w}}=\hat{v}_{n},\text { }{\tilde{v}}+q={\tilde{v}}_{n},\text { }{\tilde{v}}=\hat{ v}_{n}, h={\tilde{v}}_{n+1}-\hat{v}_{n}$ and $q={\tilde{v}}_{n} -\hat{v}_{n}$ then, we have

$$\begin{aligned} \begin{array}{l} \Vert {\tilde{v}}_{n+1}-\hat{v}_{n}\Vert ^{2}\le \gamma [{\mathcal {L}}_{\delta }({\tilde{v}}_{n+1},{\tilde{v}}_{n} )-{\mathcal {L}}_{\delta } (\hat{v}_{n},{\tilde{v}}_{n} )] \\ \quad -\gamma [{\mathcal {L}}_{\delta }({\tilde{v}}_{n+1} , \hat{v}_{n})-{\mathcal {L}}_{\delta } (\hat{v}_{n} ,\hat{v}_{n})]\le \gamma L\Vert {\tilde{v}} _{n+1}-\widehat{v}_{n} \Vert \Vert {\tilde{v}}_{n}-\hat{v}_{n} \Vert \end{array} \end{aligned}$$

which implies

$$\begin{aligned} \Vert {\tilde{v}}_{n+1}- \hat{v}_{n}\Vert \le \gamma L\Vert {\tilde{v}}_{n}- \hat{v}_{n}\Vert \end{aligned}$$

(38)

Now, considering ${\tilde{w}}={\tilde{v}}_{n+1}$ in (34) and ${\tilde{w}}= {\tilde{v}}_{\delta }^{*}$ in (35) we have

$$\begin{aligned} \begin{array}{c} \dfrac{1}{2}\Vert \hat{v}_{n}-{\tilde{v}}_{n} \Vert ^{2}+\gamma {\mathcal {L}}_{\delta }(\hat{v}_{n},{\tilde{v}}_{n} )\le \frac{1}{2} \Vert {\tilde{v}}_{n+1} -{\tilde{v}} _{n} \Vert ^{2}+ \gamma {\mathcal {L}}_{\delta }({\tilde{v}}_{n+1},{\tilde{v}}_{n})-\dfrac{1}{2}\Vert {\tilde{v}}_{n+1}-\hat{v}_{n}\Vert ^{2} \end{array} \\ \begin{array}{c} \dfrac{1}{2}\Vert {\tilde{v}}_{n+1}-{\tilde{v}}_{n}\Vert ^{2}+\gamma {\mathcal {L}}_{\delta }({\tilde{v}}_{n+1} ,\hat{v}_{n} )\le \frac{1}{2} \Vert {\tilde{v}}_{\delta }^{*} -{\tilde{v}}_{n} \Vert ^{2} + \gamma {\mathcal {L}}_{\delta }({\tilde{v}}_{\delta }^{*},\hat{v}_{n})-\frac{1}{2}\Vert {\tilde{v}}_{\delta }^{*} -{\tilde{v}}_{n+1} \Vert ^{2} \end{array} \end{aligned}$$

Adding the previous inequalities and multiplying by two we obtain

$$\begin{aligned} \begin{array}{l} \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2}+\Vert {\tilde{v}}_{n+1} -\hat{v} _{n}\Vert ^{2}+\Vert \hat{v}_{n}-{\tilde{v}}_{n}\Vert ^{2}-2\gamma {\mathcal {L}}_{\delta }({\tilde{v}}_{\delta }^{*},\hat{v}_{n})\\ \quad +2\gamma [{\mathcal {L}} _{\delta }( {\tilde{v}}_{n+1},\hat{v}_{n} ) + {\mathcal {L}}_{\delta }(\hat{v}_{n},{\tilde{v}}_{n})-{\mathcal {L}}_{\delta }({\tilde{v}}_{n+1},{\tilde{v}}_{n} )] \text { }\le \text { }\Vert {\tilde{v}}_{\delta }^{*} -{\tilde{v}}_{n}\Vert ^{2} \end{array} \end{aligned}$$

Adding and subtracting the term ${\mathcal {L}}_{\delta }(\hat{v}_{n}, \hat{v}_{n})$ we get

$$\begin{aligned} \begin{array}{l} \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2}+\Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert ^{2}+\Vert \hat{v}_{n} -{\tilde{v}}_{n} \Vert ^{2}+ 2\gamma \left[ {\mathcal {L}}_{\delta }(\hat{v}_{n},\hat{v}_{n})-{\mathcal {L}}_{\delta }({\tilde{v}}_{\delta }^{*},\hat{v}_{n})\right] \\ \quad + 2\gamma \left[ {\mathcal {L}}_{\delta }({\tilde{v}}_{n+1},\hat{v}_{n}) -{\mathcal {L}}_{\delta }(\hat{v}_{n},\hat{v}_{n})+{\mathcal {L}}_{\delta } (\hat{v}_{n}, {\tilde{v}}_{n} )-{\mathcal {L}}_{\delta }({\tilde{v}}_{n+1},{\tilde{v}}_{n} )\right] \le \text { }\Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2} \end{array} \end{aligned}$$

Considering ${\tilde{w}}+h={\tilde{v}}_{n+1}$, ${\tilde{w}}=\hat{v}_{n}$, ${\tilde{v}}+k= {\tilde{v}}_{n}$ and ${\tilde{v}}=\hat{v}_{n}$ we have $h={\tilde{v}}_{n+1}- \hat{v}_{n}$ and $k={\tilde{v}}_{n}-\hat{v}_{n}$, then the resulting equation is as follows

$$\begin{aligned} \begin{array}{l} \Vert {\tilde{v}}_{\delta }^{*} -{\tilde{v}}_{n+1}\Vert ^{2}+\Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert ^{2}+\Vert \hat{v}_{n}-{\tilde{v}}_{n} \Vert ^{2}+ 2\gamma \left[ {\mathcal {L}}_{\delta } (\hat{v}_{n},\hat{v}_{n})-{\mathcal {L}}_{\delta }({\tilde{v}}_{\delta }^{*},\hat{v}_{n})\right] \\ \quad - 2\gamma L\Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert \Vert {\tilde{v}}_{n} -\hat{v}_{n} \Vert \le \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2} \end{array} \end{aligned}$$

Using Eq. (38) in the last term on the left-hand side and given the strict convexity of $ {\mathcal {L}}_{\delta }$ where

$$\begin{aligned} {\mathcal {L}}_{\delta }(\hat{v}_{n},\hat{v}_{n})-{\mathcal {L}}_{\delta }( {\tilde{v}}_{\delta }^{*},\hat{v}_{n})\ge \delta \Vert \hat{v}_{n} - {\tilde{v}}_{\delta }^{*}\Vert ^{2} \end{aligned}$$

we obtain

$$\begin{aligned}&\Vert {\tilde{v}}_{\delta }^{*} -{\tilde{v}}_{n+1}\Vert ^{2} +\Vert {\tilde{v}}_{n+1}-\hat{v}_{n}\Vert ^{2}+2\gamma \delta \Vert \hat{v}_{n}-{\tilde{v}}_{\delta }^{*}\Vert ^{2}\\&\qquad +\left( 1-2\gamma ^{2}L^{2}\right) \Vert {\tilde{v}}_{n} -\hat{v}_{n}\Vert ^{2} \le \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2} \end{aligned}$$

Applying the identity $2\langle a-c,c-b\rangle = \Vert a-b\Vert ^{2}- \Vert a-c\Vert ^{2} -\Vert c-b\Vert ^{2}$ with $a=\hat{v}_{n}$, $b={\tilde{v}}_{\delta }^{*}$ and $c={\tilde{v}}_{n},$ to the left-hand side of the last inequality we have

$$\begin{aligned} \begin{array}{l} \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2}+\Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert ^{2}+\left( 1-2\gamma ^{2}L^{2}\right) \Vert {\tilde{v}}_{n}-\hat{v}_{n} \Vert ^{2} + 2\gamma \delta [2\langle \hat{v}_{n}-{\tilde{v}}_{n},{\tilde{v}}_{n}-{\tilde{v}}_{\delta }^{*}\rangle ]\\ \quad +\Vert {\tilde{v}}_{n} -\hat{v}_{n}\Vert ^{2}+ \Vert {\tilde{v}}_{n}-{\tilde{v}}_{\delta }^{*} \Vert ^{2}= \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2} +\Vert {\tilde{v}}_{n+1} -\hat{v}_{n} \Vert ^{2} \\ \qquad + (1+2\gamma \delta -2\gamma ^{2}L^{2} )\Vert {\tilde{v}}_{n}-\hat{v}_{n}\Vert ^{2}\\ \quad +4\gamma \delta \langle \hat{v}_{n}-{\tilde{v}}_{n},{\tilde{v}}_{n}-{\tilde{v}}_{\delta }^{*} \rangle +2\gamma \delta \Vert {\tilde{v}}_{n}-{\tilde{v}}_{\delta }^{*}\Vert ^{2}\le \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2} \end{array} \end{aligned}$$

Let $p=1+2 \gamma \delta -2 \gamma ^{2}L^{2}$ and considering the square form of the third and fourth terms yields

$$\begin{aligned}&\Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2}+ \Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert ^{2}+ p\Vert {\tilde{v}}_{n}-\hat{v}_{n}\Vert ^{2}+ 4\gamma \delta \langle \hat{v}_{n}-{\tilde{v}}_{n},{\tilde{v}}_{n}-{\tilde{v}}_{\delta }^{*}\rangle \\&\quad +\dfrac{(2\gamma \delta )^{2}}{p}\Vert {\tilde{v}}_{n} -{\tilde{v}}_{\delta }^{*}\Vert ^{2}-\dfrac{(2\gamma \delta )^{2}}{p}\Vert {\tilde{v}}_{n} -{\tilde{v}}_{\delta }^{*}\Vert ^{2}+ 2\gamma \delta \Vert {\tilde{v}}_{n} -{\tilde{v}}_{\delta }^{*}\Vert ^{2}\le \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2} \end{aligned}$$

and

$$\begin{aligned}&\Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2}+ \Vert {\tilde{v}}_{n+1} -\hat{v}_{n}\Vert ^{2} + \left\| \sqrt{ p}({\tilde{v}}_{n}-\hat{v}_{n})+ \frac{ 2\gamma \delta }{\sqrt{p}}({\tilde{v}}_{n}-{\tilde{v}}_{\delta }^{*})\right\| ^{2}\\&\quad \le \left( 1-2\gamma \delta +\frac{(2\gamma \delta )^{2}}{p}\right) \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2} \end{aligned}$$

considering $k=1-2\gamma \delta +\dfrac{(2\gamma \delta )^{2}}{p}\in \left( 0,1\right) $ we have that

$$\begin{aligned} \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n+1}\Vert ^{2}\le k \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{n}\Vert ^{2}\le k^{n+1} \Vert {\tilde{v}}_{\delta }^{*}-{\tilde{v}}_{0}\Vert ^{2}\underset{n\rightarrow \infty }{\rightarrow }0. \end{aligned}$$

q.e.d..

Rights and permissions

Reprints and permissions

About this article

Cite this article

Clempner, J.B. A Markovian Stackelberg game approach for computing an optimal dynamic mechanism. Comp. Appl. Math. 40, 186 (2021). https://doi.org/10.1007/s40314-021-01578-4

Download citation

Received: 21 February 2021
Revised: 13 June 2021
Accepted: 05 July 2021
Published: 14 July 2021
DOI: https://doi.org/10.1007/s40314-021-01578-4

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Markovian Stackelberg game approach for computing an optimal dynamic mechanism

Abstract

Access this article

Similar content being viewed by others

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

Stackelberg risk preference design

A strategic approach to bankruptcy problems based on the TAL family of rules

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

A Proof of Theorem 2

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

A Markovian Stackelberg game approach for computing an optimal dynamic mechanism

Abstract

Access this article

Similar content being viewed by others

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

Stackelberg risk preference design

A strategic approach to bankruptcy problems based on the TAL family of rules

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

A Proof of Theorem 2

A Proof of Theorem 2

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation