Optimal exploitation for hybrid systems of renewable resources under partial observation

https://doi.org/10.1016/j.nahs.2021.101013

Abstract

This work focuses on optimal controls for hybrid systems of renewable resources in random environments. We propose a new formulation to treat the optimal exploitation with harvesting and renewing. The random environments are modeled by a Markov chain, which is hidden and can be observed only in a Gaussian white noise. We use the Wonham filter to estimate the state of the Markov chain from the observable process. Then we formulate a harvesting–renewing model under partial observation. The Markov chain approximation method is used to find a numerical approximation of the value function and optimal policies. Our work takes into account natural aspects of the resource exploitation in practice: interacting resources, switching environment, renewing and partial observation. Numerical examples are provided to demonstrate the results and explore new phenomena arising from new features in the proposed model.

Introduction

This work focuses on the management of renewable resources that are represented by controlled hybrid stochastic differential equations. Although controlled hybrid diffusions have been well studied, work on controlled stochastic processes for the exploitation of resources seems to be relatively limited. Mathematically, the class of stochastic control problems we are interested in has been studied in various settings for different domains of application. To mention just a few, we refer to [1], [2] for optimal exploitation of a single resource with stochastic population dynamics; [3], [4], [5], [6] for optimal harvesting problems of single-species ecosystems in random environments; and [7], [8], [9], [10] for interacting populations. The reader can also find related works on optimal dividend strategies in [11], [12]. Motivated by recent developments in renewable resource exploitation and related areas, in contrast to the available results in the literature, we propose a new model to treat a generalized situation with incomplete information.

Consider a two-component process (X(t),α(t)) given by dX(t)=[b(X(t),α(t))−C(t)]dt+σ(X(t),α(t))dw(t), where b(·), σ(·) are suitable real-valued functions, w(·) is a real-valued Brownian motion, α(·) is a finite-state Markov chain, and C(·) is the control component. Because the example is for motivation only, we defer the discussion of the precise formulation and conditions needed to the next section. Suppose X(t) is the size of a renewable resource at time t. Typical examples of renewable resources are animals, water, forests, and food, where food and fiber are renewable agricultural resources. Harvesting–renewing policies are introduced to derive financial benefit as well as to ensure a certain sustainability of the resource. The goal is to find a policy C(·) that maximizes the expected reward. From the biological point of view, X(t) is the population size of a species at time t. The manager can either harvest part of the population and sell it or seed part of the species to maintain the ecosystem. Related examples are fishery [13] and forest replanting [14]. Thus, the problem under consideration is important for the establishment of ecologically, environmentally, and economically reasonable wildlife management; see also [1], [15] and the references therein.
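As a concrete illustration, dynamics of this type can be simulated by an Euler–Maruyama scheme with a jump-simulated regime process. The logistic drift, multiplicative noise, all parameter values, and the constant harvesting rate C below are assumptions chosen only for illustration; they are not the coefficients used later in the paper.

```python
import numpy as np

# Euler-Maruyama simulation of the controlled hybrid dynamics
# dX = [b(X, alpha) - C] dt + sigma(X, alpha) dw, where alpha is a
# two-state Markov chain.  All coefficients below are illustrative
# assumptions, not the paper's actual model parameters.

rng = np.random.default_rng(0)

r = {1: 1.0, 2: 0.4}       # growth rate in each regime (assumed)
K = {1: 2.0, 2: 1.0}       # carrying capacity in each regime (assumed)
sig = {1: 0.2, 2: 0.3}     # noise intensity in each regime (assumed)
rate = {1: 0.5, 2: 0.8}    # rate of leaving each regime (assumed)

def b(x, a):               # regime-dependent logistic drift
    return r[a] * x * (1.0 - x / K[a])

def simulate(x0=0.5, C=0.1, T=10.0, dt=1e-3):
    n = round(T / dt)
    x, a = x0, 1
    path = np.empty(n)
    for k in range(n):
        if rng.random() < rate[a] * dt:      # regime switch over [t, t+dt]
            a = 3 - a                        # toggle between states 1 and 2
        dw = rng.normal(0.0, np.sqrt(dt))
        x = max(x + (b(x, a) - C) * dt + sig[a] * x * dw, 0.0)
        path[k] = x
    return path

path = simulate()
print(path[-1])  # terminal resource size along one sample path
```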

In treating renewable resource exploitation, most of the aforementioned papers are devoted to a single resource evolving according to a logistic stochastic differential equation, and the analysis exploits the structure of that equation; see [1], [3], [5]. We are interested in a general hybrid system of interacting resources, which is a generalization of Eq. (1.1). The numerical approach allows us to approximate the optimal policy and the value function for a broad class of models. It should be noted that in the popular optimal harvesting formulations [3], [5], [6], [8], [9], the control cost is ignored; that is, the manager pays nothing for the actions taken. As a result, in some cases it is optimal for the manager to immediately harvest all the available resources, regardless of the fact that the manager might need to pay a lot for such a policy; see [3]. The analysis and numerical examples in [8] pointed out that when the manager decides to harvest (resp. renew), she should do so at the maximal possible harvesting (resp. renewing) rates. In this work, we go a step further and consider the control costs associated with harvesting and renewing activities. Whenever the manager harvests or renews part of the resource, she has to pay a certain cost, which is modeled as a real multivariate function of the harvesting–renewing rates, the current sizes of the resources, and the regime of the environment. Under this consideration, we observe in Section 5 that in certain cases the maximal possible rates are no longer optimal for harvesting and renewing. Another interesting point in our work is that the environment (the Markov chain α(t)) can only be observed with noise; that is, at any given instant, the exact state of residency of α(t) is unknown, and we only have access to a noise-corrupted observation of α(t). The Wonham filter is a promising tool to convert such a partially observed system into a completely observed one; see [16], [17], [18]. Focusing on a two-state hidden Markov chain, we will develop an effective way to approximate the system of resources under partial observation.
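For intuition, a minimal sketch of the two-state Wonham filter follows, assuming an observation model dY = b0(α)dt + σ0 dw0 in which Φ(t) approximates P(α(t)=1 | Y(s), s ≤ t); the generator rates, observation drifts, and noise intensity are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the Wonham filter for a two-state hidden Markov chain
# observed in Gaussian white noise, dY = b0(alpha) dt + sigma0 dw0, with
# Phi(t) approximating P(alpha(t) = 1 | Y(s), s <= t).  The jump rates,
# observation drifts b0, and sigma0 are illustrative assumptions.

rng = np.random.default_rng(1)

q12, q21 = 0.5, 0.8          # jump rates 1 -> 2 and 2 -> 1 (assumed)
b0 = {1: 1.0, 2: -1.0}       # observation drift in each regime (assumed)
sigma0 = 0.5                 # observation noise intensity (assumed)

def wonham(T=20.0, dt=1e-3, phi0=0.5):
    n = round(T / dt)
    a, phi, hits = 1, phi0, 0
    for _ in range(n):
        # evolve the hidden chain
        if rng.random() < (q12 if a == 1 else q21) * dt:
            a = 3 - a
        # noisy observation increment of the hidden state
        dY = b0[a] * dt + sigma0 * rng.normal(0.0, np.sqrt(dt))
        # filter update driven by the innovation dY - b0bar * dt
        b0bar = phi * b0[1] + (1.0 - phi) * b0[2]
        phi += (-q12 * phi + q21 * (1.0 - phi)) * dt \
            + phi * (1.0 - phi) * (b0[1] - b0[2]) / sigma0**2 * (dY - b0bar * dt)
        phi = min(max(phi, 0.0), 1.0)          # keep the estimate in [0, 1]
        hits += (phi > 0.5) == (a == 1)        # does round(phi) match alpha?
    return hits / n

acc = wonham()
print(acc)  # fraction of time the filter points to the true regime
```

Note that Φ(1−Φ)(b0(1)−b0(2)) equals Φ(b0(1)−b̄0(Φ)) with b̄0(Φ)=Φb0(1)+(1−Φ)b0(2), so this is the same diffusion coefficient that appears in the filter equation of Section 4.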

We emphasize that, to the best of our knowledge, results on exploitation under such complex considerations seem to be scarce to date. Because of the complexity, a closed-form solution is virtually impossible to obtain, so we develop numerical algorithms to approximate the value function and the optimal control. We adopt the Markov chain approximation methodology developed by Kushner and Dupuis [19], [20]. In contrast to the existing results, our new contributions in this paper are as follows. (i) We formulate a generalized exploitation model with a hidden Markov chain for interacting renewable resources, which offers a new angle in the study of optimal exploitation problems. (ii) We develop numerical approximation schemes based on the Markov chain approximation method to treat the proposed model. (iii) We explore new phenomena arising from the new features of our model by applying the numerical schemes to several stochastic models.

The rest of the work is organized as follows. Section 2 begins with the problem formulation. Section 3 presents the numerical algorithm based on the Markov chain approximation method. Section 4 focuses on the case in which the harvesting–renewing rate of each resource is proportional to the resource size. In Section 5, we present several examples. Finally, concluding remarks are given in the last section. To facilitate the reading, all proofs are placed in an appendix at the end of the paper.

Section snippets

Formulation

For i = 1,…,d, let Xi(t) be the size of the ith resource at time t and denote X(t) = (X1(t),…,Xd(t))′ ∈ ℝ^d (with z′ denoting the transpose of z ∈ ℝ^(d1×d2), d1, d2 ≥ 1). We assume that the growth of the resources is subject to random fluctuations and abrupt changes within a finite number of configurations of the environment, and we model these changes by a continuous-time Markov chain α(t) taking values in M. In this work, we focus on the two-state case M = {1,2}. Such a Markov chain α(t) is also known as a telegraph

Numerical algorithm

Following the Markov chain approximation method in [19], [20], we will construct a controlled Markov chain in discrete time to approximate the controlled diffusions. A careful treatment is required due to the combined presence of the control C(·) and the Wonham filter Φ(·).
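To make the construction concrete, the following sketch applies Kushner–Dupuis-style transition probabilities and value iteration to a one-dimensional, fully observed harvesting problem dX = [b(X) − c]dt + σ(X)dw with reward E∫e^(−δt) c(t)dt; the logistic drift, diffusion, grid sizes, and control set are assumptions, and the partially observed problem treated in the paper adds the filter Φ as a further state coordinate.

```python
import numpy as np

# Sketch of the Markov chain approximation method for a one-dimensional,
# fully observed harvesting problem.  The transition probabilities follow
# the standard Kushner-Dupuis construction; all coefficients and grids
# below are illustrative assumptions.

h, U = 0.1, 4.0                         # grid step and state upper bound
xs = np.arange(0.0, U + h / 2, h)       # states of the approximating chain
delta, cmax = 0.02, 0.5                 # discount rate and max harvest rate
controls = np.linspace(0.0, cmax, 11)   # discretized control set

def b(x):
    return x * (1.0 - x / 2.0)          # logistic drift (assumed)

def sig2(x):
    return (0.2 * x) ** 2               # squared diffusion coefficient

V = np.zeros(len(xs))                   # value iteration started from V = 0
for _ in range(1500):
    Vn = V.copy()
    for i in range(1, len(xs) - 1):
        x, best = xs[i], -np.inf
        for c in controls:
            drift = b(x) - c
            Qh = sig2(x) + h * abs(drift)                  # normalizer
            dt = h * h / Qh                                # interpolation step
            pu = (sig2(x) / 2 + h * max(drift, 0.0)) / Qh  # P(move up)
            pd = 1.0 - pu                                  # P(move down)
            val = c * dt + np.exp(-delta * dt) * (pu * V[i + 1] + pd * V[i - 1])
            best = max(best, val)
        Vn[i] = best
    Vn[-1] = Vn[-2]                     # reflection at the upper bound U
    V = Vn                              # state 0 is absorbing, V[0] = 0

print(float(V[len(xs) // 2]))           # approximate value at x = U / 2
```

The interpolation interval dt = h²/Qh shrinks with the grid step, which is what drives the weak convergence of the approximating chain to the controlled diffusion.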

The harvesting–renewing per-unit rates

In this section, we assume that the rate at which each resource can be harvested or renewed is proportional and positively related to its current size. Such a consideration is motivated by the observations in [3, Section 3] and [1]. The evolution of the combined process (X(·),Φ(·)) is given by X(t) = x + ∫₀ᵗ (b̄(X(s),Φ(s)) − D(s))ds + ∫₀ᵗ σ̄(X(s),Φ(s))dw(s), Φ(t) = ϕ + ∫₀ᵗ (q11Φ(s) + q21(1 − Φ(s)))ds + ∫₀ᵗ σ0⁻¹Φ(s)(b0(1) − b̄0(Φ(s)))dw0(s), where D(t) = (D1(t),…,Dd(t))′ and Di(t) = Ci(t)Xi(t) for i = 1,…,d. We assume that C(t)
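A one-dimensional simulation sketch of this combined, completely observed system with per-unit harvesting D(t) = cX(t) follows; the averaged coefficients, noise intensities, and the constant rate c are illustrative assumptions rather than the paper's model data.

```python
import numpy as np

# Sketch of the combined dynamics (X, Phi) with per-unit harvesting
# D(t) = c * X(t).  The averaged drift bbar mixes the regime-dependent
# drifts through the filter Phi, and Phi follows the Wonham filter
# equation driven by the innovation process.  All coefficients are
# illustrative assumptions.

rng = np.random.default_rng(2)

q11, q21 = -0.5, 0.8          # generator entries of the two-state chain
b0 = {1: 1.0, 2: -1.0}        # observation drifts (assumed)
sigma0 = 0.5                  # observation noise intensity (assumed)
r = {1: 1.0, 2: 0.4}          # regime-dependent growth rates (assumed)

def b(x, i):                  # regime-dependent logistic drift (assumed)
    return r[i] * x * (1.0 - x / 2.0)

def simulate(x0=0.5, phi0=0.5, c=0.1, T=10.0, dt=1e-3):
    x, phi = x0, phi0
    for _ in range(round(T / dt)):
        dw = rng.normal(0.0, np.sqrt(dt))       # resource noise
        dw0 = rng.normal(0.0, np.sqrt(dt))      # innovation increment
        bbar = phi * b(x, 1) + (1.0 - phi) * b(x, 2)
        sbar = 0.2 * x                          # averaged diffusion (assumed)
        b0bar = phi * b0[1] + (1.0 - phi) * b0[2]
        x = max(x + (bbar - c * x) * dt + sbar * dw, 0.0)
        # dPhi = (q11 Phi + q21 (1-Phi)) dt + sigma0^{-1} Phi (b0(1)-b0bar) dw0
        phi += (q11 * phi + q21 * (1.0 - phi)) * dt \
            + phi * (b0[1] - b0bar) / sigma0 * dw0
        phi = min(max(phi, 0.0), 1.0)
    return x, phi

x, phi = simulate()
print(x, phi)  # terminal resource size and filtered probability
```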

Examples

Throughout this section, we suppose the discount factor is δ = 0.02. Let U ∈ (0,∞) be an upper bound introduced for computational purposes. We will compute V^h(x,ϕ) for x ∈ [0,U]^d and ϕ ∈ [0,1]. In particular, we take U = 4. In our examples, since U = 4 is much higher than the carrying capacity of the environment (which can be thought of as the maximum population size of a species that can be sustained in that specific environment), the population size X(·) will rarely grow outside [0,U]^d. Thus, we implicitly

Conclusion

This paper focused on modeling and numerical methods for optimal harvesting–renewing policies in random environments. The novelties of our work include the following: (1) we formulated an optimal exploitation problem for hybrid stochastic systems of renewable resources with a hidden Markov chain and new features arising in practice; (2) we built a numerical approximation scheme based on Markov chain approximation techniques to solve the optimal control problem under partial observation. The convergence of the

CRediT authorship contribution statement

Ky Tran: Conceptualization, Methodology, Formal analysis, Investigation, Writing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The author is indebted to Professor George Yin for discussions on numerical methods for optimal control problems. The author thanks the two anonymous reviewers for their careful reading of the manuscript and their insightful comments and suggestions leading to much improvement.

References (29)

  • Alvarez, L.H.R., et al., Optimal harvesting of stochastically fluctuating populations, J. Math. Biol. (1998)

  • Hening, A., et al., Asymptotic harvesting of populations in random environments, J. Math. Biol. (2019)

  • Song, Q., et al., On optimal harvesting problems in random environments, SIAM J. Control Optim. (2011)

  • Hening, A., et al., Harvesting of interacting stochastic populations, J. Math. Biol. (2019)

    This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ICT Consilience Creative program (IITP-2020-2011-1-00783) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), Republic of Korea.
