Elsevier

Computer Communications

Volume 160, 1 July 2020, Pages 554-566
An optimal policy for joint compression and transmission control in delay-constrained energy harvesting IoT devices

https://doi.org/10.1016/j.comcom.2020.07.005

Abstract

Energy-efficient communication remains one of the key requirements of Internet of Things (IoT) platforms. Concerns over energy consumption can be mitigated by reducing the volume of data to be transmitted (e.g., via compression of sensing data) as well as by resorting to technological advancements (e.g., energy harvesting). However, these mitigating measures carry their own cost, namely the additional complexity of control and optimization in the digital communication chain. In particular, the compression ratio becomes another control knob that needs adjusting besides the usual transmission parameters. Also, given the random and sporadic nature of harvested energy, the goal shifts from mere energy conservation to judicious, foresighted consumption of the renewable energy. In this paper, we consider an energy-harvesting IoT device that is tasked with losslessly compressing and reporting delay-constrained sensing events to an IoT gateway over a time-varying wireless channel. We are interested in computing an optimal policy for joint compression and transmission control that adapts to the node’s energy availability, transmission buffer length, and wireless channel conditions. We cast the problem as a Constrained Markov Decision Process (CMDP) and propose a two-timescale model-free reinforcement learning (RL) algorithm that is able to learn the optimal control policy without statistical knowledge of the underlying system dynamics. Extensive simulation experiments are conducted to investigate the convergence of the learning algorithm, to explore the impact of different system parameters (such as the rate of sensing events, the energy arrival rate, and the battery capacity) on the performance of the proposed policy, and to compare it against some baseline schemes.

Introduction

The Internet of Things (IoT), as the main pillar of the fourth industrial revolution (Industry 4.0), prevails in today’s Internet infrastructure. With the widespread use of IoT devices equipped with small batteries, lifetime maximization through energy optimization policies has attracted much attention in recent years. Energy conservation is even more important in mission-critical applications, where the IoT device is expected to work for a long period of time without the need for battery or node replacement [1].

The energy management solutions in the literature are primarily geared towards lifetime maximization through effective use of the available energy capacity. In fact, due to the high cost of data transmission, in-network processing, involving operations such as filtering, compression, and fusion, is a technique widely used in conventional wireless sensor networks and today’s IoT systems for reducing the communication overhead [1], [2], [3], [4]. Among the in-network processing techniques, compression is of particular interest given that it can be performed independently at every forwarding node; the downstream transmission rate of most stream-oriented data can be reduced by applying appropriate compression algorithms, both lossless [5], [6] and lossy [7], [8], [9]. However, it has been shown that applying the maximum compression level is not an energy-efficient policy in most cases, as it incurs additional delay and increases the computational complexity (e.g., [10], [11]). Therefore, finding an efficient trade-off between compression and transmission is important to achieve higher energy savings without sacrificing much in terms of delay or other QoS criteria. Armed with this understanding, the problem of joint compression and transmission control (collectively referred to as “communication control”) has begun to receive attention due to its tunable complexity and delay costs [7], [9], [12], [13].

Another efficient way to improve the lifetime of IoT systems is through the use of recent technological advancements that have enabled energy harvesting capabilities for wireless nodes. Energy harvesting, as a promising complement to the conventional battery power of IoT devices, enables the sensor nodes to absorb energy from the environment, operating over a longer time horizon. In fact, exploiting the energy harvesting capability helps the IoT nodes to power themselves from theoretically unlimited energy sources that are present in their surrounding environment (e.g., in the form of solar, vibration, thermoelectricity, etc.).

Despite all the benefits associated with energy harvesting, the communication control problem becomes more complex in this context. This is due to the uncertainty associated with the battery charging process as the amount of harvested energy randomly changes over time. Also, by nature, wireless channel fading renders the link conditions stochastic and time-varying as well. To top it all off, one should also consider the uncertainty inherent in random sensory event generations. Hence, to account for the time-varying nature of these processes, a principled way to optimize the communication control policy of an IoT device is to capture these dynamics explicitly within a stochastic optimization framework with long-run objectives (e.g., expected cumulative energy consumption).

In this paper, we consider the problem of compressing and reporting sensory data packets for a single IoT device over a point-to-point link. The device is equipped with a rechargeable battery with energy harvesting capability. The random event data gathered from the environment passes through a lossless compression module before getting queued in a buffer for subsequent transmission. The wireless channel state, packet arrival quantities and the amount of energy harvested from the environment will vary randomly over time. We seek an efficient policy that adaptively tunes the lossless compression level as well as the transmission window (of packets) as the control knobs. Our objective is to minimize the average power (required for compressing and ‘reliably’ transmitting at a given rate), while satisfying a constraint on the average (buffering) delay of the event data.
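
The objective stated above can be written compactly as a constrained stochastic program. The following is only a sketch in notation assumed here, not the paper's own formulation: $\pi$ denotes a control policy, $s_t$ and $a_t$ the system state and control action in slot $t$, $P(s_t, a_t)$ the power spent on compression and transmission, $q_t$ the buffer backlog, and $\bar{D}$ a bound on the average backlog. The average backlog is used as a stand-in for the average buffering delay, since by Little's law the two are proportional for a given arrival rate.

```latex
\begin{aligned}
\min_{\pi}\quad & \limsup_{T\to\infty}\frac{1}{T}\,
  \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{T-1} P(s_t, a_t)\right] \\
\text{s.t.}\quad & \limsup_{T\to\infty}\frac{1}{T}\,
  \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{T-1} q_t\right] \le \bar{D}
\end{aligned}
```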

In order to better highlight our contributions, we first give a brief account of the most relevant previous works, identify the research gap, and state our motivations.

A pioneering work that raised the issue of the tradeoff between the energy costs of transmission and compression is [10]. There, the authors have argued that while the computation energy of data compression is negligible for simple applications (e.g., temperature sensing), for advanced applications with heavy data flow, including structural health monitoring, video surveillance, and image-based tracking, compression of complex data sets is envisioned to cost energy comparable to that of wireless communication. Hence, blindly applying maximum compression may lead to extra energy cost compared to transmitting the raw data.

Motivated by the above observation, several research works have investigated the concept of tunable compression, in which the computational complexity of lossless data compression is tuned according to the energy availability (e.g., [5], [6], [14]). Such a concept is also well supported by the reality of popular compression algorithms; for example, the gzip program supports up to ten compression levels, with a larger compression ratio resulting in longer compression time and hence higher energy cost [15], [16].
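
The level-versus-cost tradeoff can be observed with a few lines of Python. This is a minimal sketch, not an experiment from the paper: it uses Python's standard `zlib` module (an implementation of the DEFLATE algorithm that gzip also uses, accepting levels 0 to 9 with 0 meaning no compression), a synthetic payload invented for illustration, and wall-clock time as a rough proxy for computation energy, which is an assumption.

```python
import time
import zlib


def compression_profile(payload: bytes, levels=range(1, 10)):
    """Return a list of (level, compressed_size_bytes, seconds) tuples."""
    profile = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        # Lossless: decompression must reproduce the original payload exactly.
        assert zlib.decompress(compressed) == payload
        profile.append((level, len(compressed), elapsed))
    return profile


# Hypothetical sensor log, made up purely for illustration.
payload = b"".join(
    b"temp=%d;hum=%d\n" % (20 + i % 7, 40 + i % 11) for i in range(20000)
)

for level, size, seconds in compression_profile(payload):
    ratio = len(payload) / size
    print(f"level {level}: {size} bytes, ratio {ratio:.1f}, {seconds * 1e3:.2f} ms")
```

The exact numbers depend on the data and the machine, so none are quoted here; the point is only that the compression level is a knob trading computation against the number of bytes left to transmit.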

Based on the availability of information at the transmitter, the body of literature on the joint optimization of compression/transmission in IoT/sensor devices can be categorized into three distinct groups: offline, online and model-free schemes.

In offline optimization, it is basically assumed that exact information about the time and amount of data and energy arrivals, as well as the wireless channel state, is available acausally (i.e., before decision making) at the sender side. An online optimization framework, on the other hand, utilizes only statistical information about the data, energy arrival and wireless channel processes, together with causal information about their realizations. However, in many real-world scenarios, it is not possible to obtain exact or even statistical information about the non-deterministic, time-varying processes of data gathering, energy harvesting, or channel fading. Hence, both online and offline optimization approaches fail to achieve optimal energy consumption policies. In these scenarios, a model-free approach can instead be adopted to learn an adaptive communication policy through real-time interactions and experiences with the environment.

  • Within the category of offline optimization schemes, Tavli et al. [5] have come up with a linear programming (LP) formulation for joint dynamic data compression and flow balancing to maximize the lifetime of wireless sensor networks (WSNs). In [14], the LP formulation of [5] has been extended to jointly consider dynamic compression along with the stealth mode of operation for the sensors. In the context of wireless multimedia sensor networks, the work in [12] has exploited convex optimization theory to derive the optimal tradeoff between transmission and compression with the objective of network lifetime maximization under delay quality-of-service constraints. In [7], a multi-objective optimization problem has been formulated to select a rate–distortion working point for lossy compression at each source node, while jointly determining a routing path for the compressed information, under distortion and capacity constraints. The study in [17] has exploited the network utility maximization (NUM) framework [18] to jointly optimize the source data rate, the degree of stream compression, and the location of fusion operators. A work more closely related to ours is [19], which considers the joint use of data compression and wireless transmission speed control for wearable devices to minimize the total energy consumption while satisfying a deadline constraint. However, being an offline scheme, the solution given in [19] assumes an unrealistic setting where future channel gains are known in advance.

  • As for online schemes, in [20], a solar-powered WSN is envisaged, and a simple threshold-based scheme is presented to decide whether there is any surplus energy. Nodes with residual energy below a certain threshold compress their data before transfer in order to reduce energy consumption, whereas nodes with residual energy above the threshold (which means there is surplus energy) transfer data without compression, using the surplus energy to reduce the delay between nodes. The problem of online tradeoff between compression and transmission energy consumption is addressed more systematically in [21]. In particular, a formulation based on the formalism of the Markov decision process (MDP) [22] is given in [21] for an energy harvesting WSN, with the objective of minimizing the long-run average distortion of the compressed data under energy variations. Another online scheme for joint data compression and wireless transmission speed control has been proposed in [19]. While the authors do not assume energy harvesting capability for the nodes, the future channel gains are assumed to be unknown and to change stochastically. They show that their online algorithm, despite not knowing future channel conditions, closely approximates the performance of the offline optimum. Finally, Castiglione et al. have proposed an energy management policy in [23] for energy harvesting WSNs, in which the energy management unit allocates energy to different units based on the statistics of energy harvesting, sensed data quality, signal-to-noise ratio and the size of the data queue. In this study, the purpose of lossy compression is to strike an optimal balance among the level of data distortion, the stability of the queue and the transmission delay. The proposed policy is adaptive to the energy status, the channel, and the flow of data generation.

  • Model-free methods can further be divided into two subcategories: those based on Lyapunov optimization [24] and those that utilize learning-theoretic schemes, e.g., reinforcement learning (RL) techniques [25]. Probably one of the first studies to utilize the Lyapunov optimization framework to address the dynamic compression problem is [6]. The setting considered in [6] consists of a multi-sensor device (with a conventional energy source) sending losslessly compressed data over a point-to-point fading channel. The baseline scheme in [6] has then been extended to a lossy compression scenario with distortion constraints in [8], and also to multi-hop communications in [26]. More recently, by adopting the Lyapunov framework, a joint compression–transmission approach has been proposed in [27] for wearable devices, which not only minimizes the energy consumption of both compression and transmission, but also keeps the corresponding data distortion and transmission delay within a tolerable level. The proposed scheme in [27], however, does not assume harvesting capability for the devices. Finally, there is the work by Tapparello et al. in [28] for sensor nodes capable of energy harvesting. They have come up with a dynamic compression/transmission strategy that is adaptive to the harvested energy, channel state, data queue, battery, and the statistical model of source correlation. While Lyapunov optimization does not rely on statistical knowledge of the system’s stochastic dynamics to compute the optimal communication policy, it is only applicable to settings with more restrictive stochastic assumptions. For example, standard Lyapunov schemes work by minimizing an instantaneous Lyapunov drift in each slot, while basically assuming that the underlying processes behind the channel variations, energy charging, and event generation are i.i.d. Moreover, the policy derived from Lyapunov drift and Lyapunov optimization theory (e.g., the dynamic backpressure algorithm) may not have good delay performance in moderate and light traffic loading regimes. It only allows potentially simple solutions with throughput optimality, which is a weak form of delay performance [29]. A more general approach, capable of handling delay-constrained communication control and of dealing with other types of stochastic processes, is offered by Markov decision processes (MDPs) [22] and RL [25]. The work in [30] adopts the constrained Markov decision process (CMDP) formalism [31] to address lossy data compression for wireless transmission over fading channels in the presence of a stochastic energy input process and a replenishable energy buffer. An RL-based algorithm has then been designed to derive an optimal compression policy through a Lagrangian relaxation approach combined with a dichotomic search for the Lagrangian multiplier. The authors only consider compression control; there is no data buffer and, hence, no guarantee on average delay performance. In fact, the objective function in [30] concerns data fidelity with a constraint on average energy consumption. A more mature RL-based scheme is [9], which tackles the problem of joint compression, channel coding and retransmission for an energy-harvesting-based multi-sensor monitoring system. The compression scheme considered in [9] is lossy, and the aim is to minimize the long-term average distortion at the receiver. The only control knob considered in [9] is power, and again, there is no guarantee on the average delay performance.

According to our review of the related work in Section 1.2, the problem addressed in this paper is novel in the following respects:

  • The majority of the studies on joint optimization of compression/transmission in WSN or IoT systems lie within the offline optimization framework (e.g., [2], [5], [7], [12], [14], [17], and [19]), where it is unrealistically assumed that non-causal information regarding the exact trace of system states (i.e., channel, data, energy, etc.) is available beforehand. Also, there exist many online optimization schemes which more realistically assume that system states realize at run-time, but still, they require an explicit knowledge of the statistics of the system processes [20], [21], and [23]. Our work in this paper differs from both these lines of work, as we approach the communication control problem from a model-free perspective.

  • Compared to previous model-free schemes, on the one hand, we have those based on Lyapunov optimization (e.g., [6], [8], [26], [27], and [28]). Based on our review of these works, there are at least two main points that form our motivation. First, there is no prior work that addresses a setting featuring lossless compression, energy harvesting, and delay-constrained communication. Second, as mentioned earlier, standard Lyapunov schemes are not suitable for more realistic stochastic dynamics with temporal dependency in the correlated fading channel conditions, energy arrivals, and sensory event generation. This warrants a formulation based on the more general formalism of MDPs. On the other hand, we have the RL-based methods in [30] and [9]. Neither of these two schemes has addressed delay-constrained communication and lossless compression. In this paper, we consider the more realistic case of random event generation, where event packets arrive randomly at different times. In such cases, the data arrival process also contributes to the dynamics of the system, which calls for a control policy adaptive to the data queue state for handling the time-varying queueing delay experienced by the packets. Under these dynamics, the system’s objective also needs to be constrained by an expected delay performance.

Our contributions in this paper can be summarized as follows:

  • We use the CMDP formalism [31] to formulate the joint lossless compression and transmission optimization problem in an energy harvesting IoT (sensing) device. Our formulation accounts for the time-varying channel, random event generation as well as the stochastic energy arrivals. Our objective is to minimize the discounted sum of energy consumption, subject to a specified delay constraint on the average data buffer waiting time. To handle the constraint on average delay performance, we apply the Lagrangian technique to express the Bellman equations underlying the CMDP problem in a standard unconstrained form. This effectively paves the way to solve the original constrained problem by solving instead two (coupled) optimization problems, one in the space of control policies and the other in the space of Lagrange multipliers.

  • We propose a model-free reinforcement learning algorithm that can compute the optimal solution pair (i.e., the control policy and Lagrange multiplier) without statistical knowledge of the system, relying instead only on the immediate feedback acquired through real-time interactions with the operating environment. The proposed learning algorithm consists of two iterative procedures: Q-learning [32] for computing the control policy, and stochastic subgradient ascent for computing the optimal Lagrange multiplier. Although these two problems are coupled by definition, we utilize the technique of timescale separation from stochastic approximation theory [33] to allow for their simultaneous updating with no risk of divergence.

  • We conduct numerical analyses and experimental studies to evaluate our proposed algorithm in terms of its convergence properties, and its behavior under different intensities for event generation as well as varying energy charging rates, and energy buffer sizes. We also show that the learning algorithm achieves significant energy savings compared to baseline schemes which only optimize either the transmission or the compression blocks.
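
The two-timescale idea in the contributions above can be illustrated with a toy example. What follows is a minimal sketch under stated assumptions, not the paper's algorithm or system model: the state (backlog, battery level, channel), the action set, the energy costs, the arrival distributions, the step-size schedules and every name (`step`, `learn`, `greedy_action`, `DELAY_TARGET`, and so on) are hypothetical choices made for illustration. The Q-table is updated on a fast timescale for the Lagrangian cost, while the multiplier `lam` is updated by projected stochastic subgradient ascent on a slower one.

```python
import random
from collections import defaultdict

Q_MAX, B_MAX = 5, 5                      # buffer and battery capacities (hypothetical)
ACTIONS = [(c, w) for c in (0, 1) for w in (0, 1, 2)]  # (compress?, packets to send)
GAMMA, DELAY_TARGET = 0.95, 1.5          # discount factor, bound on average backlog


def step(state, action, rng):
    """Toy dynamics standing in for the real device (an assumption)."""
    q, b, h = state                      # backlog, battery level, channel (1 = good)
    c, w = action
    w = min(w, q)                        # cannot send more than is queued
    energy = c + w * (1 if h == 1 else 2)  # bad channel doubles the cost per packet
    if energy > b:                       # not enough stored energy: stay idle
        c, w, energy = 0, 0, 0
    raw = rng.choice((0, 1, 2))          # packets produced by a random sensing event
    arrivals = (raw + 1) // 2 if c == 1 else raw  # compression shrinks new arrivals
    q_next = min(q - w + arrivals, Q_MAX)
    b_next = min(b - energy + rng.choice((0, 1, 2)), B_MAX)  # harvested energy
    h_next = h if rng.random() < 0.8 else 1 - h   # temporally correlated channel
    return (q_next, b_next, h_next), energy, q_next


def greedy_action(Q, state):
    """Action with the smallest learned Lagrangian cost-to-go."""
    return min(ACTIONS, key=lambda a: Q[(state, a)])


def learn(num_steps=50_000, seed=0, epsilon=0.1):
    rng = random.Random(seed)
    Q = defaultdict(float)               # Q[(state, action)], initialised to 0
    lam = 0.0                            # Lagrange multiplier for the delay constraint
    state = (0, B_MAX, 1)
    for n in range(1, num_steps + 1):
        alpha = 1.0 / n ** 0.6           # fast timescale: Q-learning step size
        beta = 1.0 / n                   # slow timescale: beta / alpha -> 0
        if rng.random() < epsilon:       # epsilon-greedy exploration
            action = rng.choice(ACTIONS)
        else:
            action = greedy_action(Q, state)
        nxt, energy, backlog = step(state, action, rng)
        # Lagrangian cost: energy plus the priced violation of the backlog bound.
        cost = energy + lam * (backlog - DELAY_TARGET)
        target = cost + GAMMA * min(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        # Projected subgradient ascent keeps the multiplier non-negative.
        lam = max(0.0, lam + beta * (backlog - DELAY_TARGET))
        state = nxt
    return Q, lam


Q, lam = learn()
print("multiplier:", lam, "action in state (3, 4, 1):", greedy_action(Q, (3, 4, 1)))
```

Because `beta` decays faster than `alpha`, the Q-table sees an almost constant multiplier while the multiplier sees an almost converged Q-table, which is the intuition behind updating both simultaneously. A single global step-size schedule is used here for brevity; per-state-action schedules are a common alternative.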

The remainder of the paper is organized as follows: in Section 2, the system model and the assumptions are presented. In Section 3, we formulate the problem as a CMDP. Section 4 proposes a two-timescale reinforcement learning algorithm to find the optimal control policy. In Section 5, we evaluate the performance of the proposed scheme. The paper concludes in Section 6, where we also highlight some directions for future work.

Section snippets

System model and assumptions

In this section, we first introduce the system model for a delay-constrained energy harvesting IoT device. Then, we elaborate on the assumptions made about the dynamics of the energy charging process, sensory data generation, and the variations of the wireless channel state.

Joint compression and transmission control problem

Given the system model in Section 2, the goal is to design a joint compression and transmission control policy that minimizes the expected cumulative power expenditure of the device while satisfying a certain time average latency constraint for the sensory events queued in the transmission buffer. The control policy should be adaptive to the dynamic states of the system, i.e., to the volume of the sensory data (DSI), the wireless channel quality (CSI), the number of energy packets harvested

Learning the optimal control policy

In practical scenarios, it is often difficult or even impossible to obtain reliable statistical information about the underlying stochastic processes governing the system dynamics; e.g. the average intensity of the event flow, energy harvesting, channel quality variations, etc. In these situations, we cannot adopt a model-based scheme (e.g., standard value iteration [42] or policy iteration [43]) for solving the problem in (18), since such a scheme relies on the knowledge of the system’s

Performance evaluation and results

In this section, we implement our proposed algorithm in a simulation environment where we simulate the point-to-point scenario depicted in Fig. 1. We first explain the simulation setup including the experiment settings and simulation parameters in Section 5.1. Next, in Section 5.2, as the first set of experiments, we investigate the convergence properties of the learning algorithm. In Section 5.3, we present the simulation results comparing the performance of the proposed algorithm against some

Conclusion and outlook

In this paper, the problem of jointly controlling the compression level and the transmission rate was studied for an IoT node equipped with a renewable energy source. Given the uncertainties posed by the randomness of the arrival processes of energy and sensory events, as well as the variations in channel qualities, the problem was formulated as a Constrained Markov Decision Process with the objective of minimizing the long-term average energy consumption, while satisfying an average delay

CRediT authorship contribution statement

Vesal Hakami: Conceptualization, Formal analysis, Project administration, Supervision, Validation, Writing - review & editing. Seyedakbar Mostafavi: Data curation, Investigation, Software, Visualization, Writing - review & editing. Nastooh Taheri Javan: Data curation, Visualization. Zahra Rashidi: Investigation, Writing - original draft, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

References (54)

  • Incebacak D. et al., Optimal data compression for lifetime maximization in wireless sensor networks operating in stealth mode, Ad Hoc Netw. (2015)
  • Borkar V.S., Stochastic approximation with two-time scales, Systems Control Lett. (1997)
  • Borkar V.S., An actor-critic algorithm for constrained Markov decision processes, Systems Control Lett. (2005)
  • C. Pielli, A. Biason, A. Zanella, M. Zorzi, Joint optimization of energy efficiency and data compression in TDMA-based...
  • Liu H.-S. et al., Data compression for energy efficient communication on ubiquitous sensor networks, Tamkang J. Sci. Eng. (2011)
  • Vecchio M. et al., Adaptive lossless entropy compressors for tiny IoT devices, IEEE Trans. Wireless Commun. (2014)
  • . Ukil, S. Bandyopadhyay, A. Pal, IoT data compression: sensor-agnostic approach, in: Proceedings of the Data...
  • Tavli B. et al., Optimal data compression and forwarding in wireless sensor networks, IEEE Commun. Lett. (2010)
  • M.J. Neely, Dynamic data compression for wireless transmission over a fading channel, in: Proceedings of the 42nd...
  • M. Centenaro, M. Rossi, M. Zorzi, Joint optimization of lossy compression and transport in wireless sensor networks,...
  • Neely M.J. et al., Dynamic data compression with distortion constraints for wireless transmission over a fading channel (2008)
  • Pielli C. et al., Joint compression, channel coding, and retransmission for data fidelity with energy harvesting, IEEE Trans. Commun. (2018)
  • Yu Y. et al., Data gathering with tunable compression in sensor networks, IEEE Trans. Parallel Distrib. Syst. (2008)
  • M. Tahir, R. Farrell, Optimal communication-computation tradeoff for wireless multimedia sensor network lifetime...
  • Hu W. et al., Toward joint compression–transmission optimization for green wearable devices: An energy-delay tradeoff, IEEE Internet Things J. (2017)
  • Barr K.C. et al., Energy-aware lossless data compression, ACM Trans. Comput. Syst. (2006)
  • R. Sharma, A data compression application for wireless sensor networks using LTC algorithm, in: Proceedings of the IEEE...
  • Eswaran S. et al., Adaptive in-network processing for bandwidth and energy constrained mission-oriented multihop wireless networks, IEEE Trans. Mob. Comput. (2012)
  • Chiang M. et al., Layering as optimization decomposition: A mathematical theory of network architectures, Proc. IEEE (2007)
  • Zhang W. et al., Energy optimal wireless data transmission for wearable devices: A compression approach, IEEE Trans. Veh. Technol. (2018)
  • Kang Min Jae, Energy-aware determination of compression for low latency in solar-powered wireless sensor networks, Int. J. Distrib. Sens. Netw. (2017)
  • Mohamed M.I. et al., Adaptive data compression for energy harvesting wireless sensor nodes
  • Puterman M.L., Markov Decision Processes: Discrete Stochastic Dynamic Programming (2014)
  • Castiglione P. et al., Energy management policies for energy-neutral source-channel coding, IEEE Trans. Commun. (2012)
  • Neely Michael, Stochastic Network Optimization with Application to Communication and Queueing Systems (2010)
  • Sutton R.S. et al., Reinforcement Learning: An Introduction (2018)