Battery charge scheduling in long-life autonomous mobile robots via multi-objective decision making under uncertainty

doi:10.1016/j.robot.2020.103629

Robotics and Autonomous Systems

Volume 133, November 2020, 103629

https://doi.org/10.1016/j.robot.2020.103629

Highlights

• A novel approach for scheduling the battery charging of a mobile service robot.
• Reasoning over learnt predictive models of task occurrence allows the robot to manage its battery effectively.
• Extensive evaluation, including on two datasets obtained from a real-life long-term deployment.

Abstract

The daily working hours of mobile robots are limited primarily by battery life. Most systems use a combination of thresholds and fixed periods to decide when to charge. This produces charging behaviour that ignores high-value tasks that must be performed within time-windows or by deadlines. Instead, the robot should schedule charging adaptively, taking into account the times of day when it is expected to be given more valuable tasks to perform. This paper proposes an approach that exploits the fact that, during long-term deployments, the robot can learn when valuable tasks are most likely to be added to the system, enabling it to schedule charging at times that are expected to be less busy. We pose the problem of scheduling battery charging as a multi-objective sequential decision-making problem over a time-dependent Markov decision process model of expected task rewards and battery dynamics. We evaluate the scalability and solution quality of our multi-objective scheduler, and compare it with a typical rule-based approach. Empirical results show that our approach enables more flexible and efficient robot behaviour, which takes into account both the value of currently available tasks and the predicted value of future tasks when deciding whether to charge at a given time.

Introduction

Autonomous robots are being deployed for increasingly long durations in labs, homes, and offices, for tasks ranging from vacuuming to security [1], [2], [3]. These robots perform tasks throughout the day, many of which must be performed within time-windows. Since the robots are typically battery powered, they have limited operation time and must recharge regularly. For many robots, this can be achieved autonomously by having the robot dock at a charging station. Despite the importance of battery management for long-lived mobile robots, limited attention has been given to the problem of deciding when to charge. Current methods typically send the robot to charge when its battery is low, or designate a period of the day within which it always charges. In either case, the generated behaviour can be inflexible and inefficient, because it does not take into account task-related knowledge, such as the value associated with the tasks currently available to the robot and predictions of future tasks. In this paper, we propose an approach for scheduling battery charging that takes this task-related knowledge into account.

This work is inspired by our long-term deployments of mobile service robots in office environments [1]. During operation, the robot would execute tasks requested both by human users and by curiosity-driven components of the system, which added observation tasks to the robot’s schedule in order to improve its internal models. Every task was associated with a time-window within which execution should happen, and a reward value representing the benefit of completing it. For example, being at reception at 9 am to greet an important visitor was considered more important than being at the printer at 2 pm to observe human activities. Furthermore, while some of these tasks were scheduled beforehand, many were requested on the fly by users. Under these circumstances, the future schedule of the robot is not deterministic, giving rise to a need for models that encode the uncertainty associated with future tasks. We thus formulate the battery charge scheduling problem as a finite-horizon Markov decision process (MDP) that encodes the dynamics of the battery and available tasks, where the rewards available for executing tasks evolve according to a time-dependent transition function. These models are learnt from data obtained from robot experience, and are continuously maintained and updated. This provides greater accuracy as the robot learns the dynamics of a specific environment.

In order to increase the lifetime of modern batteries, manufacturers advise reducing the time spent with low levels of charge. Furthermore, if the battery discharges totally, then human intervention is required to manually move the robot to the charging station. Thus, we specify a multi-objective problem, where the goal is to find policies that trade off the amount of time the robot spends under a user-defined battery level threshold, against the expected cumulative reward from task execution. The MDP model is formulated and solved using the probabilistic model checker PRISM [4], utilising its implementation of Pareto front approximation for multi-objective problems [5]. In order to achieve online execution, and allow the system to react to the arrival of new tasks, we propose a receding horizon control (RHC) execution of the Pareto-optimal policies.
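The trade-off between the two objectives can be illustrated with a small dominance filter. This is only a sketch of what a Pareto front is, not the approximation algorithm implemented in PRISM; the function names and the candidate values below are hypothetical and chosen purely for illustration. Each point pairs an expected cumulative reward (to be maximised) with an expected time spent below the battery threshold (to be minimised).

```python
from typing import List, Tuple

# (expected cumulative reward, expected timesteps below the battery threshold)
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    reward_a, low_a = a
    reward_b, low_b = b
    no_worse = reward_a >= reward_b and low_a <= low_b
    strictly_better = reward_a > reward_b or low_a < low_b
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical values for four candidate policies (illustrative numbers only).
candidates = [(120.0, 6.0), (100.0, 2.0), (90.0, 3.0), (60.0, 0.0)]
front = pareto_front(candidates)
print(front)
```

In this toy example the policy with value (90.0, 3.0) is dropped, because (100.0, 2.0) gathers more reward while spending less time below the threshold. The remaining points represent genuinely different trade-offs, and choosing among them is a preference decision rather than an optimisation one.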

The battery scheduler proposed in this work decides whether the robot should charge or perform tasks, assuming that the robot is able to successfully complete all tasks (and thus gather all rewards) available for a specific time-window. Scheduling and execution of tasks is handled by a separate system [6], [7]. This separation of concerns allows for an approach that can reason about long windows of time when deciding whether to charge. We evaluate the scheduling system using real-life deployment schedules with a scheduling time-window of up to 24 h and show that the system outperforms different rule-based systems.

In summary, the main contribution of this paper is the use of techniques from multi-objective decision making under uncertainty to develop a battery charge scheduler that adapts the schedule to new tasks as they arrive; charges the robot during less busy periods; and anticipates future busy periods, ensuring enough charge is built up to cope with them.

This paper is an extended version of [8]. Here, we provide a more thorough coverage of related work, improve the structure of the presentation of the framework, elaborate on our policy execution approach, and provide a broader empirical evaluation, covering more datasets and evaluating the scalability of the method.

Section snippets

Related work

Research in battery management usually focuses on conservation of energy or effective use of available power. Of particular interest in this context is dynamic power management (DPM), where one dynamically adjusts the battery consumption requirements depending on the needs of the tasks being executed. An early example of this in the context of robotics is [9], which proposes a DPM approach for a mobile robot using deterministic linear models of power consumption. Outside of robotics, DPM has

Markov models

We model the stochastic dynamics of the battery charge scheduling problem as a finite-horizon Markov decision process (MDP). We start by introducing discrete-time Markov chains (DTMCs), which we use to model the battery dynamics.

Definition 1 (DTMC)

A DTMC is a tuple $C = \langle S, T_{C} \rangle$, where:

• $S$ is a finite set of states; and
• $T_{C} : S \times S \to [0, 1]$ is a probabilistic transition function, where $\sum_{s' \in S} T_{C}(s, s') = 1$ for all $s \in S$.

$T_{C}(s, s')$ represents the probability of moving to state $s'$ given that we were in state $s$ in the previous timestep.
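As a minimal sketch of this definition, a DTMC can be stored as a nested mapping from each state to a distribution over successor states. The representation, the function names, and the three-level battery chain below are assumptions made for illustration; the transition probabilities are made up and are not taken from the paper's learnt models.

```python
import random
from typing import Dict

# Transition function T_C as a nested mapping: T[s][s_next] = probability.
DTMC = Dict[str, Dict[str, float]]


def is_valid_dtmc(T: DTMC, tol: float = 1e-9) -> bool:
    """Check that every row is a probability distribution over the states in S."""
    states = set(T)
    for row in T.values():
        if not set(row) <= states:
            return False  # a successor is not a declared state
        if any(p < 0.0 or p > 1.0 for p in row.values()):
            return False  # probabilities must lie in [0, 1]
        if abs(sum(row.values()) - 1.0) > tol:
            return False  # each row must sum to 1
    return True


def step(T: DTMC, s: str, rng: random.Random) -> str:
    """Sample the next state s' with probability T[s][s']."""
    successors = list(T[s])
    weights = [T[s][s_next] for s_next in successors]
    return rng.choices(successors, weights=weights, k=1)[0]


# Hypothetical three-level battery chain while discharging (numbers are made up).
battery = {
    "high": {"high": 0.7, "mid": 0.3},
    "mid": {"mid": 0.6, "low": 0.4},
    "low": {"low": 1.0},
}
print(is_valid_dtmc(battery))
```

The validity check mirrors the two conditions in Definition 1: every entry lies in $[0, 1]$ and every row sums to one.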

Underlying stochastic processes

We model the stochastic dynamics of the two processes which influence whether to charge or execute tasks at a specific point in time: the battery level, and the predicted reward for future task execution. The values of these features will be part of the state representation of the MDP representing the battery charge scheduling problem. We propose discretised models for these processes, thereby providing a balance between accuracy and the resulting model size.
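One way to picture the discretisation is as a pair of mapping functions, one per process. The sketch below is an assumption-laden illustration: the number of battery levels, the equal-width binning, and the cluster centres are all hypothetical, and the excerpt does not say how the reward clusters are obtained. Only the idea of assigning an observed reward to its closest cluster comes from the text.

```python
from typing import Sequence


def discretise_battery(percent: float, n_levels: int = 10) -> int:
    """Map a battery percentage in [0, 100] to one of n_levels equal-width bins."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("battery percentage must lie in [0, 100]")
    # The top of the range (100%) is folded into the highest bin.
    return min(int(percent * n_levels / 100.0), n_levels - 1)


def nearest_cluster(reward: float, centres: Sequence[float]) -> float:
    """Return the cluster centre closest to the observed reward."""
    return min(centres, key=lambda c: abs(c - reward))


# Hypothetical cluster centres for the reward available in one timestep.
centres = [0.0, 50.0, 200.0]
print(discretise_battery(73.0), nearest_cluster(42.0, centres))
```

Fewer levels and clusters give a smaller model at the cost of coarser predictions, which is the balance between accuracy and model size mentioned above.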

We assume a fixed-length timestep $δ$

MDP model of battery charge scheduling

We model the battery charge scheduling problem as a finite-horizon MDP.

Definition 8 (Charge Scheduling MDP)

A charge scheduling MDP is a tuple $M_{sched} = \langle S_{sched}, A_{sched}, H, T_{sched} \rangle$, where:

• $S_{sched} = S_{B} \times \tilde{R} \times \{0, 1\} \times \{0, \dots, H\}$, where a state $(b, \widetilde{rew}, c, t)$ is such that $b$ represents the current battery level, $\widetilde{rew}$ represents the reward cluster closest to the reward observed, $c = 1$ if and only if the robot is charging, and $t$ is the current timestep.
• $A_{sched} = \{gr, gc, sc\}$, i.e., at each timestep the robot has three available actions: the gather reward action $gr$
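The product structure of the state space can be made concrete with a short enumeration. This is a sketch under stated assumptions: the battery levels, reward clusters, and horizon are hypothetical values, the function name is invented for illustration, and the action labels are listed only by name, without modelling their effects or the transition function $T_{sched}$.

```python
from itertools import product
from typing import Iterable, List, Tuple

# A state (b, rew, c, t): battery level, reward cluster, charging flag, timestep.
State = Tuple[int, float, int, int]

# Action labels exactly as named in the definition; their effects are not modelled here.
ACTIONS = ("gr", "gc", "sc")


def build_state_space(
    battery_levels: Iterable[int],
    reward_clusters: Iterable[float],
    horizon: int,
) -> List[State]:
    """Enumerate S_B x R~ x {0, 1} x {0, ..., H} as a list of tuples."""
    return list(product(battery_levels, reward_clusters, (0, 1), range(horizon + 1)))


# Hypothetical sizes: 11 battery levels, 3 reward clusters, horizon H = 4.
states = build_state_space(range(0, 101, 10), [0.0, 50.0, 200.0], horizon=4)
print(len(states))  # 11 * 3 * 2 * 5 = 330
```

The count grows as $|S_{B}| \cdot |\tilde{R}| \cdot 2 \cdot (H + 1)$, so the granularity of the discretisation and the length of the horizon directly determine the model size. A solver need only consider the states reachable from the initial one, which may be far fewer than this upper bound.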

Experimental study

We now present an empirical evaluation of our approach. Through the following experiments, we demonstrate the scalability and robustness of the battery scheduler. The implementation of the battery charge scheduler, along with the datasets used for the experiments, can be found at https://github.com/milanmt/Battery-Scheduler.

Conclusion

We presented an approach for scheduling battery charging for a mobile robot. Our approach exploits a model which predicts the distribution of rewards available from task execution in the future. It uses this model to build policies that ensure highly valued tasks can be executed whilst keeping the battery level above a safety threshold. Alongside the already discussed avenues for improvement, future work will focus on developing an approach to integrate the part of the schedule already known at

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Milan Tomy received her M.Sc. in Robotics in 2017, from the University of Birmingham. Her research interests lie in planning and decision making under uncertainty. She is motivated by the application of this research in simplifying rather than eliminating human effort. She has been working as a machine learning engineer aiming to improve the operations of manufacturing industries.

References (29)

Hawes, N. et al. The STRANDS project: Long-term autonomy in everyday environments. IEEE Robot. Autom. Mag. (2017)
Veloso, M.M., Biswas, J., Coltin, B., Rosenthal, S. CoBots: Robust symbiotic autonomous mobile service robots, in:...
Kunze, L. et al. Artificial intelligence for long-term robot autonomy: A survey. IEEE Robot. Autom. Lett. (2018)
Kwiatkowska, M. et al. PRISM 4.0: Verification of probabilistic real-time systems
Forejt, V. et al. Pareto curves for probabilistic model checking
Lacerda, B. et al. Probabilistic planning with formal performance guarantees for mobile service robots. Int. J. Robot. Res. (2019)
Mudrova, L. et al. An integrated control framework for long-term autonomy in mobile service robots
Tomy, M. et al. Battery charge scheduling in long-life autonomous mobile robots
Mei, Y. et al. A case study of mobile robot’s energy consumption and conservation techniques
Rong, P. et al. Battery-aware power management based on Markovian decision processes. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. (2006)
Bacci, G. et al. Optimal and robust controller synthesis: Using energy timed automata with uncertainty
Gerasimou, S. et al. Efficient runtime quantitative verification using caching, lookahead, and nearly-optimal reconfiguration
Lahijanian, M. et al. Resource-performance tradeoff analysis for mobile robots. IEEE Robot. Autom. Lett. (2018)
Zhao, X. et al. Towards integrating formal verification of autonomous robots with battery prognostics and health management


Bruno Lacerda received his Ph.D. in Electrical and Computing Engineering from the Instituto Superior Técnico, University of Lisbon, Portugal, in 2013. Between 2013 and 2017, he was a Research Fellow at the School of Computer Science, University of Birmingham, UK. Currently, he is a Senior Researcher at the Oxford Robotics Institute, University of Oxford, UK. His research interests lie in the use of formal approaches to specify and synthesise high-level robot controllers. To achieve this goal, he is particularly interested in using temporal logics, Petri nets, supervisory control theory and planning under uncertainty.

Nick Hawes received his Ph.D. in Artificial Intelligence from the University of Birmingham in 2004. He is currently an Associate Professor in the Oxford Robotics Institute, part of the Department of Engineering Science at the University of Oxford, and a Fellow of Pembroke College. He leads the GOALS research group, which performs research on problems in mission planning and decision making for autonomous systems, particularly goal-oriented, long-lived robots acting in uncertain environments. He is an Associate Editor for the Journal of AI Research, and a Group Leader for AI and Robotics at the UK’s Turing Institute.

Jeremy L. Wyatt received his B.A. in Theology from the University of Bristol, U.K., a Masters in Knowledge-Based Systems from the University of Sussex, U.K., and his Ph.D. in Artificial Intelligence from the University of Edinburgh, U.K., in 1997. He is an Honorary Professor of Robotics and Artificial Intelligence at the University of Birmingham, U.K. He has published more than 110 refereed articles, led two international research projects on robot planning and learning (CogX) and robot manipulation (PaCMan), and edited three books. His research interests include machine learning, planning, architectures, robot vision, mobile robotics, and robot manipulation. Professor Wyatt was a recipient of two best paper prizes. He was a Leverhulme Fellow from 2006 to 2008.

^☆: This work was supported by UK Research and Innovation and EPSRC, UK through the Robotics and Artificial Intelligence for Nuclear (RAIN) research hub [EP/R026084/1].
