“DRL + FL”: An intelligent resource allocation model based on deep reinforcement learning for Mobile Edge Computing
Introduction
In recent years, with the continuous development of smart terminal devices, users' requirements for the diversity and quality of mobile services have steadily grown. A large number of computation-intensive and time-sensitive applications have appeared in mobile edge networks, such as object detection, gesture recognition, 3D modeling, and interactive games. However, because network and computing resources are limited, smart terminal devices can run the model-training part of most intelligent applications only in the cloud, forcing the large volumes of training data these applications require to be uploaded to the cloud, which congests core-network communications and degrades the Quality-of-Experience (QoE) of users.
As an important extension and supplement of cloud computing, Mobile Edge Computing (MEC) [1] is a new computing paradigm that sinks computing and storage resources from the cloud to the vicinity of User Mobile Devices (UMDs) to relieve the burden on the core network. At the same time, offloading computation-intensive and time-sensitive applications from mobile devices to edge servers can reduce the energy consumption of UMDs and the service delay of application requests, improving the Quality-of-Service (QoS) for mobile users. However, network and computing resources are also limited in the MEC system. If a large number of computing tasks cannot be transmitted to the edge servers efficiently, the wireless channels will become congested and the edge servers will run short of computing resources. Therefore, how to adaptively and jointly allocate network and computing resources in the MEC system is the key to supporting computation offloading and providing high service quality.
Most existing works use multi-objective optimization, semidefinite relaxation (SDR), and game-theoretic methods to realize resource allocation. Although these approaches achieve good results in specific hypothetical scenarios, in an actual MEC edge-network setting they may struggle with the following problems: (1) Dynamic network environment: a dynamic network environment means uncertain user-request inputs and changing system states, and these changing conditions significantly affect the distribution of network and computing load. (2) Lack of continuity: in a highly time-varying MEC system, most existing resource allocation algorithms can only optimize the system at a single moment. In a real environment, we must handle continuous variables and continuous states; only through continuous control can long-term benefits be brought to the system.
Using Artificial Intelligence (AI) techniques to optimize the resource allocation process is a new trend. Related research on computation offloading and resource allocation, such as [2], [3], has shown that Reinforcement Learning [4] (especially Deep Reinforcement Learning, DRL [5]) has unprecedented potential in joint resource management. Therefore, this paper uses DRL to adaptively allocate network and computing resources. In addition, Federated Learning (FL) [6] is introduced as a framework for training DRL agents in a distributed way while (1) greatly reducing the amount of data uploaded over the wireless uplink, (2) increasing cognitive responsiveness to the edge-network and core-network environments, (3) adapting well to the heterogeneous UMDs in actual edge networks, and (4) protecting the privacy of personal data [7]. Our contributions are summarized as follows:
(1) We use a DRL architecture (specifically, DDQN [8]) to optimize the network and computing resource allocation of mobile edge systems, and propose a DDQN-based Network and Computing Resource Allocation (DDQN-RA) algorithm, which obtains perceptual information from the environment to improve its strategy, adapts to changing circumstances, and makes sequences of decisions to realize adaptive resource allocation.
(2) We propose a “DRL + FL” resource allocation model, applying a Federated Learning framework to deploy DRL agents in edge networks. The FL framework trains the DRL agents in a distributed way. This combination exploits the complementary strengths of deep reinforcement learning and federated learning to achieve approximately optimal performance.
(3) We consider minimizing the average system energy consumption and average service delay of all requests made by UMDs while balancing the network load on each data link and the computing load on the MEC servers. Through the proposed “DRL + FL” model, intelligent joint management of network and computing resources in the MEC system is realized, and experiments show that this scheme has advantages in average system energy consumption, average service delay, and load balancing.
The rest of this paper is organized as follows: We review the related work in Section 2. Then, we depict the system model in Section 3, followed by the problem formulation in Section 4. In Section 5, we optimize the resource allocation by DRL. Furthermore, in Section 6, we integrate FL with DRL in the MEC system. Performance evaluation is shown in Section 7 compared with some classical algorithms. Finally, in Section 8, we conclude this paper.
Section snippets
Related work
Mobile edge computing, as a supplement to cloud computing, solves the problems of limited mobile-device resources and request–response delays in mobile edge systems through computation offloading strategies and mobile-application deployment strategies. To solve the problems of mobile edge network congestion and edge-server load imbalance, it is urgent to use a resource allocation method to allocate network and computing resources in the MEC system to improve the service efficiency of the
Network model
As shown in Fig. 1, we consider a scenario in which a group of UMDs is covered by a group of base stations; each UMD can choose to offload its compute-intensive tasks to an edge node through a wireless channel or execute them locally. A compute-intensive task is characterized by two quantities: the amount of data the user needs to transmit to complete the task [15], and the number of CPU clock cycles required to complete it [16]. In addition, all channels of a base
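The per-task costs implied by this network model can be sketched with the standard local-execution and Shannon-rate offloading formulas. This is a minimal sketch: every parameter name and value below (bandwidth, transmit power, CPU frequencies, the switched-capacitance coefficient) is an illustrative assumption, not a value from the paper.

```python
import math

# Illustrative offloading cost model (standard MEC formulas; all
# parameter names and values are assumptions, not the paper's).
B = 10e6          # channel bandwidth (Hz)
P_TX = 0.5        # UMD transmit power (W)
GAIN, NOISE = 1e-6, 1e-9  # channel gain and noise power
F_LOCAL = 1e9     # local CPU frequency (cycles/s)
F_EDGE = 10e9     # edge-server CPU frequency (cycles/s)
KAPPA = 1e-27     # effective switched-capacitance coefficient

def local_cost(cycles):
    """Delay and energy when the UMD executes the task itself."""
    delay = cycles / F_LOCAL
    energy = KAPPA * cycles * F_LOCAL ** 2
    return delay, energy

def offload_cost(data_bits, cycles):
    """Delay and UMD-side energy when the task is offloaded."""
    rate = B * math.log2(1 + P_TX * GAIN / NOISE)  # Shannon capacity
    t_tx = data_bits / rate                        # uplink transmission
    t_exec = cycles / F_EDGE                       # edge execution
    return t_tx + t_exec, P_TX * t_tx              # delay, energy

d, c = 5e6 * 8, 500e6  # a 5 MB task needing 500 Megacycles
print(local_cost(c), offload_cost(d, c))
```

With these (assumed) numbers, offloading trades some uplink transmission delay for a much faster edge execution, which is the trade-off the allocation algorithm must navigate.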
Problem formulation
From the description in Section 3, the system energy consumption consists of the transmission energy consumption in the edge network and the computing energy consumption on the server side (the transmission energy consumption of local computing can be taken as 0), and the service delay consists of the data transmission delay in the edge network and the corresponding server-side data processing delay. This paper focuses on designing an intelligent allocation algorithm for network and computing
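The state, action, and cost structure that this formulation leads to can be sketched as a toy environment. The state layout (per-channel and per-server loads), the action encoding (a joint channel/server choice), and the imbalance penalty are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Minimal sketch of the joint allocation environment: the agent's
# action picks a wireless channel and an edge server for each task.
N_CHANNELS, N_SERVERS = 3, 2

class MecEnv:
    def __init__(self):
        self.ch_load = np.zeros(N_CHANNELS)  # per-channel network load
        self.sv_load = np.zeros(N_SERVERS)   # per-server computing load

    def state(self):
        return np.concatenate([self.ch_load, self.sv_load])

    def step(self, action, data, cycles):
        ch, sv = divmod(action, N_SERVERS)   # decode joint action
        self.ch_load[ch] += data
        self.sv_load[sv] += cycles
        # Cost proxies for delay/energy on the chosen resources,
        # plus a load-imbalance penalty across channels and servers.
        cost = self.ch_load[ch] + self.sv_load[sv]
        imbalance = self.ch_load.std() + self.sv_load.std()
        return self.state(), -(cost + imbalance)

env = MecEnv()
s, r = env.step(action=0, data=1.0, cycles=2.0)
print(s, r)
```

The negative reward couples the two optimization targets named above: per-request cost and balanced load across links and servers.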
Optimizing the resource allocation by DRL
Considering that dynamic changes in the environment cause uncertain inputs and changing system conditions, an intelligent resource allocation algorithm should take the state of the MEC environment into account when making decisions. Deep reinforcement learning systems can form adaptive models without large amounts of labeled data by exploring the environment and receiving feedback, which makes them well suited to the changing MEC environment we consider. Therefore, in this section we
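The core of a DDQN-style agent such as DDQN-RA is the double-Q target: the online network selects the next action and the target network evaluates it, which reduces the value overestimation of plain DQN. A minimal sketch, using toy linear models in place of the paper's (unspecified) network architecture; the state/action sizes and discount factor are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the online and target Q-networks: linear maps
# from a state vector to one Q-value per action. In this setting an
# action would encode a (channel, server) allocation choice.
N_STATE, N_ACTION = 4, 3
W_online = rng.normal(size=(N_STATE, N_ACTION))
W_target = W_online.copy()
GAMMA = 0.9  # discount factor (assumed value)

def q_values(W, s):
    return s @ W

def ddqn_target(reward, next_state, done):
    """Double-DQN target: online net selects the action,
    target net evaluates it."""
    if done:
        return reward
    a_star = int(np.argmax(q_values(W_online, next_state)))   # selection
    return reward + GAMMA * q_values(W_target, next_state)[a_star]  # evaluation

s_next = rng.normal(size=N_STATE)
y = ddqn_target(reward=1.0, next_state=s_next, done=False)
print(y)
```

In training, `y` would serve as the regression target for the online network's Q-value of the taken action, and `W_target` would be refreshed from `W_online` periodically.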
Integrating FL with DRL in the MEC system
The DDQN-RA algorithm proposed above needs to run on a DRL agent, so where to deploy the DRL agent becomes a key issue. Generally, we divide the deployment of DRL agents into three modes: one centralized mode, (1) deploy the DRL agent at the cloud center; and two distributed modes, (2) deploy DRL agents at the cloud center and edge nodes, and (3) deploy DRL agents at the cloud center, edge nodes, and all UMDs. If the first deployment mode is
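Under the distributed deployment modes, each agent trains on its local experience and only model parameters, never raw data, are uploaded and aggregated. A minimal FedAvg-style sketch of that aggregation step; weighting by local sample count is the standard choice in [6] and an assumption here.

```python
import numpy as np

def fed_avg(local_weights, sample_counts):
    """Weighted average of locally trained model parameters.
    Each agent uploads only its weights, never raw experience."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(local_weights, sample_counts))

# Three edge agents holding differently trained copies of one layer.
agents = [np.full((2, 2), v) for v in (1.0, 2.0, 4.0)]
counts = [10, 20, 10]  # local training-sample counts (assumed)
global_w = fed_avg(agents, counts)
print(global_w)  # each entry: (1*10 + 2*20 + 4*10) / 40 = 2.25
```

The aggregated `global_w` would then be broadcast back to the agents as the starting point for the next local training round.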
Experiment settings
To study the performance of “DRL + FL”, we conducted simulations of network and computing resource allocation. In all simulations, the time horizon is discretized into time epochs.
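A per-epoch task generator following the simulation ranges stated in this section might look like this; the uniform distribution and the 0.1 MB lower bound are assumptions, only the 5 MB cap and the 50 Megacycle to 1 Gigacycle range come from the text.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_task():
    """Draw one request per time epoch (distribution is assumed)."""
    data_mb = rng.uniform(0.1, 5.0)   # request size within 5 MB
    cycles = rng.uniform(50e6, 1e9)   # 50 Megacycles .. 1 Gigacycle
    return data_mb, cycles

tasks = [sample_task() for _ in range(1000)]
sizes, cycles = zip(*tasks)
print(min(sizes), max(sizes))
```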
To investigate the capabilities of DRL coupled with FL over the MEC system for resource allocation, we set the average request data size within 5 MB [29] and the requested number of CPU cycles between 50 Megacycles and 1 Gigacycle, based on the characteristics of several applications in the
Conclusions
In this paper, we study the resource allocation problem in a variable MEC environment, including network resource allocation and computing resource allocation. We considered this issue from the aspects of minimizing the average energy consumption of the system, minimizing the average service delay, and balancing resource allocation. Based on the adaptive learning ability and sequential decision-making that DRL offers in such an environment, DRL is introduced into the MEC system to solve the
CRediT authorship contribution statement
Nanliang Shan: Conceptualization, Methodology, Visualization, Writing - original draft, Software, Writing - review & editing. Xiaolong Cui: Methodology, Supervision, Resources. Zhiqiang Gao: Conceptualization, Methodology, Visualization, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding
This research was funded by the National Natural Science Foundation of China (Grant no. U1603261) and the Natural Science Foundation Program of Xinjiang Province, China (Program no. 2016D01A080).
References (34)
- et al., Deep reinforcement learning based computation offloading and resource allocation for MEC
- et al., Application of deep reinforcement learning to intrusion detection for supervised problems, Expert Syst. Appl. (2020)
- et al., Adversarial environment reinforcement learning algorithm for intrusion detection, Comput. Netw. (2019)
- M. Patel, B. Naughton, C. Chan, N. Sprecher, S. Abeta, A. Neal, et al., Mobile-edge computing introductory technical...
- et al., Optimal and scalable caching for 5G using reinforcement learning of space-time popularities, IEEE J. Sel. Top. Sign. Proces. (2017)
- et al., Integrated networking, caching, and computing for connected vehicles: A deep reinforcement learning approach, IEEE Trans. Veh. Technol. (2017)
- et al., Reinforcement Learning: An Introduction (2018)
- et al., Human-level control through deep reinforcement learning, Nature (2015)
- et al., Communication-efficient learning of deep networks from decentralized data (2016)
- et al., Federated learning systems: Vision, hype and reality for data privacy and protection (2019)