End-to-end CNN-based dueling deep Q-Network for autonomous cell activation in Cloud-RANs

https://doi.org/10.1016/j.jnca.2020.102757

Highlights

  • A framework for autonomous cell activation/deactivation and customized physical resource allocation in C-RANs.

  • An end-to-end CNN-based relational dueling DQN approach for autonomous cell activation.

  • A customized physical resource allocation scheme for rate and delay to balance QoS satisfaction and resource utilization with minimum number of active RRHs.

  • A comprehensive set of simulation results is provided.

Abstract

The fifth generation (5G) technology is expected to support a rapid increase in infrastructure and mobile user subscriptions, with a growing number of remote radio heads (RRHs) per unit area in cloud radio access networks (C-RANs). From an economic point of view, minimizing the energy consumption of the RRHs is a challenging issue; from an environmental point of view, achieving "greenness" in wireless networks is one of the many goals of telecommunication operators. This paper proposes a framework to balance the energy consumption of RRHs and the quality of service (QoS) satisfaction of users in cellular networks using a convolutional neural network (CNN)-based relational dueling deep Q-Network (DQN) scheme. Firstly, we formulate the cell activation/deactivation problem as a Markov decision process (MDP) and set up a two-layer CNN that takes raw images captured from the environment as its input. Then, we develop a dueling DQN-based autonomous cell activation scheme that dynamically turns RRHs on or off based on the energy consumption and the QoS requirements of users in the network. Finally, we decouple a customized physical resource allocation for rate-constrained and delay-constrained users from the cell activation scheme and formulate it as a convex optimization problem, ensuring the QoS requirements of users are met with the minimum number of active RRHs under varying traffic conditions. Extensive simulations reveal that the proposed algorithm converges faster than the Nature DQN, Q-learning and dueling DQN schemes. Our algorithm also remains stable in mobility scenarios, unlike DQN and dueling DQN without CNN, and achieves a slight improvement in balancing energy consumption and QoS satisfaction compared with the DQN and dueling DQN schemes.

Introduction

The rapid surge in the number of connected mobile devices across varying applications in recent times has prompted infrastructure providers and service operators to expand their network assets to match the increased number of subscriptions. A report published by the International Telecommunication Union (ITU) in 2016 projects that global mobile and wireless data traffic density will increase 1000-fold by 2025 (ICT Data and Statistics Division, 2016). This has turned much research attention to energy consumption issues in fifth generation (5G) cellular networks. With concern over global warming growing day by day, telecommunication regulatory bodies such as the 3rd Generation Partnership Project (3GPP), the ITU, etc. are seeking to design green cellular networks (Telecommunication Management, 2010). This motivation has given birth to innovative projects like the EARTH project and the GreenTouch project (Correia et al., 2010). Existing studies have revealed that base stations (BSs) are responsible for 60%–80% of the total energy consumption in cellular networks (Marsan et al., 2009) and that the traffic load on BSs is less than one-tenth of the peak value for 30% of the time on weekdays (Oh et al., 2011). This calls for dynamic BS deactivation, in which BSs automatically enter sleep mode when the traffic volume is low to ensure low energy consumption.

In traditional radio access networks (RANs), baseband processing units are installed at each small-cell BS. However, dense small-cell deployment in RANs causes high energy consumption and a heavy computational burden (Cai et al., 2016). The cloud radio access network (C-RAN) is a promising energy-efficient 5G architecture that supports cell activation techniques. In C-RANs, the baseband units (BBUs) are installed in the cloud rather than at the small cells, giving C-RANs the prospect of reducing the capital expenses (CapEx) and operational expenses (OpEx) of network operators. All signal processing and upper-layer functionalities are performed by the BBU pool in the cloud, while the remote radio heads (RRHs) are responsible for radio-frequency transmission and reception (Luo et al., 2015). The centralized structure of C-RANs allows RRHs to be controlled by a common BBU pool, which can switch the RRHs on or off to reduce energy consumption. C-RANs can also reduce interference and improve cooperative processing gain through cloud computing techniques, thereby addressing the energy efficiency (EE) problem in next-generation networks. In an attempt to provide an energy-efficient cellular network, many researchers have formulated the joint cell activation, user association and resource allocation problem as convex optimization problems with different objective functions (Mesodiakaki et al., 2014; Xu et al., 2014; Zhuang et al., 2016a; Koudouridis et al., 2012a). However, traditional model-based solutions may be impractical in dynamic wireless environments, where user mobility affects the network statistics in each time frame. In this case, model-free solutions are likely to achieve gains, since they optimize the network EE over the entire operational period in real time.
Reinforcement learning (RL) is a model-free machine learning technique in which a centralized learning agent tackles complex decision-making problems, here managing the energy consumption of RRHs based on the current state. In an energy-efficient C-RAN architecture, the learning agent selects possible actions in each state and learns from this experience to decide whether to turn RRHs on or off at each decision epoch.
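As a minimal illustration of this agent-environment interaction, a tabular Q-learning loop over a toy traffic/RRH model might look as follows. This is a simplified sketch, not the paper's scheme (which is a CNN-based dueling DQN): the discretized states, actions, reward shape and all numeric constants are assumptions for illustration.

```python
import random

# Toy tabular Q-learning for RRH activation (illustrative assumptions only).
# States: discretized traffic load levels; actions: number of active RRHs.
STATES = range(3)          # low / medium / high traffic
ACTIONS = range(1, 4)      # 1..3 active RRHs
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def reward(state, n_active):
    # Toy trade-off: QoS term rewards enough capacity for the load,
    # energy term penalizes each active RRH.
    qos = 1.0 if n_active > state else -1.0   # under-provisioning hurts QoS
    energy = 0.2 * n_active
    return qos - energy

def choose_action(state):
    if random.random() < EPSILON:             # explore
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit

def step(state, action):
    r = reward(state, action)
    next_state = random.choice(list(STATES))  # toy traffic transition
    # Q-learning update toward the bootstrapped target
    target = r + GAMMA * max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
    return next_state

random.seed(0)
s = 0
for _ in range(2000):
    s = step(s, choose_action(s))
```

After training, the learned Q-table favors activating more RRHs under high traffic and fewer under low traffic, mirroring the on/off decision at each epoch.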

In recent times, deep learning (DL) has been successfully applied in speech recognition and natural language processing, e.g. with convolutional neural networks (CNNs) (Lecun et al., 2015). DL has also been proposed for wireless communication problems, helping RL algorithms learn sequential control tasks in an end-to-end fashion (Cheng et al., 2017a, 2017b), and a CNN can capture the more complex dynamics that arise in mobility scenarios. Most existing machine learning algorithms applied to wireless network problems, such as Q-learning and deep reinforcement learning (DRL), define user states with hand-crafted features. If the relationship between users and RRHs is neglected, feature extraction becomes an artificial, manually engineered step, and the learning agent may settle on sub-optimal solutions. Moreover, users would have to report their information to their respective RRHs, imposing a heavy signaling overhead as feedback. With these shortfalls in mind, we introduce a two-layer end-to-end CNN-based relational dueling deep Q-Network (DQN) method with randomly captured environment states, where the raw observations take the form of an "image" capturing the user-RRH relationships in the network. The main difference between traditional DRL (Nature DQN) and our proposed CNN-based relational dueling DQN method is that, under mobility scenarios, traditional DRL saturates its replay memory with experiences from the current time frame, which may lead the learning agent to sub-optimal solutions. The DRL-based scheme in Xu et al. (2017) also omits vital statistical characteristics of the users in the network, which may affect the outcome of the learning process. Our proposed scheme combines a CNN and a dueling DQN, taking the raw image features between users and RRHs as input observations from the network: the CNN phase carries out feature extraction and the dueling DQN phase trains the network to achieve faster convergence.
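For reference, the aggregation head of a dueling DQN splits the network into a state-value stream V(s) and an advantage stream A(s, a) and recombines them with the advantage mean subtracted, which keeps the decomposition identifiable. A minimal sketch of this recombination (shapes and values illustrative):

```python
# Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').
# Plain lists stand in for network outputs; values are illustrative.

def dueling_q(value, advantages):
    """Combine scalar V(s) and per-action A(s, .) into Q(s, .)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

q = dueling_q(2.0, [1.0, 0.0, -1.0])   # → [3.0, 2.0, 1.0]
```

Subtracting the mean advantage forces the value stream to carry the state's overall worth, so states whose actions all matter little can be learned without updating every action's Q-value separately.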
Dueling DQN ignores unnecessary states in the learning process, which enables a faster convergence rate. We also define a multi-objective reward function with coefficients indicating the importance of each metric in the reward. To solve the energy consumption problem efficiently, we first formulate the cell activation problem as a Markov decision process (MDP), defining states, actions, rewards and future states. Then we develop an autonomous cell activation scheme based on CNN and dueling DQN techniques to activate or deactivate RRHs with respect to the energy consumption and quality of service (QoS) satisfaction requirements of users. Finally, we formulate the physical resource allocation problem as a convex optimization problem and solve it to balance the energy consumption and QoS satisfaction of users with the minimum number of active RRHs. The main contributions of this paper are as follows:

  • 1.

    We propose an autonomous decision-making framework for cell activation based on DL and DRL methods to minimize energy consumption of RRHs and ensure the QoS satisfaction of users in C-RANs.

  • 2.

    We formulate the cell activation problem as an MDP and set up a two-layer end-to-end CNN architecture that takes raw images captured by the RRHs as input states. The output of the CNN is fed to the dueling DQN as input to ensure autonomous cell activation or deactivation considering varying traffic profile and user mobility. The CNN part of the algorithm extracts image features of the network environment and the dueling DQN part has the ability to determine the best action for each state, making the learning process faster.

  • 3.

We formulate the EE-QoS optimization problem as a convex optimization problem, customized for rate-constrained users and delay-constrained users but unified through aggregated QoS satisfaction.

  • 4.

Comprehensive simulations show that the proposed algorithm performs better in balancing the energy consumption of RRHs against the QoS satisfaction of users. The proposed algorithm also converges faster and remains stable in dynamic mobility scenarios.
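As a simplified, single-user illustration of the rate-constrained side of the allocation problem, the Shannon rate formula can be inverted in closed form to give the minimum transmit power that meets a required rate. The symbols and numeric values below (bandwidth, channel gain, noise power) are illustrative assumptions, not the paper's multi-user convex program:

```python
import math

# Closed-form minimum power so a single rate-constrained user meets its
# required rate r_min (bit/s) over a link of bandwidth B (Hz), channel
# gain g and noise power n0 (W). Single-user simplification; all
# parameter values below are assumptions for illustration.

def min_power(r_min, bandwidth, gain, n0):
    """Invert r = B * log2(1 + g * p / n0) for p."""
    return (2 ** (r_min / bandwidth) - 1) * n0 / gain

def rate(p, bandwidth, gain, n0):
    """Shannon rate achieved at transmit power p."""
    return bandwidth * math.log2(1 + gain * p / n0)

p = min_power(r_min=1e6, bandwidth=1e6, gain=1e-3, n0=1e-9)
# At power p the achieved rate matches the requirement exactly.
```

Because the rate is strictly increasing in power, allocating exactly this inverse is the per-user optimum; the multi-user problem in the paper additionally couples users through shared resources and the set of active RRHs.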

The rest of the paper is organized as follows: Section 2 presents the related works and Section 3 describes the system model in terms of network model, traffic model, power consumption model and utility model. Section 4 introduces the problem formulation for autonomous energy management. Furthermore, we present the two-layer CNN-based relational dueling DQN algorithm in Section 5. Section 6 presents the simulation results and analysis. Finally, the conclusion is presented in Section 7.

Section snippets

Related work

Energy management, especially EE and the QoS satisfaction of users, is a significant aspect of future 5G wireless cellular networks. As such, many academic and industrial researchers have taken a strong interest in this area in order to find lasting solutions. The authors in Buzzi et al. (2016) carried out a survey on energy-efficient techniques for 5G and concluded that existing traditional solutions may have economic

System model

In the system model, there are four types of entities in the network, i.e. a macro base station (MBS), BBUs, RRHs and users. The RRHs provide fronthaul services and the MBS provides backhaul services in the same spectrum resource pool. By forming a coordinated multipoint processing (CoMP), the MBS executes control exchange with the BBU pool through the backhaul interface. When the radio signal arrives, the RRH compresses it and forwards it to the BBU through the fronthaul interface. The MBS and

Problem formulation

In this paper, cell activation/deactivation is taken into account, which addresses the EE and QoS satisfaction problem in C-RANs. We model the activation problem as an MDP and adopt a DRL approach to solve it. An adaptive controller adjusts cell activation/deactivation decisions in response to the traffic model of the network, thus improving EE and QoS satisfaction. In the framework of MDPs, the controller interacts with the traffic environment as follows: at decision step t,
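A multi-objective MDP reward of the kind the paper describes, trading aggregate QoS satisfaction against RRH energy consumption with importance coefficients, can be sketched as follows. The weights and normalization are illustrative assumptions, not the paper's exact reward function:

```python
# Sketch of a multi-objective reward with importance coefficients:
# reward QoS satisfaction, penalize (normalized) energy consumption.
# W_QOS and W_ENERGY are assumed values, not the paper's coefficients.

W_QOS, W_ENERGY = 0.7, 0.3

def reward(qos_satisfaction, energy, energy_max):
    """qos_satisfaction in [0, 1]; energy is normalized by energy_max."""
    return W_QOS * qos_satisfaction - W_ENERGY * (energy / energy_max)
```

Tuning the coefficients shifts the learned policy along the EE-QoS trade-off: a larger energy weight makes the controller deactivate RRHs more aggressively at the cost of QoS.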

Nature DQN

Nature DQN (Xu et al., 2017) is an RL technique in which the agent interacts solely with the environment; apart from sensing the environment's states, it needs no additional information about it. Based on the Bellman equation, the state-action value function $Q_\pi(s(t), a(t))$ can be written in DQN as $r + \gamma Q^*(s', a')$, where $s'$ is the next state, $a'$ is the next best action and $r$ is the reward. The loss function is calculated as $L_i(\theta_i) = \mathbb{E}_{s,a,r,s'}\big[(y_i^{DQN} - Q(s,a;\theta_i))^2\big]$, with $y_i^{DQN} = r + \gamma \max_{a'} Q(s', a'; \theta^-)$, where $\theta^-$ denotes the target-network parameters.
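The target and loss above can be sketched with lookup tables standing in for the online and target networks (all values are illustrative):

```python
# DQN TD target and squared loss: y = r + gamma * max_a' Q(s', a'; theta^-),
# loss = (y - Q(s, a; theta))^2. Dicts stand in for the two networks.

GAMMA = 0.9

def dqn_target(r, next_state, q_target, actions):
    """Bootstrapped target using the (frozen) target network."""
    return r + GAMMA * max(q_target[(next_state, a)] for a in actions)

def dqn_loss(q_online, q_target, transition, actions):
    s, a, r, s_next = transition
    y = dqn_target(r, s_next, q_target, actions)
    return (y - q_online[(s, a)]) ** 2

actions = [0, 1]
q_online = {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.0, (1, 1): 1.0}
q_target = dict(q_online)   # target net: a lagged copy of the online net
loss = dqn_loss(q_online, q_target, (0, 0, 1.0, 1), actions)
# y = 1.0 + 0.9 * 1.0 = 1.9, so loss = (1.9 - 0.5)^2 = 1.96
```

Keeping the target network fixed between periodic copies decouples the regression target from the parameters being updated, which stabilizes training.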

Scenario configuration

In this section, we present simulations illustrating the performance of our end-to-end CNN-based relational dueling DQN algorithm. We benchmark the proposed algorithm against Q-learning, Nature DQN (Xu et al., 2017), dueling DQN (Sutton and Barto, 1998) and anchor graph hashing with Q-learning (AGH + Q-learning) (Lobo et al., 1998). Performance is evaluated in terms of convergence rate, total energy consumption, user satisfaction and the mobility scenario. The RRHs are distributed

Conclusion

In this paper, we proposed an end-to-end CNN-based relational dueling DQN framework for autonomous cell activation in C-RANs to balance the energy consumption of RRHs and the QoS satisfaction of users. The CNN phase of the framework extracts raw image features containing information about the relationship between users and RRHs, which serve as the input to the relational dueling DQN phase. Based on RL techniques, the dueling DQN phase dynamically makes cell activation and deactivation decisions

CRediT authorship contribution statement

Guolin Sun: Conceptualization, Methodology, Validation, Resources, Visualization, Project administration, Supervision. Daniel Ayepah-Mensah: Writing - original draft, Software. Rong Xu: Software, Formal analysis. Gordon Owusu Boateng: Writing - original draft, Writing - review & editing. Guisong Liu: Writing - review & editing, Validation.

Declaration of competing interest

None.


References (46)

  • 3GPP TS 36.521-1

    Evolved Universal Terrestrial Radio Access (EUTRA); User Equipment (UE) conformance specification; Radio transmission and reception; Part 1: Conformance testing

    (2018)
  • A. Adnan

    Hap-SliceR: a radio resource slicing framework for 5G networks with haptic communications

    IEEE Syst. J.

    (Sept. 2018)
  • A.S. Alam et al.

    A scalable multimode base station switching model for green cellular networks

  • G. Auer et al.

    Energy Efficiency Analysis of the Reference Systems, Areas of Improvements and Target Break Down

    (2012)
  • L. Busoniu et al.

    Reinforcement Learning and Dynamic Programming Using Function Approximators

    (April 2010)
  • S. Buzzi et al.

    A survey of energy-efficient techniques for 5G networks and challenges ahead

    IEEE J. Sel. Area. Commun.

    (Apr. 2016)
  • S. Cai et al.

    Green 5G heterogeneous networks through dynamic small-cell operation

    IEEE J. Sel. Area. Commun.

    (2016)
  • X. Cheng et al.

    Mobile big data: the fuel for data-driven wireless

    IEEE Internet Things J.

    (Oct. 2017)
  • X. Cheng et al.

    Exploiting mobile big data: sources, features and applications

    IEEE Netw.

    (Jan/Feb. 2017)
  • L.M. Correia et al.

    Challenges and enabling technologies for energy aware mobile radio networks

    IEEE Commun. Mag.

    (Nov. 2010)
  • B. Dai et al.

    Energy efficiency of downlink transmission strategies for cloud radio access networks

    IEEE J. Sel. Area. Commun.

    (Apr. 2016)
  • M. Deruyck et al.

    Power consumption model for macro cell and microcell base stations

    Trans. Emerg. Telecommun. Technol.

    (2014)
  • G. Giambene

    M/G/1 queuing theory and applications

  • ICT Data and Statistics Division

    The World in 2016: ICT Facts and Figures

    (June 2016)
  • H. Ide et al.

    Improvement of learning for CNN with ReLU activation by sparse regularization

  • G.P. Koudouridis et al.

    A centralized approach to power on-off optimization for heterogeneous networks

  • G.P. Koudouridis et al.

    A centralised approach to power on-off optimization for heterogeneous networks

  • W. Lai et al.

    Joint power and admission control for spectral and energy Efficiency maximization in heterogeneous OFDMA networks

    IEEE Trans. Wireless Commun.

    (May. 2016)
  • Y. Lecun et al.

    Deep learning

    Nature

    (2015)
  • H. Li et al.

    Deep reinforcement learning: framework, applications and embedded implementations

  • Y. Lin et al.

    Optimizing user association and spectrum allocation in HetNets: a utility perspective

    IEEE J. Sel. Area. Commun.

    (June. 2014)
  • M.S. Lobo et al.

    Applications of second-order cone programming

    Linear Algebra Appl. J.

    (1998)
  • S. Luo et al.

    Downlink and uplink energy minimization through user association and beamforming in C-RAN

    IEEE Trans. Wireless Commun.

    (Jan. 2015)

    Guolin Sun received his B.S., M.S. and Ph.D. degrees, all in Communication and Information Systems, from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2000, 2003 and 2005, respectively. After his Ph.D. graduation in 2005, Dr. Sun gained eight years of industrial experience in wireless research and development covering LTE, Wi-Fi, the Internet of Things (ZigBee, RFID, etc.), cognitive radio, and location and navigation. Before joining the School of Computer Science and Engineering, University of Electronic Science and Technology of China, as an Associate Professor in August 2012, he worked at Huawei Technologies Sweden. Dr. Sun has filed over 40 patents, published over 40 scientific conference and journal papers, and serves as a TPC member of conferences. Currently, he serves as a vice-chair of the 5G-oriented cognitive radio SIG of the Technical Committee on Cognitive Networks (TCCN) of the IEEE Communications Society. His general research interests are 5G/2020-oriented wireless networks, such as software-defined networks, network function virtualization, wireless networks

    Daniel Ayepah-Mensah received his bachelor's degree in Computer Engineering from Kwame Nkrumah University of Science and Technology (KNUST), Kumasi, Ghana, in 2014 and his master's degree in Computer Science and Engineering from the University of Electronic Science and Technology of China (UESTC) in 2019. He is currently pursuing a Ph.D. in Computer Science and Engineering at UESTC. From 2014 to 2017, he worked as a software developer. He is also a member of the Mobile Cloud-Net Research Team at UESTC. His research interests generally include wireless networks, big data and cloud computing.

    Rong Xu received the B.Sc. degree in Educational Technology from Nanchang Hangkong University, China, in 2017. He is currently studying for an M.Sc. in Computer Science at the University of Electronic Science and Technology of China (UESTC), due to finish in 2020. He is also a member of the Mobile Cloud-Net Research Team at UESTC. His primary research interests include radio resource management, deep learning and cloud radio access networks.

    Gordon Owusu Boateng received his bachelor's degree in Telecommunications Engineering from the Kwame Nkrumah University of Science and Technology (KNUST), Kumasi, Ghana, in 2014 and his master's degree in Computer Science and Engineering from the University of Electronic Science and Technology of China (UESTC) in 2019. He is currently pursuing a Ph.D. in Computer Science and Engineering at UESTC. From 2014 to 2016, he worked under sub-contracts for Ericsson (Ghana) and TIGO (Ghana). He is also a member of the Mobile Cloud-Net Research Team at UESTC. His interests include mobile/cloud computing, 5G wireless networks, data mining, D2D communications and SDN.

    Guisong Liu received his B.S. degree in Mechanics from Xi'an Jiaotong University, Xi'an, China, in 1995, and his M.S. degree in Automatics and Ph.D. degree in Computer Science from the University of Electronic Science and Technology of China, Chengdu, China, in 2000 and 2007, respectively. Prof. Liu was a visiting scholar at Humboldt University of Berlin from September to December 2015. He is now a full professor in the School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China. His research interests include pattern recognition, neural networks, and machine learning.

    This work is supported by the National Natural Science Foundation of China under Grant No. 61771098, by the Fundamental Research Funds for the Central Universities under Grant No. ZYGX2018J068, by funds from the Department of Science and Technology of Sichuan Province under Grant Nos. 2017GFW0128, 8ZDYF2265, 2018JYO578 and 2017JY0007, and by the ZTE Innovation Research Fund for Universities Program 2016.
