Multi-Cell LTE-U/Wi-Fi Coexistence Evaluation Using a Reinforcement Learning Framework

de C. Neto, José M.; Neto, Sildolfo F. G.; M. de Santana, Pedro; de Sousa, Vicente A.

doi:10.3390/s20071855

Open AccessArticle

Multi-Cell LTE-U/Wi-Fi Coexistence Evaluation Using a Reinforcement Learning Framework

Federal University of Rio Grande do Norte, Natal-RN 59078-970, Brazil

^*

Author to whom correspondence should be addressed.

^†

Current address: Robert Bosch GmbH Corporate Research, Renningen, 71272, Germany.

Sensors 2020, 20(7), 1855; https://doi.org/10.3390/s20071855

Submission received: 15 December 2019 / Revised: 23 January 2020 / Accepted: 23 January 2020 / Published: 27 March 2020

(This article belongs to the Special Issue Internet of Multimedia Things (IoMT): Opportunities, Challenges and Solutions)

Download

Browse Figures

Versions Notes

Abstract

:

Cellular broadband Internet of Things (IoT) applications are expected to keep growing year-by-year, generating demands from high throughput services. Since some of these applications are deployed over licensed mobile networks, as long term evolution (LTE), one already common problem is faced: the scarcity of licensed spectrum to cope with the increasing demand for data rate. The LTE-Unlicensed (LTE-U) forum, aiming to tackle this problem, proposed LTE-U to operate in the 5 GHz unlicensed spectrum. However, Wi-Fi is already the consolidated technology operating in this portion of the spectrum, besides the fact that new technologies for unlicensed band need mechanisms to promote fair coexistence with the legacy ones. In this work, we extend the literature by analyzing a multi-cell LTE-U/Wi-Fi coexistence scenario, with a high interference profile and data rates targeting a cellular broadband IoT deployment. Then, we propose a centralized, coordinated reinforcement learning framework to improve LTE-U/Wi-Fi aggregate data rates. The added value of the proposed solution is assessed by a ns-3 simulator, showing improvements not only in the overall system data rate but also in average user data rate, even with the high interference of a multi-cell environment.

Keywords:

multi-Cell; LTE-U; reinforcement learning; cellular broadband IoT

1. Introduction

The increasing use of the wireless licensed spectrum, as well as its scarcity, comes with the massive demand from mobile data, as the number of wireless devices connected to the internet is expected to keep growing every day. According to [1], the number of mobile broadband subscriptions in the first quarter of 2019 reached 6 billion, corresponding to a growth rate of 15 percent year-on-year. In the cellular Internet of Things (IoT) applications (services using a cellular infrastructure with no need for a new private network), the same report also estimates that 4.4 billion devices will be anchored in broadband by the end of 2024. These services that correspond to IoT deployments require a large amount of data, e.g., multimedia data, augmented reality (AR), and virtual reality (VR).

Historically, previous cellular broadband IoT applications have been enabled by the third-generation (3G) of mobile communication. Still, with the increasing demand, the fourth generation (4G) rises as the primary enabler, since it can better cope with such demand. However, the spectrum of long term evolution (LTE) 4G deployments, or the LTE-M (modified LTE for IoT applications introduced in 3rd generation partnership project (3GPP) Release 13 [2]) needs licenses earned in an auction by the operators, which is expensive. Additionally, most of the licensed bands already have an overused profile, which is impracticable for IoT deployment, especially for broadband services and a significant number of devices. Then, one alternative to overcome those drawbacks is to leverage the unlicensed spectrum once it is free to use, and to offer cheap additional bandwidth.

Nevertheless, some problems can be solved with this solution. Some regional regulatory entities in the world, like in Europe and Japan, require that devices using the unlicensed spectrum implement a Listen-Before-Talk (LBT) mechanism, while others do not, like in South Korea and the United States. Another problem is that Wi-Fi (IEEE 802.11 a/n/ac) is the main, and most successful, access technology already using the unlicensed spectrum. Wi-Fi is one of the first choices for some IoT deployments, besides the other non-cellular solutions like LoRaWAN [3] and SigFox [4]. Thus, new technologies trying to exploit the unlicensed band must, somehow, take into account the better coexistence among the technologies so that there is no monopoly from one access technology. Three leading solutions were proposed to deal with these two problems: The LTE-Unlicensed (LTE-U), proposed by the LTE-U Forum [5]; LTE-Licensed Assisted Access (LAA) proposed by 3GPP release 13 [2]; MuLTEfire proposed by Qualcomm [6].

LTE-U is a LTE solution for the unlicensed band proposed by the LTE-U Forum (Qualcomm, Verizon, Samsung and Ericsson) targeted to work primarily in the 5 GHz ISM band in countries where regulatory entities do not require an LBT mechanism [5]. 3GPP 10 introduced carrier aggregation, which is the primary mechanism to enable this LTE solution in the unlicensed band. To equitably coexist with Wi-Fi, the LTE-U also implements a MAC layer mechanism based on a time division multiplexing (TDM)-like duty cycle (DC) approach. Targeting broadband IoT deployments, this duty cycle approach leads to a behavior where the energy consumption can even be reduced since the device with this technology operates following ON/OFF periods. But even with this feature, LTE-U is rarely explored, because of the Wi-Fi monopoly and the LTE-M solution for licensed spectrum.

Targeting deployments in Europe and Japan, where an LBT mechanism is mandatory, 3GPP in its Release 13 proposed the LTE-Licensed Assisted Access (LAA) [7]. Like LTE-U, LTE-LAA uses a standard LTE with modifications to operate in the unlicensed band but with an anchor LTE carrier in the licensed spectrum. LAA implements an LBT mechanism very similar to the carrier-sense multiple access with collision avoidance (CSMA/CA) of the Wi-Fi, but with different parameters. On the other hand, MuLTEfire [6] is another proprietary solution proposed by Qualcomm aiming a stand-alone operation in the unlicensed spectrum, i.e., without a licensed anchor. However, compared to LTE-LAA, MulteFire Release 1.1 Specification has an expanded support for IoT services with Low power wide area at 800/900 MHz with NB-IoT-U (Narrowband IoT-Unlicensed), and 2.4 GHz with eMTC-U (enhanced machine type communication-unlicensed) [8].

As a branch of the 802 families by IEEE, the IEEE 802.11 is a standard targeted to wireless area local networks (WLANs) operating in the unlicensed band, mostly with 2.4 GHz and 5 GHz frequency [9]. Wi-Fi is the global brand name used by all the products based on IEEE 802.11 and certified by the Wi-Fi Alliance regarding interoperability as the standard evolves. As the demand and deployments of Wi-Fi grew up, new generations were standardized, bringing the creation of Wi-Fi 2, 3, 4 and 5, based on the 802.11a, 802.11g, 802.11n and 802.11ac, respectively, to cope with even higher data rates for even higher demands. All these subsequent versions, except for Wi-Fi 3, use the 5 GHz band and adopt the orthogonal frequency division multiplexing (OFDM) technology, as LTE-LAA and LTE-U. But despite the similarities, the medium access control (MAC) layer of Wi-Fi differs from the ones in LTE-U/LAA, creating the necessity for investigations in the coexistence of such technologies, especially when targeting scenarios for broadband internet applications.

This paper extends our previous coexistence solution [10] to a multi-cell scenario with co-channel users. The herein proposed coexistence framework uses Q-Learning applied to LTE-U to automatically adapt the duty cycle parameter based on the transmitted data rate as the central metric for assessing coexistence interference. Our solution is a centralized, coordinated reinforcement learning framework to improve the system aggregate data rate.

This paper is organized as follows. Section 2 presents a literature review on LTE-U/Wi-Fi coexistence with a focus on the application of machine learning techniques in the coexistence problem. We also highlight our main contributions comparing with those previous works. In Section 3, we review the LTE-U and Wi-Fi as well as their main coexistence mechanisms. System modeling, scenario definition and preliminaries results are presented in Section 4. The proposed framework is presented in Section 5. In Section 6, we present and discuss the results of our solution. Section 7 presents the conclusions and the future perspectives of this work.

2. Related Works

We can find several works dealing with the coexistence of the new unlicensed LTE technologies and the consolidated Wi-Fi [11,12,13,14]. One common finding of these works is that the LTE solutions are, for the evaluated scenarios, a better neighbor for Wi-Fi than Wi-Fi itself in terms of interference and capacity. Another important finding states that this coexistence problem has desired characteristics for applying machine learning techniques to reduce the coexistence impact. In [15], the specific coexistence of LTE-U and Wi-Fi is analyzed. The main analysis of this work is how the adjustment of the duty cycle parameter influences the system throughput.

Medium access mechanisms are key blocks to allow coexistence between different technologies in the unlicensed spectrum. Thus, the mismatch between parameters can lead one technology to overlap others concerning access to the communication channel [16]. Even with this concern, LTE-LAA, with the carrier-sense multiple access with collision avoidance (CSMA-CA) mechanism, can achieve a performance equal to or better than Wi-Fi coexisting with Wi-Fi [12]. With LTE-U, channel access alternatives based on channel sensing and dynamic adaptation of the duty cycle fractional work cycle have been proposed [17,18]. The authors in [18] proposed an approach to disable standard LTE transmissions on specific subframes using the almost blank subframe (ABS) LTE functionality. Their results demonstrate that LTE-DC can operate harmoniously with Wi-Fi and even LTE-LAA by adequately adjusting the duty cycle [19,20,21].

Some classic optimization techniques are used to enhance coexistence metrics. Attempting a better Quality of Service (QoS) level, a solution based on joint optimization of transmission time, carrier allocation and power is proposed [22]. The authors also used throughput and delay-based user association as the metric for LAA/Wi-Fi coexistence. They achieved the theoretical performance with the solution, although there is trade-off between the desired QoS level and solution complexity. In addition to QoS, fairness is another metric exploited by classical optimization techniques. The first algorithm proposed in [23], based on connection admission control (CAC), is responsible for fairness in the sharing of resources. The second algorithm uses a duty cycle approach, similar to the one proposed for LTE-U, but applied to LTE-LAA to improve delay related metrics. Still in the LAA/Wi-Fi coexistence, the authors of [24] provide an analytical study of the influence of energy detection (ED) on the throughput of each technology. Conclusions indicate that maximum system throughput can be achieved by adjusting the ED threshold. A branch of classical approaches called game theory is also used to solve the problem of LTE/Wi-Fi coexistence. A coalition game-based framework, in a real deployment of commercial routers are used as a testbed of an LTE-U/Wi-Fi scenario. In this scenario, and with the proposed solution, the goal of users in the game is to maximize system throughput (sum of each user) [25]. Another game, which is cooperative and based on the Nash bargain, is presented in [26]. LTE-U users QoS is enhanced by sharing resources between LTE-U and Wi-Fi while protecting Wi-Fi.

Coexistence solutions can be classified regarding the radio resource management (RRM) strategy to be modified to achieve the coexistence. We summarize the following list of solutions:

Carrier detection and sensing: The authors of [27] implement adjustments on the LTE MAC layer to support this technique. After evaluations, they concluded that LTE presents throughput gains without decreasing the performance of Wi-Fi systems. Aiming channel access and QoS fairness, Li et al. [28] devised an enhanced LBT scheme to adaptively tune the clear channel assessment (CCA) threshold of LTE-LAA and interference avoidance to Wi-Fi. For LTE-U coexistence, two-channel sensing schemes are proposed in [29], including periodic sensing and persistent sensing. In the periodic scheme, the LTE-U APs sense the medium for a fraction of time within each subframe and then determine if it is available to transmit data during the remaining subframe. In the persistent sensing scheme, the LTE-U APs also sense the channel during one entire subframe until the channel becomes free, and then transmits the data during the following several subframes.
Link-adaptation: Using the already defined Wi-Fi features to support coexistence, authors in [30] proposed a link-adaptation algorithm based on Wi-Fi’s MAC distributed coordination function (DCF) protocol to enhance LTE-LAA’s throughput performance. The authors in [31] use a scheme similar to Wi-Fi’s request to send (RTS)/clear to send (CST) enabling protections for the LTE cells and helping Wi-Fi stations to save energy.
Power control: In [32], the authors affirmed that the static control power used for Wi-Fi might be used in a similar form for LTE in unlicensed bands, controlling coexistence interference. The authors of [33] compared the duty cycle mechanism with the power control mechanism for the LTE-U uplink, which resulted in a higher average user throughput for both LTE-U and Wi-Fi compared to LTE-U with the specific duty cycle of 80%. Regarding the ABS mechanism, they also affirm that this is a conservative solution, and the coexistence may be better addressed using some power control approach.
Spectrum slicing: The authors of [34] adopt the spectrum slicing, a dynamic and statics spectrum sharing through spectrum division into several partitions. In their solution, each partition can be exclusively accessed by one mobile network operator with few concerns about power control and therefore enhancing the coexistence scenario. Furthermore, works like [35] propose the use of game theory frameworks to mitigate interference in LTE-U/Wi-Fi coexistence and show improvements on the overall throughput.
Channel selection: In scenarios where spectrum slices are not possible, a solution can be using the channel selection algorithms to search for clean channels [15]. In [36], the authors propose an adaptive LBT mechanism that incessantly switches between the channels, impeding the channels for being occupied for a long time. Adopting a priority access approach, authors of [37] set the LTE-U with a higher priority than Wi-Fi to access the channel and demonstrated that with this approach, the coexistence is enhanced since LTE-U is more robust to the interference. Another way is to limit the LTE presence to increase the chance of Wi-Fi transmitting, as presented in [38].

Most of the techniques mentioned so far were all non-cooperative since they do not consider the cooperation of coexisting technologies to mitigate interference. However, the coexistence problem presents attributes that can enable the application of cooperative techniques. The authors of [36] proposed a centralized cooperative control management to employ the virtualization of some network functions, allowing the continuous transfer of resources between unlicensed technologies using cloud control of the distributed access points. In [18,39], the proposed solution has networks with modes of operation in which messages are exchanged between them to negotiate which mode the networks should operate and which parameters to use to enable a better coexistence. If a coexisting system is detected, one synchronization phase is started, and consequently, another negotiation phase starts to choose the coexistence parameters.

As previously mentioned, for regions where LBT is mandatory and transmissions are limited to burst, coexistence can be enabled by disabling the standard LTE transmissions on specific subframes using the almost blank subframe (ABS) LTE functionality [18]. In [40], the authors solve the problem of coexistence by optimizing the probability of access to the channel. They proposed a proportional fair allocation scheme based on equal distribution of channel access among competing entities, also considering idle periods, successful transmissions and collisions for Wi-Fi networks. In turn, authors of [13] leveraged the time division duplex (TDD) functionality of LTE to support various numbers of uplink and downlink frames to grant the fair coexistence between LTE-LAA and Wi-Fi. The authors of [41] proposed another technique to adapt the number of subframes according to the traffic load from WLAN systems. Yet considering LAA/Wi-Fi coexistence, the work [42] presents an investigation of offered transmission time of LTE-LAA to evaluate its performance compared to Wi-Fi.

Regarding the use of machine learning techniques to improve the coexistence of access technologies in the unlicensed band, the authors of [43,44] survey use cases and appropriate techniques that can be leveraged for wireless communication problems in general. In [45], a reinforcement learning technique called double Q-Learning for LAA and Wi-Fi coexistence is proposed. This solution is based on discontinuous transmissions (DTX) and transmit power control (TPC) to accurately select the channel, which leads to a better coexistence of LAA and Wi-Fi, as stated by the authors. Also, the authors of [46] proposed two solutions, one with Q-Learning and other with game theory, to tackle the problem of channel selection in an indoor scenario. Both solutions indeed enable a better coexistence, providing improvements in data rate. The work [47] proposes an algorithm to dynamically select the LTE-U duty cycle ratio in coexistence with Wi-Fi. The results demonstrate that system capacity increased for simulations with file transfer protocol (FTP) data traffic. Still using Q-Learning, an algorithm applied in a multichannel scenario has the mission of dynamically choosing the least populated channel and also the duty cycle value of that channel to improve system performance [48]. Two other algorithms also use the dynamic choice of duty cycle to achieve fairness in coexistence, differing only in the metric used [49,50].

In our previous contribution [10], we proposed a reinforcement learning framework based on Q-Learning applied to LTE-U to automatically adapt the duty cycle parameter to deal with the interference profile between the coexisting systems. However, the evaluated scenario presents only one device and one cell for each operator. So, to better evaluate the coexistence scenario between these two technologies with the proposed framework, as well as to extend our previous conclusions, we proposed herein a multi-cell evaluation, with even more devices (and cells) so that the interference profile can be more realistic and more challenging to track. Thus, in this paper, we highlight the following contributions:

A study of the coexistence problem between LTE-U and Wi-Fi in a multi-cell scenario with co-channel and inter-radio access network (RAT) interference under changing offered data rate that better approximates a possible cellular broadband IoT deployment. This evaluation differentiates from the previous one [10] due to the inclusion of multiples users and multiple interfering RATs.
A new centralized Q-Learning mechanism to dynamically adjust the duty cycle pattern so that it can achieve higher aggregated data rates per user and operator. This centralized mechanism differentiates from the previous one proposed in [10] as it operates under co-channel interference deployment, while the previous one, which is embedded only in a single operator, only considers a single source of interference. Furthermore, the modeling of the proposed solution in [10] takes into account the operator’s data rates being the same as the user’s data rate, since there is only one user per operator, whilst the modeling of the proposed solution in this current work discriminates between these data rates. Also, there is no coordination in the previous mechanism, consequently there is no communication among cells (base stations).
Extend evaluations and conclusions regarding LTE-U/Wi-Fi coexistence, without and with the proposed framework, to the multi-cell scenario with changing offered data rate and more challenging interference profile.

3. Wi-Fi and LTE-U MAC Layer

Medium access control is the enabler for the communication of multiple stations in a network since it provides channel access control, besides addressing and power control. In a coexistence scenario, each operator can use a different mechanism to control the medium access. So, it is imperative to review the basics of Wi-Fi and LTE-U and show their differences. In this section, we review the 802.11 MAC layer, followed by some specifications of LTE-U.

3.1. The 802.11 MAC Layer

Since the first release of 802.11 standards in 1997, several enhancements have been developed in the physical layer targeting higher data rates, besides amendments to support additional features. But even with these enhancements, the MAC layer of the following standards still share the same primary MAC mechanism, which is primarily based on the distributed coordination function (DCF). This function defines how the 802.11 devices will access the wireless medium, described as follows.

The DCF is classified as an LBT contention-based mechanism and operates in a distributed manner without any coordination. Whenever a station (STA) wishes to transmit, it has to first perform a clear channel assessment (CCA) for a fixed duration of time called distributed inter-frame spacing (DIFS) [51]. If the medium stays idle for DIFS, then the station can take control of the medium and starts to transmit for a period called channel occupation time (CoT). As the station transmits, it controls the channel by keeping a gap between each sent frame of one sequence, the short inter-frame spacing (SIFS), that is the necessary time for the transmitter to receive an acknowledgment (ACK) frame indicating that the receiver successfully received the frame. To avoid other stations accessing the channel duration of the transmission, SIFS is always smaller than DIFS, so that stations do not mislead a SIFS gap with the channel being completely idle. But if the channel goes busy before DIFS is completed, the station defers for other DIFS plus a random backoff time. If the medium remains idle for a DIFS duration plus the backoff time, the STA determines that it can take control of the channel and begins to transmit [9].

A random backoff time in the DCF approach provides the collision avoidance feature, since whenever the channel transitions from busy to idle, it is possible that multiple STAs were waiting and may be ready to transmit simultaneously. Thus, allowing each station to select a random time decreases the probability of collisions when the channel goes idle. To initiate a random backoff procedure, a station waits for the channel to be idle for a DIFS, then it picks an integer random backoff count in the [0, CW] range, where CW—contention window.

The CW value, used in the backoff procedure, always starts with a lower bound

C W_{m i n}

and every time that happens an unsuccessful transmission, the value of CW is updated according to

C W = [2 \cdot (C W + 1) - 1]

, with

C W_{m a x}

being the upper bound. Once a successful transmission occurs, CW is reset to the lower bound

C W_{m i n}

[9].

DCF uses two types of carrier sense functions to check the state of the medium:

Physical (PHY) carrier sense: Two PHY functions based on preamble detection (from Wi-Fi transmissions) and energy (from all transmissions in the bandwidth) to check the channel’s state. By measuring the energy level of the channel and comparing if the level is above a certain threshold, these functions can assure the medium is not busy [52].
Virtual carrier sense: A MAC function based on one field of the MAC header with the expected transmission duration. All stations that receive the information set their network allocation vector (NAV) and wait still this time had passed to start sensing the channel again [52].

Important changes are provided in other 802.11 related products, such as: a 60 GHz carrier PHY and a scheduled-fashion MAC access of the IEEE 802.11 ad [53,54] (for broadband access with limited coverage); a modified MAC to provide a higher number of connected devices of the 802.11 ad [55] (achieving a trade-off between the goals of the wireless personal area network (WPAN) and the low-power wide-area network (LPWAN)); and the improvements of dual-band (2.4 and 5 GHz) operation of the IEEE 802.11 ax (Wi-Fi 6) [56,57], enabling bandwidth-hungry applications in a dense deployment scenario (4K/8K video, virtual and augmented reality (VR/AR) applications).

3.2. The LTE-U Coexistence Mechanisms

LTE-U is underpinned by three mechanisms to enable coexistence with Wi-Fi and other LBT-like technologies in the unlicensed band: supplemental downlink (SDL), channel selection and carrier-sense adaptive transmission (CSAT) [15]. These mechanisms are carefully designed by software and allow the LTE deployment compatible with 3GPP Releases 10 and 11 [15]. Figure 1 presents the relational flowchart of the three mechanisms that enable LTE-U for coexistence.

Supplemental downlink (SDL) was proposed in 3GGP Release 9 to enable the bonding of paired and unpaired spectrum bands for enhancing downlink transmissions over single-cell deployments [58]. This feature enables the use of one unpaired band to create a downlink-only carrier besides the traditional downlink/uplink frequency division duplex (FDD) scheme in the paired spectrum. Release 10 introduced the possibility of bonding up to three carriers utilizing carrier aggregation that is another feature introduced in this release. SDL with carrier aggregation allows the transmission of data to a station using more than one downlink carrier and presents a division into the primary cell (PCell) carrier and secondary cell (SCell) carrier. For LTE-U, PCell is anchored in a licensed band and carries control and signaling information, while SCell is anchored in the unlicensed band and is used to boost downlink capacity opportunistically. Depending on the traffic load, the LTE base station (eNodeB) can turn the SCell on or off. If the load is light, the eNodeB can use only the PCell and turns off the secondary one, thus mitigating the interference in the unlicensed band. But if the load is heavy, the SCell is enabled and operates following the channel selection coexistence mechanism.

Channel selection enables the LTE cell to choose the cleanest channel for the secondary cell based on measurements of LTE and Wi-Fi coexisting technologies. It provides a behavior where LTE-U always tries to avoid interference, choosing the channel with the lowest interference profile, releasing the remaining channels to Wi-Fi or other LTE-U base stations. Channel selection operates in an on-going base, then every time the transmitting station finds a cleaner channel, it begins an operation of switching to the cleaner channel. The interference measurements are based on energy detection since it does not discriminate the types of interfering sources. The interference measurement can also be technology-specific based on the detection of Wi-Fi preambles or LTE-U synchronization signals. Furthermore, for some LTE-U deployments, the channel selection mechanism already satisfies coexistence requirements [15].

If channel selection does not find a clear channel, the carrier-sense adaptive transmission (CSAT) algorithm is used to enable the coexistence. CSAT is a time division multiplexing (TDM) algorithm based on long-term medium sensing. Before any attempt of transmission, the CSAT algorithm senses the medium for a long duration than the other coexisting technologies, and according to the observed medium activity, it proportionally adopts a duty cycle behavior activating and deactivating the SCell, as illustrated in Figure 2.

During the activated periods, LTE-U transmits as a standard LTE, and during the deactivated period, the channel is released so that the coexisting technologies can transmit. LTE-U operating with CSAT algorithm is also called, in literature, as LTE-DC [10,12,21]. As Qualcomm mainly proposed CSAT, the details of its algorithm implementation are hidden, opening up the possibility of alternative implementations by the industrial and academic community.

After a successful CSAT procedure, the duty cycle behavior in LTE-U can be performed by the almost blank subframe (ABS) functionality. This solution presented in [18], following the already defined feature with the same name presented in Release 10, but in the context of heterogeneous networks (HetNets). In LTE HetNets, cells present different sizes (macro, micro, pico and femto). Taking into account the essential ABS operation in this context, during the transmission time of the macro cell, the data transmission only occurs in individual subframes. Thus the remaining subframes can be used by the other cells. In the context of LTE-U, the data can be transmitted in the defined subframes while Wi-Fi, for example, could exploit the blanked LTE-U subframes to perform its transmission. The proportion of subframes to blank out is defined by the duty cycle ratio informed by the CSAT algorithm after its medium sensing procedure.

4. System Model, Evaluation Scenario and Preliminary Results

Before any attempt to incorporate the proposed reinforcement learning framework into the coexistence problem, the evaluation scenario must be defined, and the preliminary results must show why it is possible and necessary to use the proposed framework. This section presents the scenario as well as the preliminary results without the proposed reinforcement learning algorithm.

4.1. System Model and Evaluation Scenario

To better evaluate the coexistence problem between Wi-Fi and LTE-U, 3GPP proposed evaluation scenarios in the technical report TR36889 [7]. Among the proposed scenarios, the indoor (Hotspot) scenario is the most challenging and complex since it comprises not only multiple users but also interference between the same and different access technologies. Figure 3 presents the 3GPP indoor scenario considered in this work. This scenario is composed of two operators, called operator A and operator B, in an unlicensed band deploying four small cells each. The deployment is on a building room with dimensions of 50 × 120 m, no walls, and with the four base stations of each operator equally spaced in the X-axis defined by the d and

b s_s p a c e

distances. Besides the four cells, forty stations are randomly distributed in this rectangular region with no mobility [7].

We use the ns-3 simulator [59] to model the Wi-Fi, the LTE-U and their coexistence. The ns-3 is an open-source C++ network simulator based on discrete event simulation for research and educational purposes, also complying with technical norms from standard organizations like IEEE, 3GPP and Wi-Fi Alliance. As part of a project funded by Wi-Fi Alliance, the Centre Tecnològic de Telecomunicaciones de Catalunya (CTTC) and the University of Washington cooperated and created a branch of this simulator with the system modeling and coexistence evaluation of LTE-U and Wi-Fi, named as ns3-lbt [60]. In this version, the implemented duty cycled transmission of LTE-U follows the approach presented in [18], which is based on LTE ABS functionality, where the ABS pattern is defined with the duration of 40 ms. Since each subframe of an LTE transmission lasts for 1 ms, one bitmask of 40 values is defined (for each subframe) to control the ON–OFF pattern. Both the core and discussed modifications of this simulator are found on http://code.nsnam.org/laa/ns-3-lbt/file/9529febb7ebc.

Since the evaluation scenario presents a challenging interference profile, simulations were performed for different offered data rates for each duty cycle to map possible patterns. Following this approach, the DC value used is in the range of 0.2 to 0.9 with a granularity 0.1. The offered data rate follows the full buffer scheme, i.e., user datagram protocol (UDP) transmission with a constant bit rate, since we want to analyze the situation where there is always data to transmit, thus allowing an intense dispute to access the medium. This UDP offered data rate represents the available data rate capacity for each station, and a variable called UDPRate controls it. The parameters used on the simulation, for both LTE-U and Wi-Fi, are presented in Table 1. The simulation is set to run for 20 s since it is a sufficient duration that is possible to see the relation between the offered data rate and duty-cycle values in this scenario.

4.2. Preliminaries Results

Figure 4 presents the relation between the transmitted data rate (per operator) versus duty cycle value for

U D P R a t e = 2

Mbps. In theory, the expected transmitted data rate for each operator is the sum of its user’s rates (without interference), then it would be expected for each operator

20 \cdot 2 = 40

Mbps.

And as we can see, this theoretical data is nearly reached for each operator for some DC values, but never at the same time. For this behavior, the aggregated transmitted data rate (LTE-U + Wi-Fi) presents its maximum only for DC 0.6, even though this DC value does not present the best possible data rate for Wi-Fi. This duty cycle value indicates that for this configuration and the offered data rate, the system reaches its maximum rate with LTE-U having more channel access time than Wi-Fi. Furthermore, the LTE-U’s data rates for the DC values from 0.6 and on do not present meaningful variations, indicating that for this UDPRate as the DC value keeps increasing, the most affect access technology is the Wi-Fi. The same pattern also occurs for Wi-Fi for DC values less than 0.5.

Figure 5 is the similar to Figure 4, but now with

U D P R a t e = 4

Mbps. For this offered data rate, the maximum aggregated transmitted data rate is reached for a DC value of 0.4., indicating that the system, for this situation, reaches its maximum with Wi-Fi having more channel access time than LTE-U. Furthermore, for this UDPRate, the transmitted data rate for each operator is more affected by the DC value. While for

U D P R a t e = 2

Mbps, and duty cycle values bigger than 0.6, the LTE-U’s transmitted rate slightly changes, now for

U D P R a t e = 4

Mbps, the data rates keep increasing with the DC value, and the Wi-Fi ones keep decreasing.

As a general result for the multi-cell environment, the duty cycle value has a significant influence on the transmitted data rates combined with the UDPRate used. This pattern also occurs in other scenarios, and different data rates, as presented in [10] for a single-cell scenario, showing a general pattern in this LTE-U/Wi-Fi coexistence problem. This problem also gets more complicated if the UDPRate changes over time, as the best DC value also changes per each offered data rate. Then, for setting the goal for maximizing the system throughput in a dynamic offered data rate scenario, it is necessary to use a tool for automatically choosing the optimal DC value on the fly. The next section presents our proposed reinforcement learning framework based on Q-Learning to address this problem.

5. Proposed Reinforcement Learning Framework

If we set the goal for maximizing the aggregated transmitted data rate as the time passes, and the offered data rate changes, a new DC value must be chosen to keep the performance on the maximum point, then bringing the problem to a sort of sequential decision-making problem. In this decision-making problem, each new chosen duty cycle value leads the system to a new state (a new value of aggregated transmitted data rate) that can change if the offered data rate also changes. So, for choosing the best duty cycle value for each case, the correlated decision-making problem must be solved. We address the Q-Learning as a solution to this problem.

5.1. Q-Learning

Q-Learning is a reinforcement learning algorithm proposed in [61] that combines dynamic programming and Monte Carlo concepts to solve sequential decision-making problems in the form of Markov decision processes (MDP) [62]. In MDP, we have a decision-maker that is called an agent, and all other things are called the environment. As the agent interacts with the environment, it chooses actions, and in turn, the environment responds to these actions in the form of observations or new situations. This observation returned from the environment are called rewards and are numerical values that the agent uses to achieve a goal, i.e., maximizing the sum of all received rewards. Figure 6 presents the relationship of these concepts in a diagram.

Practically all reinforcement learning algorithms involve the estimation of a value function [62]. The value function is a map from state–action pairs to a value that estimates how good or bad is a given state for the agent to be in. This notion of “how good” is defined in terms of the immediate and all expected future rewards that an agent receives as the learning process occurs, and since the future rewards depend on actions taken, a track of actions defines a policy

π

. Formally, a policy is a mapping from states to actions, and it is changed as a result of the agent’s learning process.

If at time t the action selected is a, the system is in state s under policy

π

, then it is defined the action-value function for policy

π

as

q_{π} (a, s) ≐ E_{π} [\sum_{k = 0}^{\infty} γ^{k} R_{t + k + 1} | S_{t} = s, A_{t = a}],

where

E

is the expected value, and

γ

is the discount factor controlling the relation between immediate and future rewards. Solving this reinforcement learning task means finding a policy that returns the maximum possible reward over the long run. There is always an optimal policy that is better or equal to the other policies [63] denoted

π_{*}

. Then, the optimal value for this optimal policy is defined as

q_{*} (a, s) ≐ \underset{π}{m a x} q_{π} (s, a) .

However, solving the reinforcement learning task for

q_{*}

is not trivial, since it requires the transition probabilities associated with the states and actions taken. To solve this problem, Q-Learning defines a Q-value that approximates

q_{*}

and solves it independent of the policy being followed. This Q-value is defined by

Q (S_{t}, A_{t}) \overset{}{\leftarrow} Q (S_{t}, A_{t}) + α [R_{t + 1} + γ \underset{a}{m a x} Q (S_{t + 1}, a) - Q (S_{t}, A_{t})],

(1)

where

S_{t}

is the set of states;

A_{t}

is the set of actions;

α

is the learning rate, indicates how much of the learning is taken into account to update the Q-value;

γ

is a parameter that controls how much the future rewards influences to update Q. If

γ

is set to zero, the agents learn to take into account only the immediate rewards, while if

γ

is to set to 1, the agent learns using only the possible future rewards. Both

α

and

γ

are in the range 0 to 1. The learning task of the agent is represented by updating the relation of Equation (1) for every state it reaches paired with the corresponding action. As presented in [61,62], the only requisite for convergence is all pairs (s, a) continue to be updated. Under this assumption, as each pair is visited several times and the corresponding Q value is updated, the algorithm converges with probability 1 to the optimal

q_{*}

[61,62].

5.2. Proposed Framework

In our work, we propose a centralized, coordinated algorithm to maximize the system aggregated transmitted data rate, and in the basis of Q-Learning we formulate the sets of states

S

and actions

A

as follows:

The duty cycle values that can be chosen as actions are $A = {0.2, 0.4, 0.6, 0.8}$ . These duty cycle values are chosen in a coordinated way. The node running the Q-Learning algorithm uses system information to choose the best DC value that is therefore set to all LTE-U cells;
The states represented by the aggregated transmitted data rate, $S = {0, 1, 2, 3}$ , are:
- State 0: $0 < T_{t x w i f i} + T_{t x l t e u} \leq \frac{1 * M}{4}$ Mbps;
- State 1: $\frac{1 * M}{4} < T_{t x w i f i} + T_{t x l t e u} \leq \frac{2 * M}{4}$ Mbps;
- State 2: $\frac{2 * M}{4} < T_{t x w i f i} + T_{t x l t e u} \leq \frac{3 * M}{4}$ Mbps;
- State 3: $\frac{3 * M}{4} < T_{t x w i f i} + T_{t x l t e u} \leq M$ Mbps;
where $T_{t x w i f i}$ is the sum of the transmitted data rate of all four Wi-Fi cells, and $T_{t x l t e u}$ is the sum of all four LTE-U cells. Then, the proposed algorithm works on a centralized way operating only on the operator data rate, LTE-U or Wi-Fi, and not on each cell. In this way, the state formulation can be simplified, since the algorithm operates on two operators and not on four cells. Thus, the number of states can be decreased to the only four states presented. Furthermore, defining the algorithm to work in a centralized operation also softens issues related to energy consumption since the algorithm will be operating in one single node in a coordinated way (or in a cloud server);
The reward for each DC value chosen is set to be $T_{t x w i f i} + T_{t x l t e u}$ that will be the aggregated transmitted data rate the system will reach for that DC. The general goal is to minimize the cost function $C = T_{t a r g e t} - (T_{t x w i f i} + T_{t x l t e u})$ that can be also viewed as the maximization of the reward $T_{t x w i f i} + T_{t x l t e u}$ over time to fulfill the goal. $T_{t a r g e t}$ is the desired aggregated transmitted data rate and it is set as a parameter.
The goal is set to maximize the system aggregated transmitted data rate.

The choice of these DC values is based on, firstly, the trade-off between computational complexity and the number of actions. This is intrinsic to the Q-Learning algorithm since the higher the number of actions and states, the higher the time is taken by the algorithm to converge. The second point is that these values represent four situations, starting with Wi-Fi having more air time than LTE-U, for

D C = 0.2

, to the opposite situation of LTE-U having more air time than Wi-Fi,

D C = 0.8

. Thus, after convergence, the best action (DC value) for the current state will be determined by the optimal Q-value, indicating what percentage of air time for each technology is more suitable in that moment for achieving the optimum goal.

The intervals between states were chosen as equally spaced intervals from 0 to M, where M is the maximum aggregated transmitted data rate that the system can achieve. It depends on the parameters of each system and the coexisting environment. The value of M differentiates from

T_{t a r g e t}

, as M is the real maximum that the system can achieve by the parameters used without the proposed solution, while

T_{t a r g e t}

is the desired maximum we want to achieve using the proposed algorithm.

Since the proposed solution operates in a centralized and coordinated manner, the new duty cycle value chosen by the algorithm, in each step, must be sent to all LTE-U cells by the central node. Regarding the simulation, as we use a discrete event simulator, the parameter parsing is by functions and occurs in the updating duty cycle phase, incurring in no latency (and overhead) for sending the new duty cycle value from the central node. However, targeting the real deployments, this communication between nodes can be achieved by jointly utilizing the X2 interface and the X2AP services. Defined by 3GPP in [64], the X2 is the denomination of the interface that connects one eNodeB to another for exchanging signaling messages, and the X2AP is the related protocol. The X2AP provides several functions as mobility management, cell configuration update, error reporting and others, which can be used by our proposed Q-Learning mechanism for sending the updated DC value with controlled overhead. However, when it comes to the latency issue, additional studies must be performed considering that the latency using X2 interface is highly influenced by the used network topology, transport technology (e.g., optical fiber and copper-base) and capacity of the backhaul link, which can vary the latency from a hundred of microseconds to 20 ms [65,66].

Figure 7 shows a flowchart of the proposed solution. From a macro-perspective, the proposed algorithm has two main steps. The first one, at the LTE-U cells, updates the actual DC value and reports the reward (data rates) to the central Q-Learning node. The second step, at the central Q-Learning node, performs a Q-Learning cycle, yielding in a new DC. This new DC value is reported back to the LTE-U cells in which the step one is started again.

To complement, the pseudo-algorithm of the proposed centralized, the coordinated solution based on Q-Learning is also presented in the Algorithm 1.

Algorithm 1: Q-Learning application for dynamic duty-cycle selection in a multi-cell environment.

6. Proposed Solution Evaluation

A new simulation to evaluate the proposed Q-Learning algorithm was performed following the same basic parameters for Wi-Fi and LTE-U presented in Table 1, but with modifications regarding the offered data rate for each station and the total duration of the simulation. The proof-of-conception simulation was performed according to the following scheme:

The simulation has a duration of 250 s where the offered data rate changes at specific points as the simulation runs.
The offered data rate is uniformly picked from $T_{o f e r r e d}$ = { 500 kbps, 1 Mbps, 2 Mbps, 4 Mbps}.
The specific points in time where the offered data rate changes that occur are uniformly picked from an interval between 10 to 15 s that are then summed to the current time until it reaches the 250 s limit. In this way, during the 250 s simulation, there will be from 16 to 25 changes in data rate (total duration divided by minimum and maximum intervals).
The Q-Learning algorithm always operates at the end of the current ABS interval, which is 40 ms, choosing the DC value for the next ABS interval until it converges.
To gain statistical confidence, 100 snapshots/repetitions of the 250 s simulation were realized. In each of these snapshots, the stations are uniformly distributed in the scenario. Then, in each snapshot, the stations are in different positions generating different interference profile.
Regarding the states, M is set to 160 Mbps and $T_{t a r g e t} = M$ ;
Regarding the Q-Learning parameters, the values of $α$ and $γ$ were chosen as $0.3$ and $0.5$ , respectively. These values were the best ones when the algorithm converged in the test phase using the proposed solution.

Figure 8 presents a bar graph of the simulation results using some fixed DC values and with our proposed framework. With the proposed framework, the maximum aggregated transmitted data rate is increased when compared with all the duty cycle values without the solution. When compared with the best static results,

D C = 0.6

, and

D C = 0.7

, the Q-Learning increases the maximum aggregated with around 3 Mbps more than the static ones, corresponding to a 5% increase percentage. If we compare the lower UDPRate versus the Q-Learning gain, it can correspond to six times increasing (UDRate with 500 kbps). If we focus on each operator, the LTE-U and Wi-Fi transmitted data rates present gains over the ones for

D C = 0.6

. However, if we take the other best static DC value, 0.7, only the Wi-Fi presents gain over the static case, even though the system still reaches its maximum. Furthermore, neither the LTE-U data rate with Q-Learning does not reach the maximum it can achieve

D C = 0.9

, nor the Wi-Fi at

D C = 0.2

. Still, the goal was set to maximize the aggregated

T_{t x l t e u} + T_{t x w i f i}

. The algorithm chose the best combination of duty cycle values for each state the system had passed to reach the goal, which led to a combination of data rates for each operator that was not the maximum possible, but the necessary to the goal as it would be expected.

The results so far have only shown the aggregated transmitted data rate perspective, but the Q-Learning also indirectly operates on each user transmitted data rate. Figure 9 presents a comparison between the cumulative density function (CDF) of the user transmitted data rate for the Q-Learning and the best static duty cycle value 0.6. Observing the Q-Learning curves, the transmitted data rate of practically all the Wi-Fi users improved, while all LTE-U users maintained their performance. The successive duty cycle values chose by the Q-Learning algorithm led the system to a behavior where Wi-Fi users have more air time. Consequently, more data rate, than the static

D C = 0.6

, but maintaining the mean data rate of LTE-U users, improving the aggregated one. Furthermore, the gain of around 3 Mbps in the aggregated performance presented in Figure 8 is higher than the maximum user transmitted data rate for both Wi-Fi and LTE-U that is

2.7

Mbps. Still, since the maximum data rate is the same with or without the proposed solution, we conclude that the gain in the aggregated data rate is distributed all over the users with less data rate.

For the duty cycle value of 0.7, the CDF presents a different behavior, as shown in Figure 10. The Wi-Fi stations present even more gains with Q-Learning, despite the gain is comes with the cost of worse performance for the LTE-U stations. With Q-Learning, the 60th-percentile shows that when compared with the case of

D C = 0.7

, the Wi-Fi gains affect the LTE-U stations with fewer data rates. Furthermore, the gain of 3 Mbps presented at the system level, as opposed to the

D C = 0.6

case, is still distributed for practically all Wi-Fi stations. Still, the LTE-U stations with fewer data rates are severely affected.

Figure 11 and Figure 12 presents the comparison of the 10th and 90th percentile for each duty cycle value and the Q-Learning, respectively.

For the 10th-percentile, the Wi-Fi Q-Learning results are better than all duty cycles values higher than 0.5, while the LTE-U ones are better for the duty cycles values lower than 0.5. For the 90th percentile, the Wi-Fi results with the proposed solution are better than all DC values higher than 0.4, while the LTE-U results are better for the duty cycle values lower than 0.5. Still, in the 90th-percentile analysis, it can be seen that LTE-U values do not show much variation for high values of the duty cycle. These duty cycles does not significantly affect this percentile for LTE-U, but only the Wi-Fi.

7. Conclusions and Future Works

In this work, we propose and evaluate a centralized, coordinated reinforcement learning framework, based on the Q-Learning algorithm, to deal with a multi-cell coexistence scenario between LTE-U and Wi-Fi. This scenario, differently from other previous works, is more challenging since it comprises not only more cells, but also interference between the same and different access technologies. Investigative results showed that each duty cycle value leads the system to a different maximum data rate, which also depends on the offered data rate we provided for each user in the coexistence scenario.

The automatic duty cycle selection of our proposed solution is accomplished at a time window of 40 ms, reaching to a improved aggregated transmitted data rate. This gain is distributed all over the Wi-Fi users, with Q-Learning providing gains in both system and user transmitted data rate.

Regarding the multimedia IoT deployments, since our solution adopts a centralized and coordinated approach, the issues related to battery consumption are diminished because the additional energy consumption due to the Q-Learning algorithm is only applied to the central node that runs it. Furthermore, the benefits of our framework matches one of the goals of cellular broadband IoT deployments that is the use of data rates sufficient to transmit multimedia data while keeping a reasonable energy consumption. Then, in the trade-off between energy consumption and increasing data rate, our solution achieves improvements in data rate without cost of energy consumption for each terminal.

One drawback of our solution is that the way it was modeled, the interference between the same access technologies is not explicitly taken in account, even it is included in the

T_{t x w i f i}

and

T_{t x l t e u}

collected values due to the lost packets. If we also explicitly consider the interference measurement, the results could present even better results, since the algorithm would use this information to better perform in the choice of the duty cycle value. Then, we address this as an extension of our solution and a future work, when we also could explore modifications regarding QoS or a decentralized approach with minimum additional energy consumption.

Author Contributions

Investigation, J.M.d.C.N. and S.F.G.N.; methodology, V.A.d.S.J. and J.M.d.C.N.; project administration, V.A.d.S.J.; resources, V.A.d.S.J.; supervision, P.M.d.S. and V.A.d.S.J.; validation, V.A.d.S.J.; writing—original draft, J.M.d.C.N. and S.F.G.N.; Writing—review and editing, J.M.d.C.N., S.F.G.N., P.M.d.S. and V.A.d.S.J. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Acknowledgments

The proof of concept simulations provided by this paper was supported by the High-Performance Computing Center at UFRN (NPAD/UFRN).

Conflicts of Interest

The authors declare no conflict of interest.

References

Ericsson. Ericsson Mobility Report—June 2019. Available online: https://www.ericsson.com/49d1d9/assets/local/mobility-report/documents/2019/ericsson-mobility-report-june-2019.pdf (accessed on 5 December 2019).
3GPP. TR 36.300: LTE; Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Universal Terrestrial Radio Access Network (E-UTRAN); Overall description; Stage 2 (Release 13). 2016. Available online: https://www.etsi.org/deliver/etsi_ts/136300_136399/136300/14.03.00_60/ts_136300v140300p.pdf (accessed on 5 December 2019).
Alliance, L. LoRaWAN™ 1.0.3 Specification. 2018. Available online: https://lora-alliance.org/sites/default/files/2018-07/lorawan1.0.3.pdf (accessed on 26 January 2020).
SIGFOX. Sigfox Device ETSI Mode White Paper. 2018, pp. 1–12. Available online: https://support.sigfox.com/docs/sigfox-device-etsi-mode-white-paper (accessed on 26 January 2020).
LTE-U Forum. LTE-U Forum; Technical Report; Alcatel-Lucent, Ericsson, Qualcomm Technologies: San Diego, CA, USA, 2017. [Google Scholar]
Qualcomm. Multefire: LTE-Like Performance with Wi-Fi-Like Deployment Simplicity. Available online: https://www.qualcomm.com/invention/technologies/lte/multefire (accessed on 5 December 2019).
3GPP. TR 36.889: Feasibility Study on Licensed-Assisted Access to Unlicensed Spectrum (Release 13). 2015.
Alliance, M. About MulteFire Release 1.1 Specification. Available online: https://www.multefire.org/technology/specifications/ (accessed on 5 December 2019).
Perahia, E.; Stacey, R. Next Generation Wireless LANs: Throughput, Robustness, and Reliability in 802.11n, 1st ed.; Cambridge University Press: New York, NY, USA, 2008. [Google Scholar]
De Santana, P.M.; de Sousa, V.A., Jr.; Abinader, F.M.; de Castro Neto, J.M. DM-CSAT: A LTE-U/Wi-Fi coexistence solution based on reinforcement learning. Telecommun. Syst. 2019, 71, 615–626. [Google Scholar] [CrossRef]
Dama, S.; Kumar, A.; Kuchi, K. Performance Evaluation of LAA-LBT based LTE and WLANs Co-existence in Unlicensed Spectrum. In Proceedings of the 2015 IEEE Globecom Workshops (GC Wkshps), San Diego, CA, USA, 6–10 December 2015. [Google Scholar] [CrossRef]
De Santana, P.M.; de Lima Melo, V.D.; De Sousa, V.A. Performance of License Assisted Access Solutions Using ns-3. In Proceedings of the 2016 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 15–17 December 2016. [Google Scholar] [CrossRef]
Rupasinghe, N.; Guvenc, I. Licensed-Assisted Access for WiFi-LTE Coexistence in the Unlicensed Spectrum. In Proceedings of the Globecom Workshops Emerging Technologies for 5G Wireless Cellular Networks, Austin, TX, USA, 8–12 December 2014. [Google Scholar] [CrossRef]
Chen, B.; Chen, J.; Gao, Y.; Zhang, J. Coexistence of LTE-LAA and Wi-Fi on 5 GHz With Corresponding Deployment Scenarios: A Survey. IEEE Commun. Surv. Tutor. 2017. [Google Scholar] [CrossRef]
Qualcomm; Ericsson; Alcatel-Lucent. Qualcomm Research LTE in Unlicensed Spectrum: Harmonious Coexistence with Wi-Fi; Technical Report; Qualcomm: San Diego, CA, USA, 2014. [Google Scholar]
Zinno, S.; Di Stasi, G.; Avallone, S.; Ventre, G. On a fair coexistence of LTE and Wi-Fi in the unlicensed spectrum: A Survey. Comput. Commun. 2018, 115, 35–50. [Google Scholar] [CrossRef]
Abinader, F.M., Jr.; de Sousa, V.A., Jr.; Choudhury, S.; de S. Chaves, F.; Cavalcante, A.M.; Almeida, E.P.L.; Vieira, R.D.; Tuomaala, E.; Doppler, K. LTE/Wi-Fi Coexistence in 5 GHz ISM Spectrum: Issues, Solutions and Perspectives. Wirel. Pers. Commun. 2018, 99, 403–430. [Google Scholar] [CrossRef]
Almeida, E.; Cavalcante, A.M.; Paiva, R.C.D.; Chaves, F.S.; Abinader, F.M.; Vieira, R.D.; Choudhury, S.; Tuomaala, E.; Doppler, K. Enabling LTE/WiFi coexistence by LTE blank subframe allocation. In Proceedings of the 2013 IEEE International Conference on Communications (ICC), Budapest, Hungary, 9–13 June 2013. [Google Scholar] [CrossRef]
Bojovic, B.; Giupponi, L.; Ali, Z.; Miozzo, M. Evaluating Unlicensed LTE Technologies: LAA vs LTE-U. IEEE Access 2019, 7, 89714–89751. [Google Scholar] [CrossRef]
Alhulayil, M.; Lopez-Benitez, M. Coexistence Mechanisms for LTE and Wi-Fi Networks over Unlicensed Frequency Bands. In Proceedings of the 2018 11th International Symposium on Communication Systems, Networks and Digital Signal Processing, CSNDSP 2018, Budapest, Hungary, 18–20 July 2018; pp. 1–6. [Google Scholar] [CrossRef]
De Santana, P.M.; Neto, J.M.d.C.; de Sousa, V.A. Evaluation of LTE Unlicensed Solutions Using ns-3. In Proceedings of the 2017 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 14–16 December 2017; pp. 674–679. [Google Scholar] [CrossRef]
Tan, J.; Xiao, S.; Han, S.; Liang, Y.C.; Leung, V.C. QoS-Aware user association and resource allocation in LAA-LTE/WiFi coexistence systems. IEEE Trans. Wirel. Commun. 2019, 18, 2415–2430. [Google Scholar] [CrossRef]
Maule, M.; Moltchanov, D.; Kustarev, P.; Komarov, M.; Andreev, S.; Koucheryavy, Y. Delivering Fairness and QoS Guarantees for LTE/Wi-Fi Coexistence under LAA Operation. IEEE Access 2018, 6, 7359–7373. [Google Scholar] [CrossRef]
Mehrnoush, M.; Sathya, V.; Roy, S.; Ghosh, M. Analytical Modeling of Wi-Fi and LTE-LAA Coexistence: Throughput and Impact of Energy Detection Threshold. IEEE/ACM Trans. Netw. 2018, 26, 1990–2003. [Google Scholar] [CrossRef] [Green Version]
Zhang, Y.; Jiang, C.; Wang, J.; Han, Z.; Yuan, J.; Cao, J. Coalition Formation Game Based Access Point Selection for LTE-U and Wi-Fi Coexistence. IEEE Trans. Ind. Inform. 2018, 14, 2653–2665. [Google Scholar] [CrossRef]
Bairagi, A.K.; Tran, N.H.; Saad, W.; Hong, C.S. Bargaining game for effective coexistence between LTE-U and Wi-Fi systems. In Proceedings of the IEEE/IFIP Network Operations and Management Symposium: Cognitive Management in a Cyber World (NOMS 2018), Taipei, Taiwan, 23–27 April 2018; pp. 1–8. [Google Scholar] [CrossRef]
Ratasuk, R.; Uusitalo, M.A.; Mangalvedhe, N.; Sorri, A.; Iraji, S.; Wijting, C.; Box, P.O.; Group, F.N. License-Exempt LTE Deployment in Heterogeneous Network. In Proceedings of the 2012 International Symposium on Wireless Communication Systems (ISWCS), Paris, France, 28–31 August 2012; pp. 246–250. [Google Scholar] [CrossRef]
Li, Y.; Zheng, J.; Li, Q. Enhanced Listen-before-talk Scheme for Frequency Reuse of Licensed-assisted Access Using LTE. In Proceedings of the 2015 IEEE 26th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Hong Kong, China, 30 August–2 September 2015; pp. 1918–1923. [Google Scholar] [CrossRef]
Jia, B.; Tao, M. A Channel Sensing Based Design for LTE in Unlicensed Bands. In Proceedings of the 2015 IEEE International Conference on Communication Workshop (ICCW), London, UK, 8–12 June 2015; pp. 2332–2337. [Google Scholar] [CrossRef]
Bhorkar, A.; Ibars, C.; Papathanassiou, A.; Zong, P. Medium Access Design for LTE in Unlicensed Band. In Proceedings of the 2015 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), London, UK, 8–12 June 2015; pp. 369–373. [Google Scholar] [CrossRef]
Jeon, J.; Niu, H.; Li, Q.C.; Papathanassiou, A.; Wu, G. LTE in the Unlicensed Spectrum: Evaluating Coexistence Mechanisms. In Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), Austin, TX, USA, 8–12 December 2014; pp. 740–745. [Google Scholar] [CrossRef]
Study, A.; Power, T.; Channel, C.; Lte, U. How Loud to Talk and How Hard to Listen-Before-Talk in Unlicensed LTE. In Proceedings of the 2015 IEEE International Conference on Communication Workshop (ICCW), London, UK, 8–12 June 2015; pp. 2314–2319. [Google Scholar] [CrossRef]
Chaves, F.S.; Almeida, E.P.L.; Vieira, R.D.; Cavalcante, M.; Abinader, F.M., Jr. LTE UL Power Control for the Improvement of LTE/Wi-Fi Coexistence. In Proceedings of the IEEE 78th Vehicular Technology Conference (VTC Fall), Las Vegas, NV, USA, 2–5 September 2013; pp. 1–6. [Google Scholar] [CrossRef]
Teng, F.; Guo, D.; Member, S.; Honig, M.L. Sharing of Unlicensed Spectrum by Strategic Operators. IEEE J. Sel. Areas Commun. 2017, 35, 668–679. [Google Scholar] [CrossRef]
Xu, Y.; Wang, J.; Wu, Q.; Du, Z.; Shen, L.; Anpalagan, A. E NGINEERING FOR C OGNITIVE C ELLULAR S YSTEMS A Game-Theoretic Perspective on Self-Organizing Optimization for Cognitive Small Cells. IEEE Commun. Mag. 2015, 53, 100–108. [Google Scholar] [CrossRef] [Green Version]
Al-dulaimi, A.; Al-rubaye, S.; Ni, Q.; Sousa, E. 5G Communications Race: Pursuit of More Capacity Triggers LTE in Unlicensed Band. IEEE Veh. Technol. Mag. 2015, 10, 43–51. [Google Scholar] [CrossRef]
Song, H.; Fang, X. A Spectrum Etiquette Protocol and Interference Coordination for LTE in Unlicensed Bands ( LTE-U ). In Proceedings of the 2015 IEEE International Conference on Communication Workshop (ICCW), London, UK, 8–12 June 2015; pp. 2338–2343. [Google Scholar] [CrossRef]
Nihtil, T.; Tykhomyrov, V.; Alanen, O.; Uusitalo, M.A.; Sorri, A.; Moisio, M.; Iraji, S.; Ratasuk, R.; Mangalvedhe, N. System performance of LTE and IEEE 802.11 coexisting on a shared frequency band. In Proceedings of the 2013 IEEE Wireless Communications and Networking Conference (WCNC), Shanghai, China, 7–10 April 2013; pp. 1038–1043. [Google Scholar] [CrossRef]
Abinader, F.M.; Almeida, E.P.L.; Chaves, F.S.; Cavalcante, A.M.; Vieira, R.D.; Paiva, R.C.D.; Sobrinho, A.M.; Choudhury, S.; Tuomaala, E.; Doppler, K.; et al. Enabling the coexistence of LTE and Wi-Fi in unlicensed bands. IEEE Commun. Mag. 2014, 52, 54–61. [Google Scholar] [CrossRef]
Cano, C.; Leith, D.J. Coexistence of WiFi and LTE in Unlicensed Bands: A Proportional Fair Allocation Scheme. In Proceedings of the 2015 IEEE International Conference on Communication Workshop (ICCW), London, UK, 8–12 June 2015; pp. 2288–2293. [Google Scholar] [CrossRef]
Xing, M.; Peng, Y.; Xia, T.; Long, H.; Zheng, K. Adaptive Spectrum Sharing of LTE Co-existing with WLAN in Unlicensed Frequency Bands. In Proceedings of the 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), Glasgow, UK, 11–14 May 2015; pp. 1–5. [Google Scholar]
Choi, J.; Kim, E.; Chang, S. Dynamic Resource Adjustment for Coexistence of LAA and Wi-Fi in 5 GHz Unlicensed Bands. ETRI J. 2015, 37, 845–855. [Google Scholar] [CrossRef]
Ayoubi, S.; Limam, N.; Salahuddin, M.A.; Shahriar, N.; Boutaba, R.; Estrada-Solano, F.; Caicedo, O.M. Machine Learning for Cognitive Network Management. IEEE Commun. Mag. 2018, 56, 158–165. [Google Scholar] [CrossRef]
Zhou, F.; Lu, G.; Wen, M.; Liang, Y.C.; Chu, Z.; Wang, Y. Dynamic Spectrum Management via Machine Learning: State of the Art, Taxonomy, Challenges, and Open Research Issues. IEEE Netw. 2019, 33, 54–62. [Google Scholar] [CrossRef]
Galanopoulos, A.; Foukalas, F.; Tsiftsis, T.A. Efficient Coexistence of LTE With WiFi in the Licensed and Unlicensed Spectrum Aggregation. IEEE Trans. Cogn. Commun. Netw. 2016, 2, 129–140. [Google Scholar] [CrossRef]
Castane, A.; Perez-Romero, J.; Sallent, O. On the implementation of channel selection for LTE in unlicensed bands using Q-learning and Game Theory algorithms. In Proceedings of the 2017 13th International Wireless Communications and Mobile Computing Conference (IWCMC), Valencia, Spain, 26–30 June 2017; pp. 1096–1101. [Google Scholar] [CrossRef]
Rupasinghe, N. Reinforcement Learning for Licensed-Assisted Access of LTE in the Unlicensed Spectrum. In Proceedings of the 2015 IEEE Wireless Communications and Networking Conference (WCNC), New Orleans, LA, USA, 9–12 March 2015; pp. 1279–1284. [Google Scholar] [CrossRef]
Yu, Y.; Wang, T.; Liew, S.C. Deep-Reinforcement Learning Multiple Access for Heterogeneous Wireless Networks. IEEE J. Sel. Areas Commun. 2019, 37, 1277–1290. [Google Scholar] [CrossRef] [Green Version]
Maglogiannis, V.; Naudts, D.; Shahid, A.; Moerman, I. A Q-Learning Scheme for Fair Coexistence between LTE and Wi-Fi in Unlicensed Spectrum. IEEE Access 2018, 6, 27278–27293. [Google Scholar] [CrossRef]
Bajracharya, R.; Shrestha, R.; Kim, S.W. Q-Learning Based Fair and Efficient Coexistence of LTE in Unlicensed Band. Sensors 2019, 19, 2875. [Google Scholar] [CrossRef] [Green Version]
Schiller, J. Mobile Communications; Addison-Wesley: New York, NY, USA, 2003. [Google Scholar]
IEEE. IEEE 802.11: Wireless LANs. Available online: http://standards.ieee.org/about/get/802/802.11.html (accessed on 5 December 2019).
TGad. TGad D0.1 - PHY/MAC Complete Proposal Specification. Available online: http://www.ieee802.org/11/Reports/tgad_update.html (accessed on 5 December 2019).
Alliance, W.F. Wi-Fi CERTIFIED WiGig™: Wi-Fi^® expands to 60 GHz. Available online: https://www.wi-fi.org/discover-wi-fi/wi-fi-certified-wigig (accessed on 5 December 2019).
Park, M. IEEE 802.11ah: Sub-1-GHz license-exempt operation for the internet of things. Commun. Mag. IEEE 2015, 53, 145–151. [Google Scholar] [CrossRef]
Khorov, E.; Kiryanov, A.; Lyakhov, A.; Bianchi, G. A Tutorial on IEEE 802.11ax High Efficiency WLANs. IEEE Commun. Surv. Tutor. 2019, 21, 197–216. [Google Scholar] [CrossRef]
Kim, S.; Park, H.; Kim, J.; Ryu, K.; Cho, H. 11ax PAR Verification Through OFDMA. Available online: https://mentor.ieee.org/802.11/dcn/16/11-16-1363-00-00ax-11ax-par-verification-through-ofdma.ppt (accessed on 5 December 2019).
Qualcomm. HSPA Supplemental Downlink; Technical Report; Qualcomm: San Diego, CA, USA, 2014. [Google Scholar]
ns-3. ns-3 Repositories. Available online: http://code.nsnam.org/ (accessed on 5 December 2019).
ns-3. LTE LBT Wi-Fi Coexistence Module Documentation (Release ns-3-lbt rev. b48bfc33132b). Technical Report. 2016. Available online: https://www.nsnam.org/~tomh/ns-3-lbt-documents/lbt-wifi-coexistence.pdf (accessed on 5 December 2019).
Watkins, C.J.C.H.; Dayan, P. Q-learning. Mach. Learn. 1992, 8, 279–292. [Google Scholar] [CrossRef]
Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; Cambridge University Press: London, UK, 2018. [Google Scholar]
Watkins, C. Learning From Delayed Rewards. Ph.D. Thesis, King’s College, London, UK, May 1989. [Google Scholar]
3GPP. LTE; Evolved Universal Terrestrial Radio Access Network (E-UTRAN); X2 Application Protocol (X2AP); ETSI: Sophia Antipolis, France, 2014. [Google Scholar]
Lee, D.; Seo, H.; Clerckx, B.; Hardouin, E.; Mazzarese, D.; Nagata, S.; Sayana, K. Coordinated multipoint transmission and reception in LTE-advanced: Deployment scenarios and operational challenges. IEEE Commun. Mag. 2012, 50, 148–155. [Google Scholar] [CrossRef]
Lee, J.; Kim, Y.; Lee, H.; Ng, B.L.; Mazzarese, D.; Liu, J.; Xiao, W.; Zhou, Y. Coordinated multipoint transmission and reception in LTE-advanced systems. IEEE Commun. Mag. 2012, 50, 44–50. [Google Scholar] [CrossRef]

Figure 1. Coexistence flowchart of long-term evolution-unlicensed (LTE-U) adapted from [15].

Figure 2. LTE duty cycle approach.

Figure 3. Evaluation indoor scenario proposed by the 3rd generation partnership project (3GPP).

Figure 4. Data rate vs duty cycle (DC) for 2 Mbps of offered data rate per station.

Figure 5. Data rate vs DC for 4 Mbps of offered data rate per station.

Figure 6. Agent–Environment interaction in a Markov decision process (MDP) adapted from [62].

Figure 7. Proposed solution flowchart.

Figure 8. Aggregated transmitted data rate vs DC value including the proposed framework.

Figure 9. Cumulative density function (CDF) of transmitted data rate per user vs Q-Learning for

D C = 0.6

.

Figure 9. Cumulative density function (CDF) of transmitted data rate per user vs Q-Learning for

D C = 0.6

.

Figure 10. CDF of transmitted data rate per user vs Q-Learning for

D C = 0.7

.

Figure 10. CDF of transmitted data rate per user vs Q-Learning for

D C = 0.7

.

Figure 11. 10th-percentile user’s transmitted data rate.

Figure 12. 90th-percentile user’s transmitted data rate.

Table 1. Simulation parameters.

Wi-Fi parameter (802.11n-HT PHY/MAC)
Bandwidth	20 MHz
CCA—Energy Detection threshold	−62 dBm
CCA—Carrier sense threshold	−82 dBm
Bit Error Rate (BER) target	$10^{- 6}$
LTE parameters
Bandwidth	20 MHz
Packet scheduler	Proportional fair
ABS pattern duration	40 ms
Duty cycle values	{0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}
Common parameters
Tx power	−18 dBm
Traffic model	UDP full buffer
Mobility	Constant position
Scenario
d	5 m
bs_space	25 m
Number of LTE-U APs	4
Number of LTE-U stations	20
Number of Wi-Fi APs	4
Number of Wi-Fi stations	20
Path loss and Shadowing	ITU InH
Cell selection criteria	For Wi-Fi, AP with strongest RSS.
	For LTE-U, cell with the strongest RSRP.
UDPRate	{2 Mbps, 4 Mbps}.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

de C. Neto, J.M.; Neto, S.F.G.; M. de Santana, P.; de Sousa, V.A., Jr. Multi-Cell LTE-U/Wi-Fi Coexistence Evaluation Using a Reinforcement Learning Framework. Sensors 2020, 20, 1855. https://doi.org/10.3390/s20071855

AMA Style

de C. Neto JM, Neto SFG, M. de Santana P, de Sousa VA Jr. Multi-Cell LTE-U/Wi-Fi Coexistence Evaluation Using a Reinforcement Learning Framework. Sensors. 2020; 20(7):1855. https://doi.org/10.3390/s20071855

Chicago/Turabian Style

de C. Neto, José M., Sildolfo F. G. Neto, Pedro M. de Santana, and Vicente A. de Sousa, Jr. 2020. "Multi-Cell LTE-U/Wi-Fi Coexistence Evaluation Using a Reinforcement Learning Framework" Sensors 20, no. 7: 1855. https://doi.org/10.3390/s20071855

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Multi-Cell LTE-U/Wi-Fi Coexistence Evaluation Using a Reinforcement Learning Framework

Abstract

1. Introduction

2. Related Works

3. Wi-Fi and LTE-U MAC Layer

3.1. The 802.11 MAC Layer

3.2. The LTE-U Coexistence Mechanisms

4. System Model, Evaluation Scenario and Preliminary Results

4.1. System Model and Evaluation Scenario

4.2. Preliminaries Results

5. Proposed Reinforcement Learning Framework

5.1. Q-Learning

5.2. Proposed Framework

6. Proposed Solution Evaluation

7. Conclusions and Future Works

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI