Article

greenMAC Protocol: A Q-Learning-Based Mechanism to Enhance Channel Reliability for WLAN Energy Savings

Rashid Ali, Muhammad Sohail, Alaa Omran Almagrabi, Arslan Musaddiq and Byung-Seo Kim
1 School of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Korea
2 Faculty of Engineering Science, Technology and Management, Ziauddin University, Karachi 74700, Pakistan
3 Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
4 Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si 38541, Korea
5 Department of Software and Communications Engineering, Hongik University, Sejong 30016, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2020, 9(10), 1720; https://doi.org/10.3390/electronics9101720
Submission received: 19 September 2020 / Revised: 13 October 2020 / Accepted: 14 October 2020 / Published: 19 October 2020
(This article belongs to the Section Networks)

Abstract

Wireless local area networks (WLANs) have seen widespread adoption in our day-to-day communication devices, such as handheld smartphones, tablets, and laptops. Energy preservation plays a vital role in WLAN communication networks, and the efficient use of energy remains one of the most substantial challenges for WLAN devices. Several approaches have been proposed by industrial and academic researchers to save energy and reduce the overall power consumption of WLAN devices, focusing on static/adaptive energy-saving methods. However, most of these approaches save energy at the cost of throughput degradation, due to either increased sleep time or a reduced number of transmissions. In this paper, we recognize the potential of reinforcement learning (RL) techniques, such as the Q-learning (QL) model, to enhance a WLAN's channel reliability for energy saving. QL is an RL technique that utilizes the accumulated reward of the actions performed in a state-action model. We propose a QL-based energy-saving MAC protocol, named the greenMAC protocol. The proposed greenMAC protocol reduces energy consumption by utilizing the accumulated reward value to optimize channel reliability, which reduces the channel collision probability of the network. We use the observed degree of channel congestion, expressed as a collision probability, as the reward function for our QL-based greenMAC protocol. The comparative results show that the greenMAC protocol achieves enhanced system throughput with additional energy savings compared to existing energy-saving mechanisms in WLANs.

1. Introduction

Recently, energy harvesting and saving have become vital subjects of interest for researchers working on wireless communication technologies. Essentially, a wireless local area network (WLAN) device (also referred to as a WLAN station, or STA) is expected to save energy while performing most of its important tasks, such as medium access control (MAC) layer channel access and resource allocation. Such techniques are especially needed when the STAs are low-power and energy-constrained. Energy-saving techniques extend the lifetime of an STA and make it self-sustainable. Moreover, these techniques help lower carbon dioxide emissions in the fight against climate change, and thus can also be classified as green technology [1].
The WLAN radio interface is a major source of energy consumption in an STA; for example, a Wi-Fi (such as IEEE 802.11n [2]) radio consumes over 70% of an STA's total energy in the screen-off state [3]. This share drops to 44.5% for the screen-off state and 50% for the screen-on state under the power saving mode (PSM) implemented by WLAN technologies [4]. In a WLAN STA, the wireless radio interface remains in one of the following states: transmit (TX), receive (RX), idle (IDL), or sleep (SLP). Most WLAN devices consume maximum energy in the active states (that is, TX and RX) and minimum energy in the SLP and IDL states [2]. However, in the IDL state, an STA needs to sense the channel continuously for the availability of resources, so a considerable amount of energy is consumed even in the IDL state. This happens in the carrier sense multiple access with collision avoidance (CSMA/CA) mechanism of IEEE 802.11, which underlies the distributed coordination function (DCF) in WLANs: each STA in the network must continually sense the channel for contention. PSM mechanisms [5,6,7] allow an STA to enter sleep mode by powering off its wireless radio interface when the STA is not engaged in transmission.
The MAC layer decides how STAs share the transmission medium in a WLAN and controls the activities of their radio interfaces. Thus, it plays a significant role in achieving high throughput, low delay, and energy efficiency. Current WLAN MAC layers are classified as either contention-free (CF) or contention-based (CB) [8]. CF schemes use predefined transmission slots to let STAs transmit without contention, while in CB schemes, an STA uses CSMA/CA to contend for the channel with the other STAs in the network. CB schemes are more adaptable and handle channel resources efficiently in a distributed way for sparse networks (a low density of STAs). However, in highly dense networks, the chances of collisions are high [9] due to increased channel contention. In WLANs, a collision is assumed if no acknowledgment (ACK) is received in response to a data packet. For every collision, the STA must re-sense the channel to perform a re-transmission, unnecessarily consuming channel bandwidth and the energy of both the sender and the receiver. Thus, an efficient and intelligent MAC layer channel access mechanism can limit the chances of transmission collisions, reducing both channel access delay and power consumption.
Motivated by Q-learning (QL), one of the prevailing reinforcement learning (RL) models [10], we propose a channel observation-based, energy-efficient PSM mechanism for WLANs, named the greenMAC protocol. QL is a behaviorist learning technique that learns from its environment through iterative interactions and exploits the accumulated experience. Figure 1 shows a typical QL-based intelligent STA that interacts with the wireless medium to learn its optimal actions. The observation-based channel collision probability reflects the density of the WLAN; that is, the higher the number of contenders, the higher the collision probability. The key contribution of the greenMAC protocol is to decide whether to enter SLP mode based on the channel collision probability.
The rest of the paper is organized as follows. Section 2 discusses related research work. In Section 3, we present our proposed QL-based PSM mechanism and a brief description of the QL model and its elements. Section 4 includes a performance evaluation of the proposed greenMAC protocol. Finally, we present our conclusion and future work in Section 5.

2. Related Research

The PSM mechanism has been enhanced for WLAN energy saving by distinguishing delay-sensitive data traffic, delay-tolerant data traffic [11], and priority-based data traffic [12]. Many researchers have investigated active/SLP mode scheduling that meets low-delay requirements [13,14,15,16], and several other techniques for improving the power performance of WLANs have been suggested.
Vukadinovic et al. [17] proposed a traffic-announcement approach that updates the standard PSM of ad hoc WLANs: when a data frame is transmitted over several hops, only the next-hop STA is informed of the pending frame; STAs on the remaining hops stay in doze mode, causing a long end-to-end (E2E) delay. Their scheme requires each STA along the routing path to forward a traffic announcement to its downstream neighbor. As a result, data frames can be transmitted over several hops within a single beacon cycle, and the E2E delay of the multi-hop transfer is significantly reduced.
Radwan et al. [18] proposed a solution in which STAs cooperate to exploit the strong channel capacity of short-range (SR) links to minimize transmission time and preserve energy. Neighboring STAs form a cluster, and a cluster head is chosen to relay data traffic. Instead of transmitting data directly to the access point (AP) over the long-range (LR) communication protocol, the STAs send their data to the cluster head using SR networking; the cluster head then relays the traffic to the AP over the LR link on behalf of the other STAs.
Tang et al. [19] provided a power-saving protocol that reduces the power consumption of APs. This strategy lets an AP enter the SLP state when there has been no traffic for a while; equipped wake-up transceivers relay wake-up signals to the AP. The strategy reduces the AP's power consumption by reducing the time spent in the IDL state. However, it requires installing new radios in the STAs, and additional methods are often required to manage the operation of these radios.
Lin et al. [20] point out that WLAN STAs also waste power in the IDL state during communication, as the STAs constantly sense the channel and overhear the ongoing transmissions of other STAs. Additionally, collisions between STAs that wake up for data retrieval at the same time cause further power loss. The authors proposed the DeepSleep scheme for energy-harvesting systems to improve the WLAN PSM, in which STAs short of power enter a long-term IDL state and afterwards access the channel with a higher priority.
In [21], He et al. present a TDMA-based MAC protocol to decrease contention among WLAN STAs. An AP divides a beacon interval (BI) into many equal-time slices and allocates the slices to single STAs or groups of STAs. Each STA then wakes up in its allocated time slot for data retrieval instead of contending for channel access. By eliminating channel contention, this approach effectively decreases the energy consumption of PSM devices. However, if a PSM device does not wake up in its time slot, the allocated channel time is wasted. Moreover, because all time slots have the same length regardless of frame length or traffic load, the allocated slots can be used ineffectively for short frames or light traffic.
Jung [22] proposed an improved PSM (IPSM) that adapts the size of the announcement traffic indication message (ATIM) window.
During the predefined ATIM window, when a certain number of idle channel slots are sensed, STAs may terminate the ATIM window and start transmitting data frames. Otherwise, if the current ATIM window is too short, STAs dynamically expand the window size by a given amount. While this protocol can efficiently enhance WLAN PSM performance, it does not address the hidden terminal problem. Lei et al. [23] suggest a reservation scheme for back-off counters (BCs), paired with a neighbor polling solution. The authors propose a BC reservation system that minimizes the risk of STAs randomly selecting the same BC. Under their proposal, a device that has successfully transmitted a control frame (an ATIM frame) reserves a BC through that frame and uses the reserved BC for its following data transmissions. Furthermore, the authors present a neighbor polling scheme to mitigate hidden-STA problems: because wireless transmission is broadcast in nature, STAs located within the transmitter's range will overhear an ongoing transmission colliding with a transmission from a hidden STA, and one of these neighbors can then poll the transmitter again so that the transmission can continue. However, a BC reservation scheme behaves similarly to a fixed channel access mechanism, such as TDMA.
Hence, from the above discussion of related research, we see that PSM-based MAC protocols can provide higher performance with low energy consumption. However, this performance enhancement is greatly affected by an increase in the number of STAs in the WLAN, which intensifies contention among the STAs and thus increases their channel sensing time.

3. Proposed QL-Based PSM Mechanism

3.1. Existing PSM

In the PSM mechanism, channel access and transmission time are divided into beacon intervals (BIs). At the start of each BI, the access point (AP) broadcasts a traffic indication message (TIM) to inform PSM STAs of data buffered for them. The STAs indicated in the TIM remain awake during the BI and poll the AP to receive their data packets. STAs that have data packets queued for transmission to the AP likewise remain awake and send their data packets to the AP during the BI [2]. Figure 2a shows the operation of the PSM mechanism in a WLAN, where the AP initiates the BI with the transmission of a TIM beacon. Once the TIM is successfully received, the STAs around the AP proceed to channel contention and transmit a PS-Poll message after observing a DCF interframe space (DIFS) idle period. During contention, the standard binary exponential back-off (BEB) mechanism is used. An STA recognizes its transmission as successful if an acknowledgment (ACK) message is received from the AP. After a successful PS-Poll transmission, the STA stays awake for data reception on the channel, as shown in the figure. Once the data are successfully received, the STA switches to SLP mode to save energy, while the AP may remain busy with other tasks, such as transmissions to other STAs in the WLAN.
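To make the sequence concrete, the following is a minimal sketch of one PSM beacon interval from the STA's perspective, under the steps described above. The sta object and all function names are hypothetical placeholders for illustration, not an ns-3 or 802.11 driver API.

```python
DIFS = 34e-6  # seconds; DIFS duration for the 5 GHz OFDM PHY (assumed)

def psm_beacon_interval(sta):
    """One PSM beacon interval, following Figure 2a (illustrative only)."""
    tim = sta.receive_tim_beacon()            # AP starts the BI with a TIM
    if not tim.indicates_buffered_data(sta):
        sta.sleep()                           # nothing pending: back to SLP
        return
    sta.wait_idle(DIFS)                       # observe a DIFS idle period
    sta.backoff_and_send("PS-Poll")           # contend using standard BEB
    if sta.wait_for_ack():
        sta.receive_data()                    # stay awake for the data frame
    sta.sleep()                               # save energy until the next TIM
```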

3.2. Q Learning Model: Environment and Elements

A QL model has three primary elements related to its environment: a strategy (policy), a reward function, and a Q-value function.

3.2.1. Strategy/Policy

A strategy (also known as a policy) describes the way a learning agent (an STA in our case) takes actions at a given time. A policy is a mapping from observed states to actions, capturing a set of action-response relationships. Generally, a policy is a mathematical function or a simple lookup table, though it may involve more complex computation, such as a pursuit process. A policy is the essence of a QL-enabled STA and is alone sufficient to determine its behavior.

3.2.2. Reward

A reward describes the aim of the QL agent (the STA). At each time step, the environment returns a feedback value, called the reward, for the action taken. The STA's core goal is to maximize the total reward collected from the environment. Thus, the reward is the fundamental driver for changing the policy at any state: if an action chosen by the policy brings a small reward, the policy may be changed to pick some other action in that state later on.

3.2.3. Q-Value Function

A reward reflects the immediate outcome of an action in a state, whereas the Q-value function indicates what is best in the long run by accumulating the rewards received for that action in that specific state over time. A state may have a high Q-value even though it yields a low immediate reward, if it regularly leads to high-reward states. Thus, the agent seeks the actions that move it to the highest-value states, not the highest-reward states.
QL always tries to find and follow an optimal policy for any finite Markov decision process (MDP), even when the environment model is unknown [10].

3.3. GreenMAC Protocol

An intelligent QL-based PSM mechanism built on a channel observation approach is used to resolve the energy deprivation caused by high contention in WLANs under the CSMA/CA of the conventional PSM mechanism. The proposed greenMAC protocol guarantees energy savings while preserving the throughput of the network. In the greenMAC protocol, the competing STAs observe the number of busy slots $S_{busy}^{i}$ among the observed back-off slots $B_{obs}^{i}$, that is, $B_{obs}^{i} = S_{busy}^{i} + S_{idle}^{i}$, as shown in Figure 2b. As we see in Figure 2a, back-off is performed at least twice when an STA is willing to transmit/receive data: once for the PS-Poll, and again for data packet transmission/reception. Therefore, an STA observes and measures the channel density probability $p_d$ every time it performs a back-off, as shown in Figure 2b. Hence, $p_d$ is determined as follows,
$$p_d = \frac{\sum_{i=1}^{n} S_{busy}^{i}}{\sum_{i=1}^{n} B_{obs}^{i}}, \quad i \in \{1, 2, 3, \ldots, n\}, \qquad (1)$$
where $n$ is the number of times the channel is observed (that is, the number of times the back-off procedure is performed). For example, as shown in Figure 2b, an STA randomly selects a back-off value $B = 12$ for its first back-off stage (that is, twelve idle slots, $S_{idle}^{i} = 12$) and $B = 9$ for its second (that is, nine idle slots, $S_{idle}^{i+1} = 9$). The STA observes three busy slots in the first back-off (that is, $S_{busy}^{i} = 3$) and two busy slots in the second (that is, $S_{busy}^{i+1} = 2$). The total number of observed slots equals the sum of all idle and busy slots during these two back-off stages (that is, $B_{obs} = S_{idle}^{i} + S_{idle}^{i+1} + S_{busy}^{i} + S_{busy}^{i+1}$). Thus, according to Equation (1), with $B_{obs} = 12 + 9 + 3 + 2 = 26$ and $S_{busy} = 3 + 2 = 5$, we obtain $p_d = 5/26 = 0.192$.
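As an illustration, here is a minimal Python sketch (not the authors' implementation) that computes the channel density probability of Equation (1) from per-back-off slot counts; the function name and list-based interface are our own assumptions.

```python
def channel_density(busy_slots, idle_slots):
    """Estimate p_d = sum(S_busy) / sum(B_obs) over n observed back-offs.

    busy_slots[i] and idle_slots[i] are the busy/idle slot counts seen
    during the i-th back-off stage, so B_obs_i = busy + idle.
    """
    total_busy = sum(busy_slots)
    total_observed = sum(b + s for b, s in zip(busy_slots, idle_slots))
    return total_busy / total_observed if total_observed else 0.0

# Two back-off stages from Figure 2b: B = 12 with 3 busy slots, B = 9 with 2.
p_d = channel_density(busy_slots=[3, 2], idle_slots=[12, 9])
print(f"p_d = {p_d:.3f}")  # 5 / 26 = 0.192
```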
The proposed QL-based mechanism treats the back-off index as the set of available states, $S = \{1, 2, 3, \ldots, n\}$, and the decision to sleep or stay awake as the action set $A = \{0, 1\}$ (where 0 denotes sleep mode and 1 denotes awake mode). At time $t$, an action $a_t$ is performed in a particular state $s_t$ to obtain a reward $R_t(s_t, a_t)$, with the aim of exploiting the accumulated Q-value function $Q_t(s_t, a_t)$. This Q-value is accumulated every time an STA performs the action and perceives the resulting reward. With action $a_t$, an STA moves from state $s_t$ to $s_{t+1}$. The QL-based mechanism aims to discover an optimal policy that maximizes the accumulated reward. The Q-value function $Q_t(s_t, a_t)$ is updated as follows [10]:
$$Q_t(s_t, a_t) = (1 - \alpha) \times Q_t(s_t, a_t) + \alpha \times \Delta Q_t(s_t, a_t). \qquad (2)$$
In Equation (2), $\alpha$ is the learning rate, defined over $0 < \alpha < 1$. The convergence of the QL-based algorithm depends on the learning estimate $\Delta Q_t(s_t, a_t)$, which is given by
$$\Delta Q_t(s_t, a_t) = \left\{ R_t(s_t, a_t) + \gamma \times \max_{a} Q_{t+1}(s_{t+1}, a_{t+1}) \right\} - Q_t(s_t, a_t), \qquad (3)$$
where $\gamma$ is the discount factor ($0 < \gamma < 1$) that determines the importance of the future Q-value, $Q_{t+1}(s_{t+1}, a_{t+1})$. In Equation (3), $\max_{a}$ denotes the exploitation (maximization) of the Q-value function with respect to action $a$.
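The following is a minimal sketch, under our own assumptions about table size and state encoding, of the update in Equations (2) and (3); we read the two equations together as the standard Q-learning rule $Q \leftarrow Q + \alpha \Delta Q$, which is the form implemented below.

```python
import numpy as np

N_STATES, N_ACTIONS = 32, 2          # assumed sizes: back-off states x {sleep, awake}
Q = np.zeros((N_STATES, N_ACTIONS))  # accumulated Q-value table

def update_q(Q, s, a, reward, s_next, alpha=0.9, gamma=0.9):
    # Equation (3): learning estimate (the temporal-difference error).
    delta_q = reward + gamma * Q[s_next].max() - Q[s, a]
    # Equations (2)-(3) combined, read as Q <- Q + alpha * delta_Q.
    Q[s, a] += alpha * delta_q
    return delta_q  # tracking this value shows convergence, as in Figure 4
```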
One of the key characteristics of a QL algorithm is maximizing the immediate reward through continuous exploitation (known as the greedy strategy). Exploiting more often is a reasonable default; however, the learning STA must sometimes explore the environment for changes (known as exploration, or the non-greedy strategy). The QL algorithm uses a probabilistic combination of the greedy and non-greedy strategies, known as the $\varepsilon$-greedy mechanism [10], which explores with probability $\varepsilon$ and exploits with probability $1 - \varepsilon$.
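A sketch of this $\varepsilon$-greedy selection, reusing the Q-table from the update sketch above (the function name is ours):

```python
import random

def select_action(Q, s, epsilon=0.5):
    # Explore a random action with probability epsilon (non-greedy)...
    if random.random() < epsilon:
        return random.randrange(Q.shape[1])
    # ...otherwise exploit the action with the highest Q-value (greedy).
    return int(Q[s].argmax())
```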
Finally, we consider the $p_d$ estimated over the performed back-offs as the reward of an action. Thus, the reward for action $a_t$ taken in state $s_t$ is given by
$$R_t(s_t, a_t) = p_d. \qquad (4)$$
The proposed greenMAC protocol defines a threshold value $T_{value}$. During exploitation, an STA checks whether the Q-value exceeds $T_{value}$ before going to sleep. A high Q-value indicates a high density of STAs in the WLAN; thus, the STA chooses to enter SLP mode more often to avoid collisions in the network. This results in decreased network collisions and reduced energy consumption. Figure 3 shows a flowchart of the functionality of our proposed greenMAC protocol.
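Putting the pieces together, the following hedged sketch shows the sleep decision during exploitation. The SLEEP/AWAKE encoding follows the action set above; the default threshold of 0.5 and the reading of "the Q-value" as the sleep action's Q-value are our assumptions, as the paper does not fix them numerically.

```python
SLEEP, AWAKE = 0, 1  # action encoding from the action set A = {0, 1}

def greenmac_decision(Q, s, t_value=0.5):
    # Assumption: the value compared with T_value is the accumulated
    # Q-value of the sleep action in the current state. A high value
    # signals a dense, collision-prone WLAN, so the STA sleeps more often.
    if Q[s, SLEEP] > t_value:
        return SLEEP
    return AWAKE
```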

4. Performance Evaluation

In this section, the performance of our proposed greenMAC protocol is evaluated with ns-3 (network simulator 3) simulations [24] using an IEEE 802.11ax WLAN model. Table 1 shows some of the important simulation parameters used in the evaluation. In our simulation environment, every STA measures the channel density probability $p_d$ by counting the transmissions on the channel during the back-off mechanism: the STA senses the transmissions of others and increments its $S_{busy}$ counter whenever the channel is found busy, while idle slots are counted toward $S_{idle}$ as the back-off counter decrements. Since most 802.11 WLANs exhibit limited mobility, the wireless channel in our experiments is assumed stationary. Network dynamics such as mobility and a changing number of STAs within the WLAN would be interesting to study as well, which we leave for future work.
We simulated 10 contending STAs over 20 learning episodes, varying $\alpha$ and $\gamma$ among a small value (0.2), a medium value (0.5), and a large value (0.9). For balanced exploration and exploitation, the probability $\varepsilon$ was set to 0.5. Figure 4a shows the convergence of the learning estimate ($\Delta Q$) from Equation (3) for different learning rates ($\alpha$) with the discount factor fixed at $\gamma = 0.9$. Similarly, Figure 4b shows the convergence of $\Delta Q$ for different discount factors ($\gamma$) with the learning rate fixed at $\alpha = 0.9$. The figures show that greater values of $\alpha$ and $\gamma$ make $\Delta Q$ converge faster.
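A toy sketch of this experiment, reusing update_q() and select_action() from the sketches in Section 3; the uniformly random reward merely stands in for the measured $p_d$, so the printed values are illustrative only.

```python
import random
import numpy as np

for alpha in (0.2, 0.5, 0.9):            # sweep the learning rate (Figure 4a)
    Q = np.zeros((N_STATES, N_ACTIONS))
    s = 0
    for episode in range(20):            # 20 learning episodes, as simulated
        a = select_action(Q, s, epsilon=0.5)
        reward = random.uniform(0.1, 0.3)        # stand-in for measured p_d
        s_next = (s + 1) % N_STATES
        delta_q = update_q(Q, s, a, reward, s_next, alpha=alpha, gamma=0.9)
        s = s_next
    print(f"alpha={alpha}: |delta_Q| after 20 episodes = {abs(delta_q):.4f}")
```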
Figure 5 presents the throughput (Mbps) comparison of the existing PSM mechanism and our proposed greenMAC protocol in a WLAN. The figure shows that the throughput of the WLAN decreases as the number of contending STAs increases, which is expected due to increased contention. Our proposed greenMAC protocol also enhances the throughput performance of the WLAN. Although this throughput gain is slight, it indicates that the proposed mechanism does not sacrifice performance for the sake of energy saving.
We evaluate the performance of our proposed greenMAC protocol against the existing PSM mechanism and two approaches from the related research: DeepSleep [20] and the BC-counter-based scheme [23]. The energy consumption of the STAs in a WLAN increases with the number of contending STAs due to the intense contention for data frame transmission. Therefore, in Figure 6 we compare the total energy consumed in the network. The figure illustrates that the existing PSM approach consumes more total energy than the other approaches (the proposed greenMAC, DeepSleep, and BC-counter). The enhanced energy savings of the greenMAC protocol in this figure show that a machine learning-based approach can learn the network environment and optimize the power-saving procedure. From Figure 5 and Figure 6 together, we observe that the existing PSM mechanism sleeps so often to save energy that its throughput decreases. In contrast, our proposed greenMAC protocol chooses to stay awake longer when the density of STAs in the WLAN is low and sleeps more often when the network is highly dense. The decision to stay awake or to sleep is driven by the QL mechanism, which allows an STA to converge on its channel observation-based collision probability, ultimately resulting in lower energy consumption in the network.

5. Conclusions and Future Work

The use of WLAN-enabled devices has increased dramatically in recent years. Despite the substantial and growing adoption of WLANs, the high energy consumption of WLAN-enabled devices (STAs) remains a highly critical challenge. Researchers from academia and industry have highlighted numerous weaknesses in the existing PSM mechanism and proposed many approaches to address this issue. Most of the proposed approaches to save energy and reduce the overall energy consumption of WLAN STAs focus on static/adaptive PSM mechanisms, addressing various issues and limitations concerning energy utilization and network performance degradation. In this paper, we recognize the potential of machine-learning techniques, such as the QL algorithm, to enhance a WLAN's channel reliability for energy saving. A QL-based energy-saving MAC protocol, called the greenMAC protocol, is proposed for this purpose. Our proposed greenMAC protocol mainly depends on the density of the WLAN environment, measured through channel observation-based collision probability, which serves as the reward function for the QL model. The proposed greenMAC protocol chooses whether to enter sleep mode based on the channel density probability, which reduces the channel collision probability of the network and helps save the STAs' energy. The comparative simulation results show that the greenMAC protocol achieves enhanced system throughput with additional energy savings compared to the existing PSM mechanism of WLANs and other related approaches.
For future work, we aim to extend our current QL-based protocol to dynamic WLAN environments and to enhance its potential there. We will specifically focus on evaluating the proposed mechanism in such environments and comparing it with existing PSM mechanisms. We also aim to evaluate the performance of the QL-based PSM mechanism in a delay-sensitive WLAN environment with QoS requirements.

Author Contributions

Conceptualization, R.A. and M.S.; methodology, R.A., M.S. and A.O.A.; validation, R.A., A.O.A. and B.-S.K.; formal analysis, R.A., A.M. and B.-S.K.; investigation, B.-S.K.; resources, R.A. and B.-S.K.; writing—original draft preparation, R.A.; writing—review and editing, A.M., A.O.A. and B.-S.K.; supervision, B.-S.K.; project administration, B.-S.K.; funding acquisition, B.-S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Electronics and Telecommunications Research Institute (ETRI) grant funded by the ICT R&D program of MSIT/IITP (number 2018-0-01411, A Micro-Service IoTWare Framework Technology Development for Ultra small IoT Device).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
QL	Q-learning
WLAN	Wireless Local Area Network
STA	Station
MAC	Medium Access Control
PSM	Power Saving Mode
TX	Transmit
RX	Receive
IDL	Idle
SLP	Sleep
CSMA/CA	Carrier Sense Multiple Access with Collision Avoidance
DCF	Distributed Coordination Function
CF	Contention-Free
CB	Contention-Based
ACK	Acknowledgment
RL	Reinforcement Learning
MDP	Markov Decision Process

References

  1. Ren, J.; Yue, S.; Zhang, D.; Zhang, Y.; Cao, J. Joint channel assignment and stochastic energy management for rf-powered ofdma wsns. IEEE Trans. Veh. Technol. 2019, 68, 1578–1592. [Google Scholar] [CrossRef]
  2. IEEE 802.11 WG. Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications; IEEE Std. 802.11n; The Institute of Electrical and Electronics Engineers: New York, NY, USA, 2012; pp. 1003–1005, 1432–1441. [Google Scholar]
  3. Pering, T.; Agarwal, Y.; Gupta, R.; Want, R. CoolSpots: Reducing the power consumption of wireless mobile devices with multiple radio interfaces. In Proceedings of the 4th International Conference on Mobile Systems, Applications and Services, Uppsala, Sweden, 19–22 June 2006; pp. 220–232. [Google Scholar]
  4. Malekshan, K.R.; Zhuang, W.; Lostanlen, Y. An Energy Efficient MAC Protocol for Fully Connected Wireless Ad Hoc Networks. IEEE Trans. Wirel. Commun. 2014, 13, 5729–5740. [Google Scholar] [CrossRef] [Green Version]
  5. Chen, B.; Jamieson, K.; Balakrishnan, H.; Morris, R. Span: An energy-efficient coordination algorithm for topology maintenance in Ad Hoc wireless networks. ACM Trans. Wirel. Netw. 2002, 8, 481–494. [Google Scholar] [CrossRef]
  6. Rodoplu, V.; Meng, T.H. Minimum energy mobile wireless networks. IEEE J. Sel. Areas Commun. 1999, 17, 1333–1344. [Google Scholar] [CrossRef] [Green Version]
  7. Jung, E.-S.; Vaidya, N.H. An energy efficient MAC protocol for wireless LANs. In Proceedings of the Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies, New York, NY, USA, 23–27 June 2002; pp. 1756–1764. [Google Scholar]
  8. Ali, R.; Kim, S.W.; Kim, B.; Park, Y. Design of MAC layer resource allocation schemes for IEEE 802.11ax: Future directions. IETE Tech. Rev. 2018, 35, 28–52. [Google Scholar] [CrossRef]
  9. Ali, R.; Shahin, N.; Bajracharya, R.; Kim, B.; Kim, S.W. A self-scrutinized back-off mechanism for IEEE 802.11ax in 5G unlicensed networks. Sustainability 2018, 10, 1201. [Google Scholar] [CrossRef] [Green Version]
  10. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 1998. [Google Scholar]
  11. Dogar, F.R.; Steenkiste, P.; Papagiannaki, K. Catnap: Exploiting high bandwidth wireless interfaces to save energy for mobile devices. In Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, Seoul, Korea, 17–21 June 2010; pp. 107–122. [Google Scholar]
  12. Rozner, E.; Navda, V.; Ramjee, R.; Rayanchu, S. NAPman: Network-assisted power management for Wi-Fi devices. In Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, San Francisco, CA, USA, 15–18 June 2010; pp. 91–106. [Google Scholar]
  13. Anand, M.; Nightingale, E.B.; Flinn, J. Self-tuning wireless network power management. Wirel. Netw. 2005, 11, 451–469. [Google Scholar] [CrossRef] [Green Version]
  14. Qiao, D.; Shin, K.G. Smart power-saving mode for IEEE 802.11 wireless LANs. In Proceedings of the IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Miami, FL, USA, 13–17 March 2005; pp. 1573–1583. [Google Scholar]
  15. Namboodiri, V.; Gao, L. Energy-efficient VoIP over wireless LANs. IEEE Trans. Mob. Comput. 2010, 9, 566–581. [Google Scholar] [CrossRef]
  16. Krashinsky, R.; Balakrishnan, H. Minimizing energy for wireless web access with bounded slowdown. Wirel. Netw. 2005, 11, 135–148. [Google Scholar] [CrossRef] [Green Version]
  17. Vukadinovic, V.; Glaropoulos, I.; Mangold, S. Enhanced power saving mode for low-latency communication in multi-hop 802.11 networks. Ad Hoc Netw. 2014, 23, 18–33. [Google Scholar] [CrossRef]
  18. Radwan, A.; Rodriguez, J. Energy saving in multi-standard mobile terminals through short-range cooperation. EURASIP J. Wirel. Commun. Netw. 2012, 2012, 159. [Google Scholar] [CrossRef] [Green Version]
  19. Tang, S.; Yomo, H.; Kondo, Y.; Obana, S. Wake-up receiver for radio-on-demand wireless LANs. EURASIP J. Wirel. Commun. Netw. 2012, 2012, 42. [Google Scholar] [CrossRef] [Green Version]
  20. Lin, H.H.; Shih, M.J.; Wei, H.Y.; Vannithamby, R. DeepSleep: IEEE 802.11 enhancement for energy-harvesting machine-to-machine communications. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Anaheim, CA, USA, 3–7 December 2012. [Google Scholar]
  21. He, Y.; Yuan, R. A novel scheduled power saving mechanism for 802.11 wireless LANs. IEEE Trans. Mob. Comput. 2009, 8, 1368–1383. [Google Scholar]
  22. Jung, E.-S. Improving IEEE 802.11 power saving mechanism. Wirel. Netw. 2007, 14, 375–391. [Google Scholar] [CrossRef]
  23. Lei, X.; Rhee, S.H. Improving the IEEE 802.11 power-saving mechanism in the presence of hidden terminals. EURASIP J. Wirel. Commun. Netw. 2016, 2016, 26. [Google Scholar] [CrossRef] [Green Version]
  24. The Network Simulator ns-3. Available online: https://www.nsnam.org/ (accessed on 5 March 2018).
Figure 1. A QL-based WLAN station (STA) interacting with its QL environment for learning.
Figure 2. (a) Existing PSM mechanism in a WLAN. (b) Channel observation-based collision probability measurement.
Figure 3. Flowchart to describe functionalities of greenMAC protocol.
Figure 4. (a) Learning estimate ($\Delta Q$) convergence for varying values of $\alpha$ ($\varepsilon = 0.5$), and (b) learning estimate ($\Delta Q$) convergence for varying values of $\gamma$ ($\varepsilon = 0.5$).
Figure 5. Throughput (Mbps) comparison of the proposed greenMAC protocol and the existing PSM mechanism.
Figure 6. Comparison of total energy consumption in joules (J) of the proposed greenMAC, existing PSM, DeepSleep, and BC-counter mechanisms in a network of thirty STAs.
Table 1. A few of the important simulation parameters.

Parameter	Value
WLAN standard	IEEE 802.11ax
Frequency	5 GHz
Modulation and Coding Scheme (MCS) number	6
Channel bandwidth	40 MHz
Data rate	154.9 Mbps
Simulation time	10/100 s
Guard interval (GI)	800 ns
Data payload	1472 bytes
Distance between AP and STA	1.0 m
Learning rate ($\alpha$)	0.2, 0.5, 0.9
Discount factor ($\gamma$)	0.2, 0.5, 0.9
Exploration/exploitation ($\varepsilon$)	0.5
