Contention Window Optimization in IEEE 802.11ax Networks With Deep Reinforcement Learning

Abstract—The proper setting of contention window (CW) values has a significant impact on the efficiency of Wi-Fi networks. Unfortunately, the standard method used by 802.11 networks is not scalable enough to maintain stable throughput for an increasing number of stations, despite 802.11ax being designed to improve Wi-Fi performance in dense scenarios. To this end, we propose a new method of CW control which leverages deep reinforcement learning principles to learn the correct settings under different network conditions. Our method supports two trainable control algorithms, which, as we demonstrate through simulations, offer efficiency close to optimal while keeping computational cost low.

Index Terms—Contention window, IEEE 802.11, MAC protocols, reinforcement learning, WLAN

W. Wydmański and S. Szott are with AGH University, Krakow, Poland. This work was supported by the Polish Ministry of Science and Higher Education with the subvention funds of the Faculty of Computer Science, Electronics and Telecommunications of AGH University. This research was supported in part by PLGrid Infrastructure. The authors wish to thank Jakub Mojsiejuk for his remarks on an early draft of the paper.

I. INTRODUCTION

The latest IEEE 802.11 amendment (802.11ax) is scheduled for release in 2020, with the goal of increasing Wi-Fi network efficiency. However, to ensure backward compatibility, one efficiency-related aspect remains unchanged in 802.11ax: the basic channel access method [1]. This method is an implementation of carrier-sense multiple access with collision avoidance (CSMA/CA) wherein each station backs off by waiting a certain number of time slots before accessing the channel. This number is chosen at random from 0 to CW (the contention window). To reduce the probability of several stations selecting the same random number, CW is doubled after each collision. IEEE 802.11 defines static CW minimum and maximum values, and this approach, while being robust to network changes and requiring few computations, can lead to inefficient operation, especially in dense networks [2].

CW optimization has a direct impact on network performance and has been frequently studied (e.g., using control theory [3]). With the proliferation of network devices with high computational capabilities, CW optimization can now be analyzed using reinforcement learning (RL) [4]. RL is well-suited to the problem of improving the performance of wireless networks because it deals with intelligent software agents (network nodes) taking actions (e.g., optimizing parameters) in an environment (wireless radio) to maximize a reward (e.g., throughput) [4]. RL is an example of model-free policy optimization, offering better generalization capabilities than conventional, model-based optimization approaches such as control theory¹. Recent examples of applying RL to wireless local area networks include a jamming countermeasure [5] and an ML-enabling architecture [6]. RL performance can be further improved by using deep artificial neural networks with their potential for interpolation and superior scalability. A recent example of using deep RL (DRL) in wireless networks is an adaptable MAC protocol [7]. The authors of [8] also claim to use DRL in the area of CW optimization. However, a careful reading reveals that they use Q-learning (a typical RL method) but without the neural network (deep) component. Thus, we conclude that DRL has not yet been successfully applied to study IEEE 802.11 CW optimization.

¹This means that while RL algorithms try to directly learn an optimal policy without learning the model of the environment, model-based approaches need to make assumptions about the model's next state before choosing an action.

In this letter, we describe CCOD (Centralized Contention window Optimization with DRL), our proposed method of applying DRL to the task of optimizing saturation throughput of 802.11 networks by correctly predicting CW values. While CCOD is universally applicable to any 802.11 network, we exhibit its operation under 802.11ax using two DRL methods: Deep Q-Network (DQN) [9] and Deep Deterministic Policy Gradient (DDPG) [10]. The former is considered a showcase DRL algorithm, while the latter is a more advanced method, able to directly learn the optimal policy, which we expect will lead to increased network performance, especially in dense scenarios. Additionally, we demonstrate how we applied time series analysis to the recurrent neural networks of both DRL methods. Finally, we provide the complete source code so that the work can serve as a stepping stone for further development of DRL-based methods in 802.11 networks².

²https://github.com/wwydmanski/RLinWiFi

II. DRL BACKGROUND

In general, RL is based on interactions, in which the agent and environment exchange information regarding the state of the environment, the action the agent can take, and the reward given to the agent by the environment. Through a training process, the agent enhances its decision-making policy until it learns the best possible decision in every state of the environment that the agent can visit. In DRL, the agent's policy is based on a deep neural network which requires training. We consider two DRL methods differing in their action space: discrete (DQN) and continuous (DDPG).
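To make this interaction loop concrete, the following is a minimal Gym-style sketch in the spirit of ns3-gym [11]; the environment id, the placeholder RandomAgent, and the action encoding are illustrative assumptions rather than CCOD's actual interface.

import gym
import numpy as np
import ns3gym.ns3env  # noqa: F401  (registers the "ns3-v0" environment, see [11])

class RandomAgent:
    """Placeholder agent: picks a random CW exponent a in {0, ..., 6}."""
    def act(self, obs):
        return np.random.randint(0, 7)

    def step(self, obs, action, reward):
        pass  # a DRL agent would update its neural networks here

env = gym.make("ns3-v0")   # the ns-3 simulation exposed through the Gym interface
agent = RandomAgent()
obs = env.reset()
done = False
while not done:
    action = agent.act(obs)                     # map the observed state to an action
    obs, reward, done, info = env.step(action)  # advance one interaction period
    agent.step(obs, action, reward)             # learn from the received reward
env.close()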
DQN is based on Q-learning [4], which attempts to predict an expected reward for each action, making it an example of a value-based method. DQN's additional deep neural network
allows for more efficient extrapolation of rewards for yet unseen states than in basic Q-learning.

Conversely, DDPG is an example of a policy-based method, because it tries to learn the optimal policy directly. Additionally, it can produce unbounded continuous output, meaning that it can recognize that the action space is an ordered set (as in the case of CW optimization)³. DDPG comprises two neural networks: an actor and a critic. The actor makes decisions based on the environment state, while the critic is a DQN-like neural network that tries to learn the expected reward for the actor's actions.

³Discrete algorithms, like DQN, consider all possible actions as abstract alternatives.
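To make the actor and critic concrete (and to anticipate the recurrent state processing described in Section III), here is a minimal PyTorch sketch; the LSTM layer, the hidden size, and the output scaling are illustrative assumptions, not the exact architecture used in CCOD.

import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps the preprocessed state (a short time series of [mean, std] pairs)
    to a continuous CW exponent a in [0, 6]."""
    def __init__(self, state_dim: int = 2, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(state_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, state_seq):                 # state_seq: (batch, steps, state_dim)
        _, (h, _) = self.rnn(state_seq)
        return 6.0 * torch.sigmoid(self.head(h[-1]))

class Critic(nn.Module):
    """DQN-like network estimating the expected reward of a (state, action) pair."""
    def __init__(self, state_dim: int = 2, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(state_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden + 1, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, state_seq, action):         # action: (batch, 1)
        _, (h, _) = self.rnn(state_seq)
        return self.head(torch.cat([h[-1], action], dim=1))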
III. APPLYING DRL TO WI-FI

To apply DRL principles to Wi-Fi networks, we propose the CCOD method, which comprises an agent, the environment states, the available actions, and the received rewards. In summary, the CCOD agent is a module which observes the state of the Wi-Fi network and selects appropriate CW values (from the available actions) in order to maximize network performance (the reward).

The agent is located in the access point (AP) because the AP has a global view of the network, it can control its associated stations in a centralized manner (through beacon frames), and it can handle the computational requirements of DRL. Furthermore, a CCOD AP can potentially exchange information with other APs and become part of an SDN-based multi-agent Wi-Fi architecture [2].

We define the environment state as the current collision probability p_col observed in the network, calculated based on the number of transmitted frames N_tx and correctly received frames N_rx:

    p_col = (N_tx − N_rx) / N_tx.    (1)

The p_col measurements are done within predefined interaction periods and reflect the performance of the currently selected CW value. In practice, p_col is not immediately available to the agent, but since the AP takes part in all frame transmissions (as sender or recipient), the agent only needs to obtain N_tx from each station, which can be piggybacked onto data frames (N_rx is known at the AP based on the number of sent or received acknowledgement frames). Note that this overhead is required only in the learning phase (described below).

The action of the agent is to configure the AP by setting CW = 2^(a+4) − 1, where a ∈ [0, 6]. This range was chosen so that CW fits into the original span of 802.11 values: from 15 to 1023. We explore two algorithms with different outputs: discrete (a ∈ N) for DQN and continuous (a ∈ R) for DDPG.
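A minimal sketch of the state and action mappings described above; the helper names are ours, not taken from the CCOD source code.

def collision_probability(n_tx: int, n_rx: int) -> float:
    """Eq. (1): share of transmitted frames that were not correctly received."""
    return (n_tx - n_rx) / n_tx if n_tx > 0 else 0.0

def action_to_cw(a: float) -> int:
    """Map an agent action a in [0, 6] to a CW value in the 802.11 range 15..1023.
    DQN produces integer actions; DDPG produces real-valued ones."""
    a = min(max(float(a), 0.0), 6.0)
    return int(round(2 ** (a + 4))) - 1

assert action_to_cw(0) == 15 and action_to_cw(6) == 1023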
We use network throughput (the number of successfully delivered bits per second) as the reward in CCOD. This is indicative of current network performance and can be observed at the AP. Since rewards in DRL should be a real number between 0 and 1, we normalize the throughput based on the expected maximum throughput so that the rewards are centered around 0.5 (i.e., rewards above 0.5 indicate throughput exceeding expectations).
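One possible reward normalization consistent with this description (the exact scaling used by CCOD may differ) is:

def normalize(throughput_mbps: float, expected_mbps: float) -> float:
    """Map throughput to a reward in [0, 1] such that the expected maximum
    throughput yields 0.5; values above 0.5 then indicate better-than-expected
    performance. This scaling is an assumption, not necessarily CCOD's formula."""
    return float(min(max(0.5 * throughput_mbps / expected_mbps, 0.0), 1.0))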
CCOD operates in three phases. In the first, pre-learning phase, the Wi-Fi network is controlled by legacy 802.11. This serves as a warm-up for CCOD's DRL algorithms. Afterwards, in the learning phase, the agent makes decisions regarding the CW value following the TRAIN procedure of Algorithm 1. The preprocessing in the algorithm consists of calculating the mean and standard deviation of the history of recently observed collision probabilities H(p_col) (of length h) using a moving window of a fixed size and stride. This operation changes the data's shape from one- to two-dimensional (each step of the moving window yields two data points). The resulting collection can then be interpreted as a time series, which means it can be analysed by a recurrent neural network. Such networks allow for a more in-depth understanding of both the immediate and indirect relations between agent actions and network congestion compared to a one-dimensional analysis with a dense neural network.
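The preprocessing step could be sketched as follows; the window and stride values are illustrative, not those used in the actual implementation.

import numpy as np

def preprocess(history: np.ndarray, window: int = 300, stride: int = 30) -> np.ndarray:
    """Turn the 1-D history of collision probabilities H(p_col) into a 2-D time
    series: each moving-window step yields a (mean, std) pair."""
    steps = []
    for start in range(0, len(history) - window + 1, stride):
        chunk = history[start:start + window]
        steps.append((chunk.mean(), chunk.std()))
    return np.asarray(steps)   # shape: (num_steps, 2), ready for a recurrent network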
To enable exploration, each action is modified by a noise factor, which decays over the course of the learning phase. For DQN, noise is the probability of overriding the agent's action with a random action. For DDPG, noise is sampled from a Gaussian distribution and added to the decision of the agent.
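In code, the two exploration schemes could be sketched as follows; the noise magnitudes and decay rule are assumptions.

import numpy as np

rng = np.random.default_rng()

def dqn_explore(action: int, eps: float) -> int:
    """With probability eps, override the agent's discrete action with a random one."""
    return int(rng.integers(0, 7)) if rng.random() < eps else action

def ddpg_explore(action: float, sigma: float) -> float:
    """Add zero-mean Gaussian noise to the continuous action, keeping it in [0, 6]."""
    return float(np.clip(action + rng.normal(0.0, sigma), 0.0, 6.0))

# Both eps and sigma decay over the learning phase, e.g. multiplicatively after
# each episode: eps *= 0.99; sigma *= 0.99 (decay rates are illustrative).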
The final, operational phase starts after training completes, as determined by a user-set time limit. The agent is then considered fully trained and no longer receives any updates, so rewards are no longer needed. In this phase, CW is updated using the OPTIMIZE procedure of Algorithm 1. Once an agent is trained, it can be shared among APs.

Algorithm 1 CW optimization using CCOD
1: procedure TRAIN(H(p_col), load, a)
2:     ▷ load - data sent since last interaction
3:     ▷ a - previous action
4:     ▷ s - state
5:     s ← preprocess(H(p_col))
6:     r ← normalize(load)
7:     agent.step(s, a, r)        ▷ Train the neural network
8:     a′ ← agent.act(s) + noise
9:     CW ← 2^(a′+4)
10:    return CW
11: end procedure
12: procedure OPTIMIZE(H(p_col))  ▷ Observed collision prob.
13:    s ← preprocess(H(p_col))
14:    a ← agent.act(s)           ▷ Pass through neural network
15:    CW ← 2^(a+4)
16:    return CW
17: end procedure
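For reference, the two procedures of Algorithm 1 translate directly into Python, reusing the helper sketches above (agent.step and agent.act follow the interfaces assumed earlier; this is not the repository's code):

def train(agent, history, load, prev_action, noise_fn, expected_load):
    """One learning-phase interaction (TRAIN procedure of Algorithm 1)."""
    s = preprocess(history)              # mean/std time series of H(p_col)
    r = normalize(load, expected_load)   # reward from data sent since last interaction
    agent.step(s, prev_action, r)        # train the neural network(s)
    a = noise_fn(agent.act(s))           # add exploration noise (learning phase only)
    return action_to_cw(a), a            # new CW and the action to remember

def optimize(agent, history):
    """One operational-phase interaction (OPTIMIZE procedure of Algorithm 1)."""
    s = preprocess(history)
    a = agent.act(s)                     # a single pass through the trained network
    return action_to_cw(a)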
The application of DRL algorithms also requires configuring certain key parameters. First, the performance of RL algorithms depends on their reward discount γ, which corresponds to the importance of long-term rewards over immediate ones. Second, the introduction of deep learning into RL algorithms creates an impediment in the form of many new hyperparameters, so each neural network requires configuring a learning rate as an update coefficient. Third, since the learning is done
[Fig. 1: curves compare Standard 802.11, Look-up table, CCOD w/ DQN, and CCOD w/ DDPG.]
Fig. 1. Static scenario results: (a) network throughput, (b) mean CW values selected in each round (for 30 stations).
[Fig. 2 (dynamic scenario): (a) number of stations vs. simulation time [s]; (b) network throughput [Mb/s] and CW vs. simulation time [s]; (c) network throughput [Mb/s] vs. number of stations; curves compare Standard 802.11, Look-up table, CCOD w/ DQN, and CCOD w/ DDPG.]
instantaneous throughput (Fig. 2b). Standard 802.11 leads to a decrease of up to 28% of the network throughput as the number of stations increases. CCOD is able to maintain efficiency at a similar level: the decrease of throughput when moving from 5 to 50 stations is only about 1% for both DDPG and DQN. Ultimately, the operation of both CCOD algorithms in the dynamic scenario leads to improved network performance (Fig. 2c), both exceeding standard 802.11 and matching the look-up table approach.

VI. CONCLUSIONS

We have presented CCOD, a method which leverages deep reinforcement learning principles to learn the correct CW settings for 802.11ax under varying network conditions using two trainable control algorithms: DQN and DDPG. Our experiments have shown that DRL can be successfully applied to the problem of CW optimization: both algorithms offer efficiency close to optimal (with DDPG being only slightly better than DQN), while keeping the computational cost low (around 22 kflops, according to our estimations, excluding the one-time training cost). As a result of the learning process, we have obtained a trained agent which can be directly installed in an 802.11ax AP.

We conclude that the problem of CW optimization has provided the opportunity to showcase the features of DRL. Future studies should focus on analyzing more realistic network conditions, where we expect DRL to outperform any analytical model-based CW optimization methods, which are based on simplifying assumptions. Also worth investigating are other DRL algorithms as well as implementing a distributed version of CCOD.

REFERENCES

[1] B. Bellalta and K. Kosek-Szott, “AP-initiated multi-user transmissions in IEEE 802.11ax WLANs,” Ad Hoc Networks, vol. 85, pp. 145–159, 2019.
[2] P. Gallo, K. Kosek-Szott, S. Szott, and I. Tinnirello, “CADWAN: A Control Architecture for Dense WiFi Access Networks,” IEEE Communications Magazine, vol. 56, no. 1, pp. 194–201, 2018.
[3] P. Serrano et al., “Control theoretic optimization of 802.11 WLANs: Implementation and experimental evaluation,” Computer Networks, vol. 57, no. 1, pp. 258–272, 2013.
[4] C. Zhang, P. Patras, and H. Haddadi, “Deep learning in mobile and wireless networking: A survey,” IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2224–2287, 2019.
[5] F. Yao and L. Jia, “A collaborative multi-agent reinforcement learning anti-jamming algorithm in wireless networks,” IEEE Wireless Communications Letters, vol. 8, no. 4, pp. 1024–1027, 2019.
[6] F. Wilhelmi, S. Barrachina-Muñoz, B. Bellalta, C. Cano, A. Jonsson, and V. Ram, “A Flexible Machine Learning-Aware Architecture for Future WLANs,” arXiv preprint, http://arxiv.org/abs/1910.03510, 2019.
[7] Y. Yu, T. Wang, and S. C. Liew, “Deep-reinforcement learning multiple access for heterogeneous wireless networks,” IEEE Journal on Selected Areas in Communications, vol. 37, no. 6, pp. 1277–1290, 2019.
[8] R. Ali et al., “Deep Reinforcement Learning Paradigm for Performance Optimization of Channel Observation-Based MAC Protocols in Dense WLANs,” IEEE Access, vol. 7, pp. 3500–3511, 2019.
[9] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, p. 529, 2015.
[10] D. Silver et al., “Deterministic policy gradient algorithms,” in Proceedings of ICML’14, vol. 32, pp. I-387–I-395, 2014.
[11] P. Gawłowicz and A. Zubow, “ns-3 meets OpenAI Gym: The Playground for Machine Learning in Networking Research,” in ACM MSWiM, 2019.