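Intelligent Traffic Control: QMIX-Assisted Routing and Node-Vector-Based Intra-Domain QoS Routing
1. QMIX-Assisted Routing in Social-Based Delay-Tolerant Networks
1.1 Pseudocode of the QMIX Algorithm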
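QMIX is a cooperative multi-agent reinforcement learning (MARL) algorithm, applied here to routing in social-based delay-tolerant networks (DTNs). Its pseudocode is as follows: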
Algorithm 4.2 Cooperative MARL algorithm for social-based DTN
1: Initialize replay buffer
2: for episode = 1, M do
3:   Initialize network environment
4:   for step = 1, T do
5:     Each agent obtains its observation o
6:     Each agent executes its action a, obtaining a new observation o′ and reward r
7:     Store (o, a, r, o′) in the replay buffer
8:   end for
9:   for agent i = 1, N do
10:    Randomly sample a batch from the replay buffer
11:    Calculate $Q_i(\tau_i, a_i; \theta_i)$ and $\max_{a_i'} \bar{Q}_i(\tau_i', a_i'; \theta_i')$ by DRQN
12:  end for
13:  Input all