Path Design and Resource Management for NOMA enhanced Indoor Intelligent Robots
arXiv - CS - Systems and Control. Pub Date: 2020-11-23, DOI: arxiv-2011.11745
Ruikang Zhong, Xiao Liu, Yuanwei Liu, Yue Chen, Xianbin Wang

A communication-enabled service framework for indoor intelligent robots (IRs) is proposed, in which the non-orthogonal multiple access (NOMA) technique is adopted to enable highly reliable communications. Together with the indoor channel model recently proposed by the International Telecommunication Union (ITU), a Lego modeling method is proposed that deterministically describes the indoor layout and channel state in order to construct a radio map. The resulting radio map is used as a virtual environment to train the reinforcement learning agent, which saves training time and hardware costs. Building on the proposed communication model, the motions of IRs that need to reach designated mission destinations and their corresponding downlink power allocation policy are jointly optimized to maximize the mission efficiency and communication reliability of the IRs. To solve this optimization problem, a novel reinforcement learning approach, the deep transfer deterministic policy gradient (DT-DPG) algorithm, is proposed. Simulation results demonstrate that: 1) with the aid of NOMA techniques, the communication reliability of IRs is effectively improved; 2) the radio map is qualified to serve as a virtual training environment, and its statistical channel state information improves training efficiency by about 30%; 3) the proposed DT-DPG algorithm outperforms the conventional deep deterministic policy gradient (DDPG) algorithm in terms of optimization performance, training time, and resistance to local optima.
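
The abstract does not state the rate expressions behind the downlink power allocation, so the following is only a minimal sketch of the textbook two-user downlink NOMA model with successive interference cancellation (SIC), not the authors' formulation. The function name, parameter names, and the two-user restriction are all assumptions made for illustration.

```python
import math


def noma_downlink_rates(total_power, gain_strong, gain_weak, weak_share, noise_power):
    """Achievable rates (bit/s/Hz) for a textbook two-user downlink NOMA pair.

    Illustrative assumptions (not taken from the paper):
      - gain_strong >= gain_weak are channel power gains |h|^2,
      - weak_share is the fraction of total_power given to the weak user,
      - the strong user removes the weak user's signal perfectly via SIC.
    """
    if not 0.0 < weak_share < 1.0:
        raise ValueError("weak_share must lie strictly between 0 and 1")
    if gain_strong < gain_weak:
        raise ValueError("gain_strong must be at least gain_weak")

    p_weak = weak_share * total_power
    p_strong = (1.0 - weak_share) * total_power

    # Weak user decodes its own signal, treating the strong user's as interference.
    sinr_weak = p_weak * gain_weak / (p_strong * gain_weak + noise_power)

    # Strong user cancels the weak user's signal first, so only noise remains.
    snr_strong = p_strong * gain_strong / noise_power

    return math.log2(1.0 + snr_strong), math.log2(1.0 + sinr_weak)


if __name__ == "__main__":
    rate_strong, rate_weak = noma_downlink_rates(
        total_power=1.0, gain_strong=10.0, gain_weak=1.0,
        weak_share=0.8, noise_power=1.0,
    )
    print(f"strong user: {rate_strong:.3f} bit/s/Hz, weak user: {rate_weak:.3f} bit/s/Hz")
```

In this simplified model, raising `weak_share` increases the weak user's rate at the expense of the strong user's, which is the kind of trade-off a power allocation policy has to balance.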

Updated: 2020-11-25