Intelligent spectrum management based on reinforcement learning schemes in cooperative cognitive radio networks
Physical Communication (IF 2.2) Pub Date: 2020-10-17, DOI: 10.1016/j.phycom.2020.101226
Amandeep Kaur , Krishan Kumar

Cognitive Radio (CR) and cooperative communication are key technologies for efficiently utilizing available unused spectrum bands (called resources) to achieve a spectrally efficient system with high throughput. However, to realize its full potential, it is essential to empower the brain of the CR, the Cognitive Engine (CE), with machine learning algorithms that control its operation and adapt its parameters to the dynamic environment. In practical scenarios, though, it is difficult to formulate a network model beforehand due to complex network dynamics. To address this issue, Reinforcement Learning (RL) based schemes, a class of machine learning methods well suited to this setting, are proposed to empower the CE without forming an explicit network model. The proposed schemes for resource allocation, Comparison-based Cooperative Q-Learning (CCopQL) and Comparison-based Cooperative State-Action-Reward-(next) State-(next) Action (CCopSARSA), allow each CR to learn cooperatively. The cooperation among CRs takes the form of comparing and then exchanging Q-values to obtain an optimal policy. Although these schemes involve additional information exchange among CRs compared with independent Q-Learning and SARSA, they deliver improved system performance with high system throughput. Numerical results reveal the significant benefits of exploiting cooperation with RL: both proposed schemes outperform other existing schemes in terms of system throughput, and CCopSARSA and CCopQL converge faster than individual CR learning.
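The cooperative mechanism described in the abstract (each CR maintains its own Q-table, then CRs compare and exchange Q-values) can be illustrated with a short sketch. The following is a minimal, hypothetical Python illustration, assuming a toy channel-selection task, an invented channel-occupancy reward model, and an entry-wise maximum as the comparison rule; it is a sketch of the general idea, not the paper's exact algorithm. CCopSARSA would differ only in the update target, using the Q-value of the next action actually taken, r + γQ(s',a'), instead of the greedy maximum r + γ max_a' Q(s',a').

```python
import numpy as np

# Minimal sketch of comparison-based cooperative Q-learning, under assumed
# settings: N_AGENTS cognitive radios pick among N_CHANNELS channels; the
# state is the previously used channel. All hyperparameters and the reward
# model are illustrative assumptions.
N_AGENTS, N_CHANNELS, EPISODES = 3, 5, 500
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

rng = np.random.default_rng(0)
# One Q-table per CR: rows are states (last channel), columns are actions.
Q = [np.zeros((N_CHANNELS, N_CHANNELS)) for _ in range(N_AGENTS)]

def reward(channel: int) -> float:
    """Hypothetical reward: unit throughput if the chosen channel is idle."""
    idle = rng.random() > 0.3 * (channel + 1) / N_CHANNELS  # assumed occupancy
    return 1.0 if idle else 0.0

states = [0] * N_AGENTS
for _ in range(EPISODES):
    for i in range(N_AGENTS):
        s = states[i]
        # Epsilon-greedy action selection over the agent's own Q-table.
        if rng.random() < EPSILON:
            a = int(rng.integers(N_CHANNELS))
        else:
            a = int(np.argmax(Q[i][s]))
        r = reward(a)
        s_next = a
        # Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a)).
        Q[i][s, a] += ALPHA * (r + GAMMA * Q[i][s_next].max() - Q[i][s, a])
        states[i] = s_next
    # Cooperative step (assumed comparison rule): CRs compare Q-values
    # entry-wise and every agent adopts the maximum, so each benefits from
    # the best experience gathered anywhere in the network.
    Q_best = np.maximum.reduce(Q)
    Q = [Q_best.copy() for _ in range(N_AGENTS)]
```

Taking the entry-wise maximum is one simple way to realize "compare and then exchange Q-values"; the exchange cost it models is exactly the extra signaling overhead the abstract weighs against the throughput and convergence gains.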




Last updated: 2020-10-29