Deep Reinforcement Learning Control for Radar Detection and Tracking in Congested Spectral Environments
IEEE Transactions on Cognitive Communications and Networking (IF 7.4), Pub Date: 2020-12-01, DOI: 10.1109/tccn.2020.3019605
Charles E. Thornton , Mark A. Kozy , R. Michael Buehrer , Anthony F. Martone , Kelly D. Sherbondy

This work addresses dynamic non-cooperative coexistence between a cognitive pulsed radar and nearby communications systems by applying nonlinear value function approximation via deep reinforcement learning (Deep RL) to develop a policy for optimal radar performance. The radar learns to vary the bandwidth and center frequency of its linear frequency modulated (LFM) waveforms to mitigate interference with other systems for improved target detection performance while also sufficiently utilizing available frequency bands to achieve a fine range resolution. We demonstrate that this approach, based on the Deep Q-Learning (DQL) algorithm, enhances several radar performance metrics more effectively than policy iteration or sense-and-avoid (SAA) approaches in several realistic coexistence environments. The DQL-based approach is also extended to incorporate Double Q-learning and a recurrent neural network to form a Double Deep Recurrent Q-Network (DDRQN), which yields favorable performance and stability compared to DQL and policy iteration. The practicality of the proposed scheme is demonstrated through experiments performed on a software defined radar (SDRadar) prototype system. Experimental results indicate that the proposed Deep RL approach significantly improves radar detection performance in congested spectral environments compared to policy iteration and SAA.
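
Two ideas mentioned in the abstract can be illustrated with a small sketch: the link between LFM bandwidth and range resolution (the standard relation delta_R = c / (2B)), and the Double Q-learning target, in which one set of Q-values selects the next action and a second set evaluates it. The function names, the toy Q-values, and the discount factor below are assumptions chosen for illustration; they are not taken from the paper, and this is not the authors' implementation.

```python
C = 299_792_458.0  # speed of light in m/s


def range_resolution(bandwidth_hz):
    """Range resolution of an LFM pulse, delta_R = c / (2 * B), in meters."""
    return C / (2.0 * bandwidth_hz)


def double_q_target(reward, q_online_next, q_target_next, gamma=0.9, done=False):
    """Double Q-learning target for one transition.

    q_online_next and q_target_next are hypothetical lists of next-state
    Q-values from the online and target estimators. The online values pick
    the action; the target values score it.
    """
    if done:
        return float(reward)
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return float(reward + gamma * q_target_next[best_action])


if __name__ == "__main__":
    # A wider sweep gives a finer range resolution.
    for bandwidth in (20e6, 100e6):
        print(f"B = {bandwidth / 1e6:.0f} MHz -> {range_resolution(bandwidth):.2f} m")

    # Toy next-state Q-values (assumed numbers, illustration only).
    q_online = [1.0, 3.0, 2.0]
    q_target = [0.5, 1.5, 4.0]
    print(double_q_target(1.0, q_online, q_target))
```

In this toy case the online values select action 1, so the target uses the second estimator's value for that action rather than its own maximum. Decoupling selection from evaluation in this way is what Double Q-learning uses to reduce the overestimation that a single max over noisy Q-values can introduce.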

Updated: 2020-12-01