Information and Navigation College, Air Force Engineering University, Xi'an, Shaanxi 710077
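PENG Xiang, male, born in April 1998 in Datong, Shanxi. He is currently a Ph.D. candidate at the Information and Navigation College, Air Force Engineering University. His main research interests are communication countermeasures, reinforcement learning, and intelligent decision-making. E-mail: pengxiang0538@163.com
XU Hua, male, born in April 1976 in Yichang, Hubei. He is currently a professor and doctoral supervisor at the Information and Navigation College, Air Force Engineering University. His main research interests are communication countermeasures, blind signal processing, and intelligent decision-making. E-mail: 13720720010@139.com
JIANG Lei, male, born in June 1974 in Wuxi, Jiangsu. He is currently an associate professor and master's supervisor at the Information and Navigation College, Air Force Engineering University. His main research interests are communication countermeasures and wireless communication technology. E-mail: jleimail@126.com
ZHANG Yue, male, born in December 1989 in Xi'an, Shaanxi. He is currently a postdoctoral researcher, lecturer, and master's supervisor at the Information and Navigation College, Air Force Engineering University. His main research interests are deep learning, reinforcement learning, game theory, and communication resource allocation. E-mail: catchmeifyoucan@uestc.edu.cn
RAO Ning, male, born in August 1997 in Shangrao, Jiangxi. He is currently a Ph.D. candidate at the Information and Navigation College, Air Force Engineering University. His main research interests are communication countermeasures, reinforcement learning, and intelligent decision-making. E-mail: raoningmabma@163.com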
Received: 2022-04-12
Revised: 2022-08-23
Published in print: 2023-05-25
PENG Xiang, XU Hua, JIANG Lei, et al. A Dynamic Adaptive Jamming Power Allocation Method Based on Deep Reinforcement Learning[J]. ACTA ELECTRONICA SINICA, 2023, 51(05): 1223-1234. DOI: 10.12263/DZXB.20220391.
To address the resource waste and low jamming cost-effectiveness that traditional jamming power allocation methods incur when the target's strategy is unknown, this paper proposes a dynamic adaptive jamming power allocation method based on deep reinforcement learning. With the target's communication power and power control strategy completely unknown, the method takes the observations of spatially distributed reconnaissance nodes as a continuous state input and uses deep reinforcement learning to assist jamming power decisions; by effectively learning the target's strategy, it achieves adaptive, stable jamming. To further improve the algorithm's performance, a prioritized experience replay mechanism based on temporal-difference error and an adaptive exploration strategy are designed. Simulation results show that, with a jamming effect comparable to that of traditional jamming power allocation methods, the proposed method saves 42.5% of power resources and improves jamming cost-effectiveness, and that its success rate and power consumption are both better than those of the intelligent algorithms used for comparison.
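The abstract names a prioritized experience replay mechanism driven by temporal-difference (TD) error but does not give its details here. As a rough illustration only, the sketch below shows the widely used proportional variant, in which a transition's sampling probability grows with the magnitude of its TD error. The class name `PrioritizedReplayBuffer` and the parameters `alpha`, `beta`, and `eps` are assumptions made for this example and are not taken from the paper, whose actual mechanism may differ.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative, not the paper's code).

    Transition i is sampled with probability proportional to its stored
    priority, where priority_i = (|td_error_i| + eps) ** alpha.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha      # 0 gives uniform sampling, 1 gives fully prioritized
        self.eps = eps          # keeps zero-error transitions sampleable
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority, so they are
        # likely to be replayed before their TD error has been measured.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        n = len(self.data)
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias from non-uniform
        # sampling; they are scaled so the largest weight in the batch is 1.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update(self, idxs, td_errors):
        # Called after a learning step with the freshly computed TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

In a DQN-style training loop, one would call `sample` to draw a batch, scale each sample's loss by its returned weight, and then pass the new TD errors back through `update`. This list-based version costs O(n) per sample; larger buffers usually rely on a sum-tree instead.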