[1] WANG S F, BAO Y F, LI Y. The architecture and technology of cognitive electronic warfare[J]. Scientia Sinica Informationis, 2018, 48(12): 1603-1613, 1709. (in Chinese)
[2] BAYRAM S, VANLI N D, DULEK B, et al. Optimum power allocation for average power constrained jammers in the presence of non-Gaussian noise[J]. IEEE Communications Letters, 2012, 16(8): 1153-1156.
[3] XU C, SHENG M, WANG X J, et al. Distributed subchannel allocation for interference mitigation in OFDMA femtocells: A utility-based learning approach[J]. IEEE Transactions on Vehicular Technology, 2015, 64(6): 2463-2475.
[4] GOMADAM K, CADAMBE V R, JAFAR S A. Approaching the capacity of wireless networks through distributed interference alignment[C]//2008 IEEE Global Telecommunications Conference. New Orleans: IEEE, 2008: 1-6.
[5] AMURU S, TEKIN C, VAN DER SCHAAR M, et al. Jamming bandits—A novel learning method for optimal jamming[J]. IEEE Transactions on Wireless Communications, 2016, 15(4): 2792-2808.
[6] ZHUANSUN S S, YANG J N, LIU H, et al. Jamming strategy learning based on positive reinforcement learning and orthogonal decomposition[J]. Systems Engineering and Electronics, 2018, 40(3): 518-525. (in Chinese)
[7] AMURU S, BUEHRER R M. Optimal jamming using delayed learning[C]//2014 IEEE Military Communications Conference. Baltimore: IEEE, 2014: 1528-1533.
[8] HUANG Z Q, QU Z W, ZHANG J, et al. End-to-end autonomous driving decision based on deep reinforcement learning[J]. Acta Electronica Sinica, 2020, 48(9): 1711-1719. (in Chinese)
[9] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[10] VINYALS O, BABUSCHKIN I, CZARNECKI W M, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning[J]. Nature, 2019, 575(7782): 350-354.
[11] CHEN S G, CHEN J M, ZHAO C X. Deep reinforcement learning based cloud-edge collaborative computation offloading mechanism[J]. Acta Electronica Sinica, 2021, 49(1): 157-166. (in Chinese)
[12] LI S, YAN Y H, REN J, et al. A sample-efficient actor-critic algorithm for recommendation diversification[J]. Chinese Journal of Electronics, 2020, 29(1): 89-96.
[13] YANG Q M, YU L, TIAN S W, et al. Anaphora resolution of Uyghur personal pronouns based on deep reinforcement learning[J]. Acta Electronica Sinica, 2020, 48(6): 1077-1083. (in Chinese)
[14] LUONG N C, HOANG D T, GONG S M, et al. Applications of deep reinforcement learning in communications and networking: A survey[J]. IEEE Communications Surveys & Tutorials, 2019, 21(4): 3133-3174.
[15] ZHAO D, QIN H, SONG B, et al. A graph convolutional network-based deep reinforcement learning approach for resource allocation in a cognitive radio network[J]. Sensors, 2020, 20(18): 5216-5239.
[16] WANG S X, LIU H P, GOMES P H, et al. Deep reinforcement learning for dynamic multichannel access in wireless networks[J]. IEEE Transactions on Cognitive Communications and Networking, 2018, 4(2): 257-265.
[17] XU Z Y, WANG Y Z, TANG J, et al. A deep reinforcement learning based framework for power-efficient resource allocation in cloud RANs[C]//2017 IEEE International Conference on Communications. Paris: IEEE, 2017: 1-6.
[18] GUO D L, TANG L, ZHANG X G, et al. Joint optimization of handover control and power allocation based on multi-agent deep reinforcement learning[J]. IEEE Transactions on Vehicular Technology, 2020, 69(11): 13124-13138.
[19] LIU T T, LUO Y N, YANG C Y. Distributed interference coordination based on multi-agent deep reinforcement learning[J]. Journal on Communications, 2020, 41(7): 38-48. (in Chinese)
[20] NASIR Y S, GUO D N. Multi-agent deep reinforcement learning for dynamic power allocation in wireless networks[J]. IEEE Journal on Selected Areas in Communications, 2019, 37(10): 2239-2250.
[21] ZHAO N, LIANG Y C, NIYATO D, et al. Deep reinforcement learning for user association and resource allocation in heterogeneous cellular networks[J]. IEEE Transactions on Wireless Communications, 2019, 18(11): 5141-5152.
[22] MENG F, CHEN P, WU L N, et al. Power allocation in multi-user cellular networks: Deep reinforcement learning approaches[J]. IEEE Transactions on Wireless Communications, 2020, 19(10): 6255-6267.
[23] ZHANG K Q, YANG Z R, BAŞAR T. Multi-agent reinforcement learning: A selective overview of theories and algorithms[M/OL]. [2021]. https://arxiv.org/abs/1911.10635.
[24] NGUYEN T T, NGUYEN N D, NAHAVANDI S. Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications[J]. IEEE Transactions on Cybernetics, 2020, 50(9): 3826-3839.
[25] FENG X P, LI P, YANG S Q. Principles of Communication Countermeasures[M]. Xi'an: Xidian University Press, 2009. (in Chinese)
[26] FOERSTER J, FARQUHAR G, AFOURAS T, et al. Counterfactual multi-agent policy gradients[C]//Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI). New Orleans: AAAI Press, 2018: 2974-2983.
[27] LOWE R, WU Y, TAMAR A, et al. Multi-agent actor-critic for mixed cooperative-competitive environments[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). Long Beach: Curran Associates, 2017: 6379-6390.
[28] HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]//Proceedings of the 35th International Conference on Machine Learning (ICML). Stockholm: IMLS, 2018: 1861-1870.
[29] HAARNOJA T, TANG H, ABBEEL P, et al. Reinforcement learning with deep energy-based policies[C]//Proceedings of the 34th International Conference on Machine Learning (ICML). Sydney: IMLS, 2017: 1352-1361.
[30] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[31] FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]//Proceedings of the 35th International Conference on Machine Learning (ICML). Stockholm: IMLS, 2018: 1587-1596.