

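1. Shenyuan Honors College, Beihang University, Beijing 100191
2. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191
3. School of Artificial Intelligence, Beihang University, Beijing 100191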
Received: 26 January 2026,
Accepted: 24 February 2026,
Published: 25 March 2026
LI Zhongyang, CAO Xiaoke, CAI Yichen, et al. Dynamic Task Allocation Method for Heterogeneous Multi-Agent Systems Based on Graph Attention Networks[J]. Acta Electronica Sinica, 2026, 54(03): 927-937. DOI:10.12263/DZXB.20251257
Task allocation in heterogeneous multi-agent systems is one of the core problems in the multi-agent domain. It requires assigning heterogeneous agents with distinct capability types to tasks that demand multi-agent collaboration. Dynamic events in real-world applications, such as the arrival of new tasks and agent failures, further increase the complexity of the problem. Existing methods suffer from high computational cost, difficulty in modeling the complex dependencies between heterogeneous agents and tasks, and poor adaptive decision-making in dynamic scenarios. To address these limitations, this paper proposes a dynamic task allocation method for heterogeneous multi-agent systems based on graph attention networks. The method introduces a dynamic graph construction mechanism to model the complex interactions between heterogeneous agents and tasks, and represents dynamically changing scenarios through real-time updates of nodes and edges. An encoder-decoder architecture based on graph attention is also designed: assigning independent attention channels to different interaction edges decouples the feature semantics of heterogeneous nodes, and a pointer-based decoder matches capabilities to requirements and adapts to variable-length inputs. To overcome the sparse-reward problem in large-scale task allocation, the paper proposes a multi-stage curriculum learning strategy spanning two dimensions, task scale and environmental dynamics, which smooths the optimization landscape and guides the policy to converge progressively. Simulation results show that the proposed method maintains a 100% success rate in dynamic scenarios, reduces task completion time by 4% to 8% compared with learning-based baselines and by approximately 23% compared with the greedy algorithm, and still delivers millisecond-level decision speed and high-quality allocations in large-scale scenarios. These results verify the method's combined advantages in dynamic adaptability, scalability, and allocation quality.
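To make the pointer-style decoding idea concrete, the following is a minimal sketch and not the authors' implementation. It scores every candidate task against one agent's query vector, masks tasks the agent cannot contribute to, and normalises the rest with a softmax, so the output length follows the number of tasks currently in the scene. The function names, the feasibility rule (the agent must hold at least one capability the task requires), and all numeric values are assumptions introduced only for illustration.

```python
import math


def capability_mask(agent_caps, task_reqs):
    """Assumed feasibility rule: a task is selectable if the agent holds
    at least one capability type that the task requires."""
    return [
        any(a > 0 and r > 0 for a, r in zip(agent_caps, req))
        for req in task_reqs
    ]


def masked_pointer_probs(query, keys, feasible):
    """Pointer-style attention over a variable-length candidate list.

    Scaled dot-product scores are computed for feasible candidates only;
    infeasible candidates receive probability exactly 0.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if ok else None
        for key, ok in zip(keys, feasible)
    ]
    finite = [s for s in scores if s is not None]
    if not finite:
        raise ValueError("no feasible candidate to point at")
    top = max(finite)  # subtract the max for numerical stability
    exps = [0.0 if s is None else math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scene: one agent with capability type 0 only, three tasks.
agent_caps = [1, 0]
task_reqs = [[1, 0], [0, 2], [1, 1]]
mask = capability_mask(agent_caps, task_reqs)

# Hypothetical embeddings standing in for encoder outputs.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [5.0, 5.0], [0.0, 1.0]]
probs = masked_pointer_probs(query, keys, mask)
print(mask, [round(p, 3) for p in probs])
```

Because the decoder points at input positions instead of emitting a fixed-size output, adding a task or removing a failed agent only changes the length of `keys` and `feasible`; no layer has to be resized.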