Dynamic Task Allocation Method for Heterogeneous Multi-Agent Systems Based on Graph Attention Networks

LI Zhongyang; CAO Xiaoke; CAI Yichen; SUN Guibin; LIU Kexin

doi:10.12263/DZXB.20251257

The Theory and Application of Swarm Intelligence Technology in the Information-Rich Era | Updated: 2026-06-16

- ACTA ELECTRONICA SINICA Vol. 54, Issue 3, Pages: 927-937(2026)
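- Author affiliations:
  
  1. Shen Yuan Honors College, Beihang University, Beijing 100191
  2. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191
  3. School of Artificial Intelligence, Beihang University, Beijing 100191
- Funding:
  
  National Natural Science Foundation of China (62503028; 62373019); Beijing Natural Science Foundation (QY25228)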
- DOI: 10.12263/DZXB.20251257
  CLC: TP18; TP242
- Received: 26 January 2026; Accepted: 24 February 2026; Published: 25 March 2026

LI Zhongyang, CAO Xiaoke, CAI Yichen, et al. Dynamic Task Allocation Method for Heterogeneous Multi-Agent Systems Based on Graph Attention Networks[J]. Acta Electronica Sinica, 2026, 54(03): 927-937. DOI: 10.12263/DZXB.20251257

Abstract

Task allocation in heterogeneous multi-agent systems is one of the core problems in the multi-agent domain. It requires allocating heterogeneous agents with distinct capability types to tasks that demand multi-agent collaboration. Dynamic events in real-world applications, such as the arrival of new tasks and agent failures, further increase the complexity of the problem. Existing methods suffer from high computational cost, difficulty in modeling the complex dependencies between heterogeneous agents and tasks, and poor adaptive decision-making in dynamic scenarios. To address these limitations, this paper proposes a dynamic task allocation method for heterogeneous multi-agent systems based on graph attention networks. The method introduces a dynamic graph construction mechanism to model the complex interactions between heterogeneous agents and tasks, and represents dynamically changing scenarios through real-time updates of nodes and edges. An encoder-decoder architecture based on the graph attention mechanism is also designed: assigning independent attention channels to different edges decouples the feature semantics of heterogeneous nodes, and a pointer-based decoder matches capabilities to requirements and adapts to variable-length inputs. To overcome the sparse-reward problem in large-scale task allocation, the paper proposes a multi-stage curriculum learning strategy spanning two dimensions, task scale and environmental dynamics, which guides the policy to converge progressively by smoothing the optimization landscape. Simulation results show that the proposed method maintains a 100% success rate in dynamic scenarios, reduces task completion time by 4% to 8% relative to learning-based baselines and by approximately 23% relative to the greedy algorithm, and retains millisecond-level decision speed and high-quality allocations in large-scale scenarios. These results verify the method's combined advantages in dynamic adaptability, scalability, and allocation quality.
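Two ideas in the abstract can be illustrated with a toy sketch: a graph whose nodes and edges are updated as tasks arrive and agents fail, and a pointer-style selection step that scores however many candidate tasks are currently feasible. The code below is not the authors' implementation. The names `DynamicTaskGraph` and `pointer_probs`, the rule that an edge exists when an agent's capability type is still required by a task, and the hand-written feature vectors are all assumptions made for illustration. It omits the learned encoder, the per-edge-type attention channels, and training.

```python
import math
from dataclasses import dataclass, field


@dataclass
class DynamicTaskGraph:
    """Toy bipartite agent-task graph (hypothetical structure, not from the paper)."""
    agents: dict = field(default_factory=dict)  # agent id -> capability type
    tasks: dict = field(default_factory=dict)   # task id -> {capability type: units required}

    def add_agent(self, agent_id, capability):
        self.agents[agent_id] = capability

    def add_task(self, task_id, requirements):
        self.tasks[task_id] = dict(requirements)

    def remove_agent(self, agent_id):
        # Models an agent failure; edges touching the agent vanish on the next edges() call.
        self.agents.pop(agent_id, None)

    def edges(self):
        """Recompute edges from the current nodes so dynamic events take effect immediately."""
        return sorted(
            (a, t)
            for a, cap in self.agents.items()
            for t, req in self.tasks.items()
            if req.get(cap, 0) > 0  # assumed rule: the task still needs this capability
        )


def pointer_probs(query, keys, feasible):
    """Masked softmax over scaled dot-product scores.

    query: feature vector of the deciding agent; keys: task id -> feature vector.
    Only tasks in `feasible` receive probability mass, so the candidate set may
    have any length.
    """
    scale = math.sqrt(len(query))
    scores = {
        t: sum(q * k for q, k in zip(query, key)) / scale
        for t, key in keys.items()
        if t in feasible
    }
    if not scores:
        return {}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


if __name__ == "__main__":
    g = DynamicTaskGraph()
    g.add_agent("a1", "lift")
    g.add_agent("a2", "scan")
    g.add_task("t1", {"lift": 1, "scan": 1})
    g.add_task("t2", {"scan": 2})
    print(g.edges())
    g.remove_agent("a1")  # dynamic event: agent failure
    print(g.edges())
    feasible = {t for a, t in g.edges() if a == "a2"}
    print(pointer_probs([1.0, 0.0], {"t1": [1.0, 0.0], "t2": [0.0, 1.0]}, feasible))
```

Recomputing edges from the node sets on every call keeps the sketch short. A real system handling large graphs would more likely update edges incrementally.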
