A Strategy to Form Agent Coalition Based on Encouragement

LI Jian; JING Bo; YANG Yi-xian

Academic paper | Updated: 2025-07-16

- Acta Electronica Sinica, 2008, Vol. 36, No. S1, pp. 71-75
- Author affiliations:
  
  1. National Engineering Laboratory for Disaster Backup and Recovery, Beijing University of Posts and Telecommunications, Beijing 100876
  2. Computer Office, Beijing Institute of Applied Meteorology, Beijing 100029
- Funding:
  
  National 973 Program of China (No. 2007CB310704); National 863 High-Tech Research and Development Program of China (No. 2007AA01Z466)
- Chinese Library Classification number: TP301
- Print publication: 2008

LI Jian, JING Bo, YANG Yi-xian. A Strategy to Form Agent Coalition Based on Encouragement[J]. Acta Electronica Sinica, 2008, 36(S1): 71-75.

Abstract

When agents in a multi-agent system form a coalition, they cannot simultaneously maintain a globally optimal solution and the stability of the coalition. To address this problem, an encouragement strategy for coalition formation is presented: agents that execute tasks within the coalition are given an appropriate reward, so that the coalition reaches the globally optimal solution while remaining stable. In the experiment, the Postman problem is used as the example, and three coalition formation strategies are compared: the Shapley value strategy, the average share strategy, and the encouragement strategy. The experimental data show that the Shapley value strategy and the average share strategy have poor time efficiency and cannot ensure the stability of the coalition. In contrast, the encouragement strategy is the most effective: it lets the coalition reach the globally optimal solution while remaining stable, and it has good time efficiency. Finally, a performance analysis of the encouragement strategy demonstrates its superiority theoretically.
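
To make the two baseline strategies concrete, the following is a minimal sketch of a Shapley value division and an average (equal) share division, together with a simple stability check. It is only an illustration under stated assumptions: the three-agent game `v`, the agent names, and the helper `blocking_coalitions` are hypothetical and are not taken from the paper, and the toy game is not the Postman problem used in the experiment. The encouragement strategy itself is not sketched, because the abstract does not specify its reward rule. Stability is read here in the usual cooperative-game sense: no sub-coalition can earn more on its own than its members are paid in total.

```python
from fractions import Fraction
from itertools import combinations, permutations


def v(coalition):
    """Hypothetical coalition value: agent 'a' plus at least one partner earns 100."""
    return 100 if "a" in coalition and len(coalition) >= 2 else 0


def shapley_values(players, value):
    """Average each agent's marginal contribution over all joining orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        joined = frozenset()
        for p in order:
            totals[p] += value(joined | {p}) - value(joined)
            joined = joined | {p}
    return {p: totals[p] / len(orders) for p in players}


def equal_split(players, value):
    """Average share: divide the grand coalition's value equally."""
    share = Fraction(value(frozenset(players)), len(players))
    return {p: share for p in players}


def blocking_coalitions(players, value, payoff):
    """List proper sub-coalitions that are paid less than they could earn alone."""
    blocks = []
    for size in range(1, len(players)):
        for group in combinations(players, size):
            if sum(payoff[p] for p in group) < value(frozenset(group)):
                blocks.append(group)
    return blocks


players = ["a", "b", "c"]
shapley = shapley_values(players, v)   # a: 200/3, b: 50/3, c: 50/3
average = equal_split(players, v)      # 100/3 each

# Both divisions leave ('a', 'b') and ('a', 'c') with an incentive to break away.
print(blocking_coalitions(players, v, shapley))
print(blocking_coalitions(players, v, average))
# A division that pays everything to 'a' is not blocked by any sub-coalition.
print(blocking_coalitions(players, v, {"a": 100, "b": 0, "c": 0}))
```

In this toy game both baseline divisions are blocked by the same two sub-coalitions, which is one way to read the abstract's claim that the Shapley value and average share strategies cannot ensure stability. Note also that the Shapley computation enumerates every joining order, so its cost grows factorially with the number of agents; this is consistent with, though not proof of, the poor time efficiency reported in the abstract.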

