Application of Multi-Agent Reinforcement Learning in Robot Soccer

LIU Chun-yang; TAN Ying-qing; LIU Chang-an; MA Ying-wei

Research Communication | Updated: 2025-07-16

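- Acta Electronica Sinica, 2010, Vol. 38, No. 8, pp. 1958-1962
- Author affiliations:
  
  1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206
  2. School of Information Engineering, University of Science and Technology Beijing, Beijing 100083
  3. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206
  4. School of Information Engineering, University of Science and Technology Beijing, Beijing 100083
- Funding:
  
  National Natural Science Foundation of China (No. 60775058); North China Electric Power University Young Teachers Research Fund (No. 200721006)
- CLC number: TP242.6
- Print publication: 2010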
LIU Chun-yang, TAN Ying-qing, LIU Chang-an, et al. Application of Multi-Agent Reinforcement Learning in Robot Soccer[J]. Acta Electronica Sinica, 2010, 38(8): 1958-1962.

Abstract

This paper presents a voting-based multi-agent reinforcement learning method that enables a robot soccer team to learn to cooperate during a match, adapt automatically to its environment, and improve both real-time performance and the number of goals scored. First, joint actions, called strategies, are defined so that the cooperation problem becomes one of learning over strategies, which simplifies its treatment. Second, the field is partitioned into numbered regions and each position is represented by its region, which reduces the dimensionality of the state space and speeds up learning. Third, environment states are distinguished and only the cooperative states are considered, which shrinks the state-action space and further accelerates learning. Finally, the decisions of the individual players are combined by voting to achieve cooperation. Experimental results demonstrate the correctness and effectiveness of the method.
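The abstract names three mechanisms that are easy to picture in code: representing positions by numbered regions, learning values over team-level strategies, and combining per-player decisions by voting. The sketch below illustrates those ideas only and is not the authors' implementation. The field dimensions, the 6 x 4 grid, the strategy names, the tabular Q-learning update, the plurality rule, and the tie-break by summed value are all assumptions introduced here; the abstract specifies none of them.

```python
from collections import Counter

# Hypothetical field size and grid; the abstract does not give the real partition.
FIELD_LENGTH = 220.0
FIELD_WIDTH = 180.0
COLS, ROWS = 6, 4

# Hypothetical team-level strategies (joint actions).
STRATEGIES = ["attack", "defend", "pass"]


def region_of(x, y):
    """Map a continuous position to a region number in 0 .. COLS*ROWS - 1."""
    col = int(x / FIELD_LENGTH * COLS)
    row = int(y / FIELD_WIDTH * ROWS)
    # Clamp so that points on or beyond the boundary fall in an edge region.
    col = max(0, min(col, COLS - 1))
    row = max(0, min(row, ROWS - 1))
    return row * COLS + col


def q_update(q, state, strategy, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step over (state, strategy) pairs (assumed learner)."""
    best_next = max(q.get((next_state, s), 0.0) for s in STRATEGIES)
    old = q.get((state, strategy), 0.0)
    q[(state, strategy)] = old + alpha * (reward + gamma * best_next - old)


def vote(q_tables, state):
    """Each player votes for its greedy strategy; the plurality winner is chosen.

    Ties are broken by the summed value across players (an assumed rule).
    """
    ballots = [
        max(STRATEGIES, key=lambda s: q.get((state, s), 0.0)) for q in q_tables
    ]
    counts = Counter(ballots)
    top = max(counts.values())
    tied = [s for s in STRATEGIES if counts.get(s, 0) == top]
    return max(tied, key=lambda s: sum(q.get((state, s), 0.0) for q in q_tables))
```

In this sketch the state is a single region number, for example the region containing the ball, so each player's table has at most `COLS * ROWS * len(STRATEGIES)` entries rather than one entry per continuous position. A team step would call `vote` with every player's table to pick one strategy, execute it, and then call `q_update` on each table with the observed reward.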
