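1. School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China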
Print publication: 2009
WANG Xue-song, ZHANG Yi-yang, CHENG Yu-hu. Reinforcement Learning for Continuous Spaces Based on Gaussian Process Classifier[J]. Acta Electronica Sinica (电子学报), 2009, 37(6): 1153-1158.
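How to extend reinforcement learning methods to large-scale or continuous spaces is the key factor determining whether they can be widely applied. In contrast to existing value-function approximation methods, reinforcement learning is cast here as a simple binary classification problem, and a classification algorithm is used to obtain the policy. A reinforcement learning method for continuous state and continuous action spaces based on a Gaussian process classifier is proposed. First, the continuous action space is discretized into a fixed number of discrete actions. The Gaussian process classifier then labels each continuous-state/discrete-action pair as positive or negative, and the discrete actions judged positive are summed with weights given by their probability values to obtain the continuous action actually applied to the system. Simulation results on the boat-docking problem show that the proposed method effectively solves the continuous-space representation problem in reinforcement learning.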
The generalization of reinforcement learning methods to large-scale or continuous spaces has become a major focus of reinforcement learning research. Unlike existing reinforcement learning methods for continuous spaces, which are based on value-function approximation, this work formulates reinforcement learning as a simple binary classification problem and uses a classification algorithm to obtain the control policy. On this basis, a reinforcement learning method for continuous state and action spaces based on a Gaussian process classifier is proposed. First, the continuous action space is discretized into a fixed number of discrete actions, and the Gaussian process classifier is used to predict the class probability of each continuous-state/discrete-action pair. Then a continuous action is generated as a probability-weighted combination of the actions classified as positive. Computer simulations of a boat problem illustrate the validity of the proposed reinforcement learning method.
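The action-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `discretize` and `blend_actions`, the 0.5 decision threshold, the normalization by the sum of positive probabilities, and the fallback used when no action is classified as positive are all assumptions made for illustration. The probabilities are passed in directly as a stand-in for the output of a trained Gaussian process classifier.

```python
from typing import List, Sequence


def discretize(low: float, high: float, n: int) -> List[float]:
    """Split the continuous action range [low, high] into n evenly spaced actions (n >= 2)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def blend_actions(
    actions: Sequence[float],
    p_positive: Sequence[float],
    threshold: float = 0.5,  # assumed decision boundary for the positive class
) -> float:
    """Combine the positively classified discrete actions into one continuous action.

    p_positive[i] is the classifier's probability that actions[i] is a
    positive (good) action in the current state.
    """
    positive = [(a, p) for a, p in zip(actions, p_positive) if p > threshold]
    if not positive:
        # Assumed fallback: no action passed the threshold, so take the most probable one.
        best = max(range(len(actions)), key=lambda i: p_positive[i])
        return actions[best]
    total = sum(p for _, p in positive)
    # Probability-weighted average (normalization is an assumption of this sketch).
    return sum(a * p for a, p in positive) / total


# Hypothetical example: five steering angles and made-up class probabilities.
angles = discretize(-90.0, 90.0, 5)
probs = [0.1, 0.2, 0.6, 0.9, 0.3]
action = blend_actions(angles, probs)  # blends the two positive actions, 0 and 45 degrees
```

Normalizing the weights keeps the resulting action inside the range spanned by the positive discrete actions; an unnormalized sum would be a different design choice and could leave the admissible action range.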