
Image Captioning Model Based on Multi-Level Visual Fusion
Traditional image captioning methods attend only to entities in the visual policy network and cannot infer the relationships between entities and their attributes, while the language policy network suffers from exposure bias and error accumulation. To address these problems, this paper proposes a multi-level visual fusion network model based on reinforcement learning. In the visual policy network, multi-level neural network modules transform visual features into feature sets of visual knowledge. The fusion network generates the function words that make description sentences more fluent, and it mediates the interaction between the visual policy network and the language policy network. In the language policy network, a self-critical policy gradient algorithm based on reinforcement learning optimizes the visual fusion network end to end. Experiments on the MS-COCO dataset show that the model raises the CIDEr score on the Karpathy test split from 120.1 to 124.3.
image captioning / visual fusion / reinforcement learning / policy network / machine learning / attention mechanism
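The self-critical policy gradient step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss, so the form below is an assumption: the commonly used self-critical surrogate, in which the sentence-level score of a sampled caption is compared against the score of the greedily decoded caption as a baseline. The function name `self_critical_loss`, its parameters, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def self_critical_loss(sample_logprobs, sample_reward, greedy_reward):
    """Surrogate loss for one caption (assumed self-critical form).

    sample_logprobs: log-probabilities of the tokens in the sampled caption.
    sample_reward:   sentence-level score of the sampled caption
                     (assumed here to be a metric such as CIDEr).
    greedy_reward:   score of the greedily decoded caption, used as baseline.
    """
    # Advantage is positive when sampling beat greedy decoding.
    advantage = sample_reward - greedy_reward
    # Minimizing this raises the log-probability of captions that beat
    # the baseline and lowers it for captions that fall short.
    return -advantage * sum(sample_logprobs)


# Toy values, purely illustrative.
logprobs = [math.log(0.5), math.log(0.25)]
loss = self_critical_loss(logprobs, sample_reward=1.2, greedy_reward=1.0)
no_signal = self_critical_loss(logprobs, sample_reward=1.0, greedy_reward=1.0)
```

Because the baseline is the model's own greedy output, a sampled caption that scores the same as the greedy one contributes no gradient, which is why `no_signal` is zero in the toy example.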