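1. State Key Laboratory of Software Engineering, Wuhan University, Wuhan, Hubei 430072
2. School of Computer Science, Wuhan University, Wuhan, Hubei 430072
3. College of Electrical and Information Engineering, Hunan University, Changsha, Hunan 410082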
Published online: 2017-01-25
Published in print: 2017
ZHOU Xuan-yu, LIU Juan, LU Xiao, et al. A Method for Pedestrian Detection by Combining Textual and Visual Information[J]. Acta Electronica Sinica, 2017, 45(1): 140-146. DOI: 10.3969/j.issn.0372-2112.2017.01.020.
Purely vision-based pedestrian detection methods suffer from high false-detection and miss rates and from low accuracy on partially occluded and small-scale targets. To address these problems, this paper proposes a pedestrian detection method that combines textual and visual information. First, a vision-based method produces an initial set of candidate bounding boxes for the objects in the image. Second, text analysis extracts the entity mentions in the accompanying text that refer to objects in the image. Finally, a Markov random field-based model infers the coreference relations between the candidate boxes and the textual mentions, so that visual and textual information can be fused efficiently to help machine vision detect pedestrians more accurately in traffic scenes. Experimental results on the Caltech pedestrian detection benchmark, enriched with textual descriptions of the images, show that the proposed method not only improves pedestrian detection accuracy by adding textual information to visual information, but also outperforms the baseline anaphora resolution model by adding visual information to textual information.
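
The abstract does not specify the potentials or the inference procedure of the Markov random field, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes a hypothetical score matrix `unary[m][b]` that measures how well text mention `m` matches candidate box `b`, a hypothetical pairwise penalty for two mentions that select the same box, and exhaustive search in place of a real inference algorithm. The names `map_assignment`, `same_box_penalty`, and `null_score` are illustrative only.

```python
from itertools import product


def map_assignment(unary, same_box_penalty=1.0, null_score=0.0):
    """Brute-force MAP inference for a tiny mention-to-box MRF (illustrative).

    unary[m][b] is an assumed compatibility score between text mention m and
    candidate box b (higher is better). Each mention takes a label in
    {0, ..., B-1} or None, where None means "no visual referent" and is
    scored null_score. An assumed pairwise term subtracts same_box_penalty
    whenever two different mentions select the same box.
    """
    num_mentions = len(unary)
    num_boxes = len(unary[0]) if unary else 0
    labels = list(range(num_boxes)) + [None]

    best_score, best = float("-inf"), None
    # Enumerate every joint labeling; feasible only for very small problems.
    for assignment in product(labels, repeat=num_mentions):
        score = 0.0
        # Unary terms: one per mention.
        for m, b in enumerate(assignment):
            score += null_score if b is None else unary[m][b]
        # Pairwise terms: discourage two mentions sharing one box.
        for i in range(num_mentions):
            for j in range(i + 1, num_mentions):
                if assignment[i] is not None and assignment[i] == assignment[j]:
                    score -= same_box_penalty
        if score > best_score:
            best_score, best = score, assignment
    return best, best_score


if __name__ == "__main__":
    # Two mentions, three candidate boxes; the scores are made up.
    scores = [[0.9, 0.2, 0.1],
              [0.8, 0.7, 0.1]]
    assignment, total = map_assignment(scores)
    print(assignment, total)
```

In this toy input both mentions prefer box 0 on their own, but the pairwise penalty pushes the second mention to box 1, which is the kind of joint decision a coreference model over boxes and mentions is meant to capture. Exhaustive search grows exponentially with the number of mentions, so a realistic system would need an approximate inference method.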