Acta Electronica Sinica ›› 2022, Vol. 50 ›› Issue (8): 2018-2036. DOI: 10.12263/DZXB.20211359

• Surveys and Reviews •

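Summarization of Group Activity Recognition Algorithms Based on Deep Learning Frame

DENG Hai-gang1, WANG Chuan-xu2, LI Cheng-wei1, LIN Xiao-meng2

  1. School of Instrumentation Science and Engineering, Harbin Institute of Technology, Harbin, Heilongjiang 150006, China
  2. School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong 266061, China
  • Received: 2021-10-09 Revised: 2022-01-11 Online: 2022-08-25 Published: 2022-09-08
  • About the authors:
    • DENG Hai-gang, male, born in August 1985 in Heze, Shandong. He is a Ph.D. candidate at the School of Instrumentation Science and Engineering, Harbin Institute of Technology. His main research interest is instrument science and technology. E-mail: 307140082@qq.com
    • WANG Chuan-xu, male, born in 1968 in Zoucheng, Shandong. He is a professor and master's supervisor at the School of Information Science and Technology, Qingdao University of Science and Technology. His main research interests are computer vision and pattern recognition. E-mail: wangchuanxu_qd@qust.edu.cn
    • LI Cheng-wei, male, born in 1963 in Harbin, Heilongjiang. He is a professor and doctoral supervisor at the School of Instrumentation Science and Engineering, Harbin Institute of Technology. His main research interest is instrument science and technology. E-mail: chengweili@hit.edu.cn
    • LIN Xiao-meng, female, born in November 1995 in Weifang, Shandong. She holds a master's degree from the School of Information Science and Technology, Qingdao University of Science and Technology. Her research interests are computer vision and pattern recognition. E-mail: 1104612139@qq.com
  • Funding:
    • National Natural Science Foundation of China (No. 61672035)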

Abstract:

Group behavior recognition is currently an active research topic in computer vision, with wide applications in intelligent security surveillance, social role understanding, and sports video analysis. This paper reviews group behavior recognition algorithms built on deep learning frameworks. First, according to whether a method models the interactions among group members, which is the core technical step, existing algorithms are divided into two classes: “group behavior recognition without interaction relationship modeling” (GBRWIR) and “group behavior recognition based on interaction relationship description” (GBRBIR). Second, because GBRWIR methods focus mainly on how to compute and refine the overall spatiotemporal features of a group behavior sequence, they are surveyed under three typical families: “multi-stream spatiotemporal feature computation and fusion”, “individual/group multi-level spatiotemporal feature computation and merging”, and “attention-based refinement of group behavior spatiotemporal features”. Third, GBRBIR methods are grouped by how they describe interactions into three categories: “global interaction modeling over all group members”, “interaction modeling within subgroups”, and “interaction modeling among core members centered on key persons”. Then, the datasets relevant to group behavior recognition are introduced, and the test performance of different recognition methods on each dataset is compared and summarized. Finally, several challenging problems and future research directions are outlined: the duality of group behavior category definitions, the difficulties and shortcomings of interaction relationship modeling, weakly supervised annotation and self-learning for group behavior datasets, viewpoint changes, and the comprehensive use of scene information.

Key words: group behavior recognition, subgroup interaction relations, global interaction relations, key person modeling, multi-stream hierarchical network

CLC number: