Research on 3D Multi-Branch Aggregated Lightweight Network Video Action Recognition Algorithm

Hu Zhengping; Diao Pengcheng; Zhang Ruixue; Li Shufang; Zhao Mengyao

doi:10.3969/j.issn.0372-2112.2020.07.003

Academic paper | Last updated: 2025-07-08

- Research on 3D Multi-Branch Aggregated Lightweight Network Video Action Recognition Algorithm
- Acta Electronica Sinica, 2020, Vol. 48, No. 7, pp. 1261-1268
- Author affiliations:
  
  1. School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004
  2. Hebei Key Laboratory of Information Transmission and Signal Processing, Yanshan University, Qinhuangdao, Hebei 066004
- Funding:
  
  General Program of the National Natural Science Foundation of China (No. 61771420); Natural Science Foundation of Hebei Province (No. F2016203422)
- DOI: 10.3969/j.issn.0372-2112.2020.07.003
  CLC number: TP391.4
- Published online: 2020-07-25
  
  Print publication: 2020
Hu Zhengping, Diao Pengcheng, Zhang Ruixue, et al. Research on 3D Multi-Branch Aggregated Lightweight Network Video Action Recognition Algorithm[J]. Acta Electronica Sinica, 2020, 48(7): 1261-1268. DOI: 10.3969/j.issn.0372-2112.2020.07.003.

Abstract

To build a video action recognition model that runs at the speed of a 2D neural network while retaining the performance of a 3D neural network, a 3D multi-branch aggregated lightweight network action recognition algorithm is proposed. First, grouped convolution is used to split the network into multiple branches. Second, to promote information flow between the branches, a multiplexer module with an information aggregation function is added. Finally, an adaptive attention mechanism is introduced to redirect channel and spatio-temporal information. Experiments show that the algorithm reaches an accuracy of 96.2% on the UCF101 dataset and 74.7% on the HMDB51 dataset, at a computational cost of 11.5 GFLOPs in both cases. Compared with other action recognition algorithms, it improves the efficiency of the video recognition network and shows some advantages in recognition speed and accuracy.
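The abstract attributes the low computational cost partly to splitting the network into branches with grouped convolution. The following is a minimal sketch of why that helps, assuming a bias-free 3D convolution and the standard cost formula (each output channel sees only `c_in / groups` input channels). The helper `conv3d_cost` and the layer sizes are hypothetical, chosen for illustration; they are not taken from the paper.

```python
def conv3d_cost(c_in, c_out, kernel, out_shape, groups=1):
    """Return (weights, multiply-accumulates) of a bias-free 3D convolution."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel      # temporal, height, width kernel size
    t, h, w = out_shape      # output frames, height, width
    # With grouping, each output channel connects to c_in / groups inputs.
    weights = c_out * (c_in // groups) * kt * kh * kw
    # One multiply-accumulate per weight per output position.
    macs = weights * t * h * w
    return weights, macs


# Hypothetical layer: 64 -> 64 channels, 3x3x3 kernel, 8x56x56 output.
dense = conv3d_cost(64, 64, (3, 3, 3), (8, 56, 56), groups=1)
grouped = conv3d_cost(64, 64, (3, 3, 3), (8, 56, 56), groups=4)
ratio = dense[1] / grouped[1]
print(dense, grouped, ratio)
```

Under these assumptions, both weights and multiply-accumulates shrink by a factor equal to the number of groups. The branches no longer exchange information inside the layer, which is the gap the abstract's multiplexer module is meant to close.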
