Keypoint Detection Method for Single Person Gymnastics Actions Based on Multi-Scale Incremental Learning

JIANG Jia-hong; XIA Nan; LI Chang-wu; ZHOU Si-yao; YU Xin-miao

doi:10.12263/DZXB.20230729

Research article | Updated: 2025-12-11

- Keypoint Detection Method for Single Person Gymnastics Actions Based on Multi-Scale Incremental Learning
- Acta Electronica Sinica, 2024, Vol. 52, No. 5, pp. 1730-1742
- Author affiliation: School of Information Science and Engineering, Dalian Polytechnic University, Dalian, Liaoning 116034
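- About the authors:
  JIANG Jia-hong, male, born in 1999 in Liaoyang, Liaoning. He is currently a graduate student at Dalian Polytechnic University. His main research interests are image processing and human pose estimation. E-mail: jjh19990901@163.com
  XIA Nan, male, born in 1983 in Dalian, Liaoning. Ph.D. He is an associate professor at the School of Information Science and Engineering, Dalian Polytechnic University. His main research interests include image processing and human pose estimation. E-mail: xia_nan0520@aliyun.com
  LI Chang-wu, male, born in 1966 in Dalian, Liaoning. Ph.D. He is the president of Dalian Polytechnic University and a professor. His main research interests are test and measurement technology and instruments, digital signal processing, and image processing. E-mail: lichangwu123456@163.com
  ZHOU Si-yao, female, born in 2000 in Jinzhou, Liaoning. She is a graduate student at Dalian Polytechnic University. Her main research interest is computer vision. E-mail: 12916004471@qq.com
  YU Xin-miao, male, born in 2000 in Liaoyang, Liaoning. He is a graduate student at Dalian Polytechnic University. His main research interest is object detection. E-mail: 1179308761@qq.com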
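- Funding: Ministry of Education Industry-University Cooperative Education Program (No. 220603231024713)
- DOI: 10.12263/DZXB.20230729
- CLC number: TP391.4
- Received: 2023-08-03; Revised: 2024-02-24; Published in print: 2024-05-25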
JIANG Jia-hong, XIA Nan, LI Chang-wu, et al. Keypoint Detection Method for Single Person Gymnastics Actions Based on Multi-Scale Incremental Learning[J]. Acta Electronica Sinica, 2024, 52(05): 1730-1742. DOI: 10.12263/DZXB.20230729

Abstract

Human body keypoint detection is an active research area in computer vision. At present, keypoint detection for gymnastics actions still suffers from insufficient detection accuracy and an inability to detect fine body parts. To improve detection accuracy, this paper designs a multi-resolution network that has a large receptive field in its shallow layers and uses high-resolution channels to strengthen the extraction of detailed features. To detect hand and foot keypoints, an incremental learning network is designed. This network fuses the shallow features of the multi-resolution network and computes deep features from a self-built gymnastics action dataset, which improves its ability to detect hand and foot keypoints. Finally, the outputs of the two networks are merged. Computer simulations show that the multi-resolution network achieves an accuracy of 94.4% on the COCO2017 keypoint detection dataset, and that the incremental learning network can accurately detect keypoints of fine body parts even with little training data.
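
The abstract states that the outputs of the two networks are combined at the end, but it does not say how. The following is only a minimal sketch of one possible merging step, assuming each network returns named keypoints as (x, y, confidence) tuples. The function name `merge_keypoints`, the keypoint names, the coordinates, and the rule of keeping the higher-confidence entry on a name clash are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# (x, y, confidence) for one keypoint; this layout is an assumption for illustration.
Keypoint = Tuple[float, float, float]


def merge_keypoints(
    body: Dict[str, Keypoint],
    extremities: Dict[str, Keypoint],
) -> Dict[str, Keypoint]:
    """Combine two keypoint sets into a single pose.

    If both networks report a keypoint with the same name, the one with the
    higher confidence is kept. This tie-break rule is an assumption.
    """
    merged = dict(body)  # copy so the input dictionary is not modified
    for name, kp in extremities.items():
        if name not in merged or kp[2] > merged[name][2]:
            merged[name] = kp
    return merged


# Hypothetical outputs: the names and numbers are made up for the example.
body_out = {
    "left_wrist": (120.0, 80.0, 0.91),
    "left_ankle": (110.0, 300.0, 0.88),
}
extra_out = {
    "left_index_tip": (128.0, 70.0, 0.75),
    "left_big_toe": (105.0, 318.0, 0.80),
}

pose = merge_keypoints(body_out, extra_out)
print(sorted(pose))
```

In this sketch the body keypoints would come from the multi-resolution network and the hand and foot keypoints from the incremental learning network; the merged dictionary then describes the whole pose.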

