Cross-Modality Person Re-identification Based on Locally Heterogeneous Polymerization Graph Convolutional Network

SUN Rui; ZHANG Lei; YU Yi-heng; ZHANG Xu-dong

doi:10.12263/DZXB.20220011

Research article | Updated: 2025-12-08

- 基于局部异构聚合图卷积网络的跨模态行人重识别
- Cross-Modality Person Re-identification Based on Locally Heterogeneous Polymerization Graph Convolutional Network
- Acta Electronica Sinica, 2023, Vol. 51, No. 4, pp. 810-825
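- Author affiliations:
  
  1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui 230601, China
  2. Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei, Anhui 230009, China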
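- Author biographies:
  
  SUN Rui, male, was born in Anhui Province in 1976. He is a professor at the School of Computer Science and Information Engineering, Hefei University of Technology. His main research interests are computer vision and machine learning. Chinese Institute of Electronics membership No. E190005402S. E-mail: sunrui@hfut.edu.cn
  ZHANG Lei, male, was born in Anhui Province in 1997. He is a master's student at the School of Computer Science and Information Engineering, Hefei University of Technology. His main research interests are image information processing and computer vision. E-mail: 2020171121@mail.hfut.edu.cn
  YU Yi-heng, male, was born in Zhejiang Province in 1997. He is a master's student at the School of Computer Science and Information Engineering, Hefei University of Technology. His main research interests are image information processing and computer vision. E-mail: 2020111040@mail.hfut.edu.cn
  ZHANG Xu-dong, male, was born in Anhui Province in 1966. He is a professor at the School of Computer Science and Information Engineering, Hefei University of Technology. His main research interests are intelligent information processing and machine vision. E-mail: xudong@hfut.edu.cn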
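- Funding:
  
  National Natural Science Foundation of China, General Program (61471154; 61876057); Anhui Province Key Research and Development Program, Special Project on Strengthening Police with Science and Technology (202004d07020012)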
- DOI: 10.12263/DZXB.20220011
  Chinese Library Classification (CLC) number: TP391.41
- Received: 2021-12-29; revised: 2022-05-13; published in print: 2023-04-25

SUN Rui, ZHANG Lei, YU Yi-heng, et al. Cross-Modality Person Re-identification Based on Locally Heterogeneous Polymerization Graph Convolutional Network[J]. Acta Electronica Sinica, 2023, 51(04): 810-825. DOI: 10.12263/DZXB.20220011.

Abstract

The need for all-day video surveillance has drawn wide academic attention to cross-modality person re-identification between visible and infrared images. The task is challenging because of intra-class variation and cross-modality discrepancy. Existing work focuses mainly on visible-infrared image translation or on learning globally shared features across modalities, while the local features of body parts and the structural relationships among them have been largely ignored. We argue that the graph-structured relationships among local key-points remain relatively stable under both intra-modality and cross-modality variation, and that fully mining and representing this structural information helps solve the cross-modality person re-identification problem. This paper therefore proposes a cross-modality person re-identification method based on a locally heterogeneous polymerization graph convolutional network. A key-point extraction network extracts local key-point features from each image, and a novel graph convolutional network models the structural relationships among the parts of the human body. An intra-graph convolutional layer characterizes the higher-order structural relationships among local features and yields discriminative local features. A cross-graph convolutional layer passes discriminative features between the two heterogeneous graph structures, which helps reduce the effect of the modality discrepancy. For the problem of matching heterogeneous graph structures, a cross-modality permutation loss is designed to better measure the distance between graph structures. On the mainstream cross-modality datasets RegDB and SYSU-MM01, the method achieves mAP/Rank-1 of 80.78%/80.55% and 67.92%/66.49%, respectively; its Rank-1 scores are 7.58% and 1.87% higher than those of the VDCM algorithm.
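The abstract reports results as mAP/Rank-1. As a reading aid, the following is a minimal sketch of how these two retrieval metrics are commonly computed from a query-by-gallery distance matrix. It is not the authors' evaluation code: the function name `rank1_and_map` and the toy data are hypothetical, and the sketch omits dataset-specific protocol details (for example, the camera-based filtering and repeated trials that re-identification benchmarks typically apply).

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a query-by-gallery distance matrix.

    dist[q][g] is the distance between query q and gallery item g
    (smaller means more similar). Hypothetical helper, not from the paper.
    """
    rank1_hits = 0
    average_precisions = []
    for q, row in enumerate(dist):
        # Rank gallery items from nearest to farthest.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # Assumption: skip queries with no true match.
        # Rank-1: the nearest gallery item has the query's identity.
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each true-match rank.
        hits = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank
        average_precisions.append(precision_sum / hits)
    if not average_precisions:
        raise ValueError("no query has a matching gallery identity")
    n = len(average_precisions)
    return rank1_hits / n, sum(average_precisions) / n


if __name__ == "__main__":
    # Toy example: two queries (e.g. infrared), three gallery items (e.g. visible).
    toy_dist = [[0.2, 0.9, 0.4],
                [0.3, 0.5, 0.8]]
    rank1, mean_ap = rank1_and_map(toy_dist, [1, 2], [1, 2, 1])
    print(rank1, mean_ap)  # prints "0.5 0.75"
```

In the toy example, the first query retrieves both of its true matches at ranks 1 and 2 (average precision 1.0), while the second finds its only match at rank 2 (average precision 0.5), giving Rank-1 of 0.5 and mAP of 0.75.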
