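Cross-Modal Person Re-Identification Based on Generative Adversarial Network Coordinated with Angle Based Heterogeneous Center Triplet Loss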

ZHOU Fei; SHU Hao-feng; BAI Meng-lin; WANG Jin-hua

doi:10.12263/DZXB.20220587

Research article | Updated: 2025-12-08

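- Cross-Modal Person Re-Identification Based on Generative Adversarial Network Coordinated with Angle Based Heterogeneous Center Triplet Loss
- Acta Electronica Sinica, 2023, Vol. 51, No. 7, pp. 1803-1811
- Author affiliations:
  
  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
  2. Chongqing Key Laboratory of Ubiquitous Sensing and Networking, Chongqing 400065
- Author biographies:
  
  ZHOU Fei, male, born June 1977 in Xishui, Hubei. Professor and doctoral supervisor at Chongqing University of Posts and Telecommunications. Research interests: information and signal processing, machine vision, and information security. E-mail: zhoufei@cqupt.edu.cn
  SHU Hao-feng (corresponding author), male, born November 1998 in Chongqing. Master's student at Chongqing University of Posts and Telecommunications. Research interest: person re-identification. E-mail: s200131214@stu.cqupt.edu.cn
- Funding:
  
  National Natural Science Foundation of China (62271096)
- DOI: 10.12263/DZXB.20220587
  CLC number: TP391.41
- Received: 2022-05-23
  
  Revised: 2023-02-22
  
  Published in print: 2023-07-25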
ZHOU Fei, SHU Hao-feng, BAI Meng-lin, et al. Cross-Modal Person Re-Identification Based on Generative Adversarial Network Coordinated with Angle Based Heterogeneous Center Triplet Loss [J]. Acta Electronica Sinica, 2023, 51(07): 1803-1811. DOI: 10.12263/DZXB.20220587.

Abstract

Cross-modal person re-identification between the infrared and visible domains is essential for night-time surveillance. Because infrared and visible images follow very different data distributions, it is difficult for a model to extract modality-invariant features of the same pedestrian across modalities. Existing cross-modal person re-identification methods face two problems: the datasets contain few samples, and images from different modalities differ substantially. To address both, this paper proposes a novel generative adversarial network that generates matching images similar to the originals, which augments the cross-modal person dataset and reduces the cross-modal discrepancy at the same time. To further reduce cross-modal and intra-modal discrepancies, a two-stream network is used to extract more discriminative features, and an angle-based heterogeneous center triplet loss is proposed that constrains the angles between positive and negative sample pairs in the feature space, improving how well they cluster. Experiments on the SYSU-MM01 and RegDB datasets show that the generated matching images effectively reduce the cross-modal discrepancy between images of different modalities, and that the angle-based heterogeneous center triplet loss makes the embedded features angularly discriminative, which improves the model's classification ability. On the SYSU-MM01 dataset, the method improves Rank-1 and mAP by 5.71% and 8.18%, respectively, over the latest methods, confirming its effectiveness.
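The abstract does not give the formula for the angle-based heterogeneous center triplet loss, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' formulation. It assumes that "heterogeneous centers" are per-identity mean features computed separately for the visible and infrared modalities, that the constraint is a hinge on angles with a hypothetical `margin` of 0.3 radians, and that the hardest negative is the other-identity center with the smallest angle to the anchor. All function names are illustrative.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def modality_centers(features, labels):
    """Map each identity label to the mean of its feature vectors."""
    grouped = {}
    for feat, label in zip(features, labels):
        grouped.setdefault(label, []).append(feat)
    return {label: mean_vector(feats) for label, feats in grouped.items()}


def angular_hetero_center_triplet_loss(vis_feats, vis_labels,
                                       ir_feats, ir_labels, margin=0.3):
    """Hinge loss on angles between per-identity modality centers.

    Assumed form (not taken from the paper):
        max(0, angle(anchor, positive) - min angle(anchor, negative) + margin)
    where the positive is the same identity's center in the other modality
    and negatives are the centers of all other identities in both modalities.
    """
    vis = modality_centers(vis_feats, vis_labels)
    ir = modality_centers(ir_feats, ir_labels)
    ids = sorted(set(vis) & set(ir))  # identities present in both modalities
    losses = []
    for anchors, others in ((vis, ir), (ir, vis)):
        for i in ids:
            anchor = anchors[i]
            pos = angle(anchor, others[i])
            negs = [angle(anchor, centers[j])
                    for centers in (vis, ir) for j in ids if j != i]
            if not negs:
                continue  # a single identity gives no negatives
            losses.append(max(0.0, pos - min(negs) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2-D features: identity 0 points along x, identity 1 along y.
aligned = angular_hetero_center_triplet_loss(
    [[1.0, 0.0], [0.0, 1.0]], [0, 1],
    [[1.0, 0.1], [0.1, 1.0]], [0, 1],
)
# Same visible features, but the infrared centers are swapped.
swapped = angular_hetero_center_triplet_loss(
    [[1.0, 0.0], [0.0, 1.0]], [0, 1],
    [[0.0, 1.0], [1.0, 0.0]], [0, 1],
)
```

In the toy data, the aligned case has small same-identity angles and large other-identity angles, so every hinge term is inactive. In the swapped case each anchor is orthogonal to its positive center and coincides with a negative one, so every term is penalized. A trainable version would express the same computation in an automatic-differentiation framework.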
