Cross-Scene Point Prediction Crowd Counting Method Based on Meta-Weight-Net

XU Xin; TAN Zhuo-lin; GAO Chen-qiang; XI Yue

doi:10.12263/DZXB.20250285


Research article | Last updated: 2025-12-27

- Cross-Scene Point Prediction Crowd Counting Method Based on Meta-Weight-Net
- Acta Electronica Sinica, 2025, Vol. 53, No. 9, pp. 3371-3383
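- Author affiliations:
  
  1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
  2. Faculty of Science and Technology, University of Macau, Macau 999078
- Author biographies:
  
  XU Xin, male, was born in Ziyang, Sichuan Province, in January 1999. He is currently a master's student at the School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications. His main research interests are crowd counting, computer vision, and machine learning.
  TAN Zhuo-lin, female, was born in Dazhou, Sichuan Province, in January 1998. She is currently a doctoral student at Chongqing University of Posts and Telecommunications. Her main research interests are video analysis, image processing, and computer vision. E-mail: tanzhuolin98@gmail.com
  GAO Chen-qiang, male, was born in Chongqing in August 1981. He is currently a professor and doctoral supervisor at the School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications. His main research interests are image processing, video analysis, and machine learning. E-mail: gaocq@cqupt.edu.cn
  XI Yue, male, was born in Chongqing in March 2004. He is currently an undergraduate student in applied mathematics at the University of Macau. His main research interest is algorithm design. E-mail: DC22908@um.edu.mo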
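- Funding:
  
  National Key Research and Development Program of China (2022YFA1004100)
- DOI: 10.12263/DZXB.20250285
  Chinese Library Classification (CLC) number: TN911.73; TP391
- Received: 2025-04-14
  
  Accepted: 2025-08-22
  
  Published in print: 2025-09-25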

XU Xin, TAN Zhuo-lin, GAO Chen-qiang, et al. Cross-Scene Point Prediction Crowd Counting Method Based on Meta-Weight-Net[J]. Acta Electronica Sinica, 2025, 53(09): 3371-3383. DOI：10.12263/DZXB.20250285

Abstract

Cross-scene crowd counting often suffers from degraded accuracy because illumination, scale, camera angle, and crowd density shift the data distribution from one scene to another. To address the limitations of existing crowd counting models in cross-scene applications, a meta-learning-based scene-aware reweighting method is proposed. Instead of relying on traditional density map approaches, which suffer from localization ambiguity, the method employs a point prediction counting model that directly predicts the coordinates of each pedestrian. A meta-weight network learns an explicit weighting scheme for the point prediction loss from meta-data, while a scene-aware branch treats each scene as a separate learning task and uses the intrinsic features of each scene to adapt the weighting scheme, reducing the impact of annotation noise on cross-scene generalization. Furthermore, to overcome the limitations of existing datasets in educational settings, a new campus multi-scene crowd counting dataset (Multi-Scene Crowd counting dataset, MS-Crowd) is constructed, providing a more comprehensive benchmark for cross-scene research. Experimental results show that the proposed method reduces the mean absolute error (MAE) by 19.7% on MS-Crowd and by 10.7% on the public outdoor dataset ShanghaiTech, validating its effectiveness.
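The abstract does not give the architecture of the meta-weight network or its scene-aware branch, so the following is only a minimal sketch of the general idea of mapping a per-sample loss to a weight that depends on the scene. The scene names `scene_a` and `scene_b`, the parameter table `SCENE_PARAMS`, and the functions `sample_weight` and `weighted_loss` are all hypothetical. The parameters are hand-set here, whereas in a meta-learning setup they would be learned from meta-data.

```python
import math

# Hypothetical per-scene parameters (w1, b1, w2, b2) of a tiny weighting
# function with one hidden unit. These values are hand-set for illustration
# only; they are assumptions, not values from the paper.
SCENE_PARAMS = {
    "scene_a": (1.0, 0.0, -2.0, 2.0),
    "scene_b": (1.0, 0.0, -0.5, 1.0),
}


def sample_weight(loss, scene):
    """Map a per-sample loss to a weight in (0, 1) using the scene's parameters."""
    w1, b1, w2, b2 = SCENE_PARAMS[scene]
    hidden = max(0.0, w1 * loss + b1)                    # ReLU hidden unit
    return 1.0 / (1.0 + math.exp(-(w2 * hidden + b2)))   # sigmoid output


def weighted_loss(losses, scenes):
    """Average of per-sample losses, each scaled by its scene-dependent weight."""
    weights = [sample_weight(l, s) for l, s in zip(losses, scenes)]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


if __name__ == "__main__":
    # The same loss value receives a different weight in each scene.
    for scene in ("scene_a", "scene_b"):
        print(scene, sample_weight(2.0, scene))
    print(weighted_loss([0.5, 2.0, 5.0], ["scene_a", "scene_b", "scene_a"]))
```

With these hand-set parameters, a larger loss receives a smaller weight, which is one way that samples with suspected annotation noise could be down-weighted, and the same loss is weighted differently in the two scenes. Training the weighting parameters jointly with the counting model is outside the scope of this sketch.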
