Image Classification Network of Background Perception Mechanism

YUAN Heng; RAN Chao; ZHANG Sheng-chong

doi:10.12263/DZXB.20250028


PAPERS | Updated: 2025-12-27

- ACTA ELECTRONICA SINICA Vol. 53, Issue 8, Pages: 2779-2793 (2025)
- Affiliations:
  
  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105
  2. Key Laboratory of Optoelectronic Information Control and Security Technology, Tianjin 300308
- Funding:
  
  National Natural Science Foundation of China (61601213); National Defense Preliminary Research Fund (172068); Key Fund of Liaoning Provincial Department of Education (LJYL049)
- DOI: 10.12263/DZXB.20250028
  CLC: TP391
- Received: 08 January 2025,
  
  Accepted: 13 May 2025,
  
  Published: 25 August 2025

YUAN Heng, RAN Chao, ZHANG Sheng-chong. Image Classification Network of Background Perception Mechanism[J]. Acta Electronica Sinica, 2025, 53(08): 2779-2793. DOI：10.12263/DZXB.20250028

Abstract

Existing image classification methods lack an effective understanding of complex scenes, which limits their ability to capture key features and in turn degrades classification accuracy. To address this, this paper proposes an image classification network with a background perception mechanism (BPMNet). First, a background perception (BP) module is proposed: a dual-branch structure processes foreground and background information separately and dynamically adjusts the contribution of the input features, strengthening the contextual support that background information gives to foreground features and improving the model's perception of background information. Then, building on the BP module, a background perceptual attention (BPA) module is designed. In addition to local feature information and long-range dependencies, it attends to the relationship between the image foreground and background, dynamically regulating how strongly background information influences the features of the main target, which improves the discriminability and localization of key target features. Finally, the BP and BPA modules are embedded in residual blocks to propagate features from shallow details to deep semantics; combining local details with global semantics enhances the feature representation of foreground targets in complex scenes. On the CIFAR-10, CIFAR-100, SVHN, Imagenette and Imagewoof datasets, BPMNet achieves classification accuracies of 96.95%, 80.85%, 97.68%, 90.10% and 81.70%, respectively, which are on average 2.39%, 3.17%, 2.36%, 2.30% and 2.67% higher than those of other mainstream networks. Compared with current advanced network models, the proposed method improves the model's understanding of complex scenes and its representation of key regions, extracts key features more effectively, and further improves robustness and generalization.
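The abstract does not give the equations of the BP module, so the following is only a minimal Python sketch of the general idea of a gated dual-branch fusion inside a residual connection. It is not the authors' implementation. The function names `background_gate` and `residual_fuse`, the scalar `weight` and `bias` parameters, and the choice of a sigmoid gate driven by the mean background activation are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def background_gate(foreground, background, weight=1.0, bias=0.0):
    """Fuse per-channel foreground and background features with a scalar gate.

    Hypothetical illustration: the gate is a sigmoid of the mean background
    activation, so it controls how much background context is added to each
    foreground channel. `weight` and `bias` stand in for learned parameters.
    """
    if not foreground or len(foreground) != len(background):
        raise ValueError("branches must be non-empty and of equal length")
    context = sum(background) / len(background)  # global background summary
    gate = sigmoid(weight * context + bias)      # contribution of the background
    fused = [f + gate * b for f, b in zip(foreground, background)]
    return fused, gate


def residual_fuse(x, foreground, background, weight=1.0, bias=0.0):
    """Add the gated fusion back onto the block input (residual connection)."""
    fused, _ = background_gate(foreground, background, weight, bias)
    if len(x) != len(fused):
        raise ValueError("input and fused features must have equal length")
    return [xi + fi for xi, fi in zip(x, fused)]
```

With an all-zero background branch the gate evaluates to 0.5 and the fused output equals the foreground features; a strongly negative `bias` pushes the gate toward 0, which suppresses the background contribution. In a real network the two branches would be convolutional feature maps and the gate parameters would be learned.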


