

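1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
2. Key Laboratory of Electro-Optical Information Control and Security Technology, Tianjin 300308, China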
Received: 08 January 2025; Accepted: 13 May 2025; Published: 25 August 2025
YUAN Heng, RAN Chao, ZHANG Sheng-chong. Image Classification Network of Background Perception Mechanism[J]. Acta Electronica Sinica, 2025, 53(08): 2779-2793. DOI:10.12263/DZXB.20250028
Existing image classification methods lack an effective understanding of complex scenes, which limits their ability to capture key features and in turn degrades classification accuracy. To address this, this paper proposes an image classification network of background perception mechanism (BPMNet). First, a background perception (BP) module is proposed: a dual-branch structure processes foreground and background information separately and dynamically adjusts the contribution of the input features, strengthening the contextual support that background information provides to foreground features and thereby enhancing the model's perception of background information. Then, building on the BP module, a background perceptual attention (BPA) module is designed. While accounting for local feature information and long-range dependencies, it also attends to the relationship between the image foreground and background, dynamically regulating how strongly background information influences the features of the main target and enhancing the discriminability and localization ability of key target features. Finally, the BP and BPA modules are embedded in residual blocks to pass features from shallow details to deep semantics; combining local details with global semantics enhances the feature representation of foreground targets in complex scenes. On the image datasets CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof, BPMNet achieves classification accuracies of 96.95%, 80.85%, 97.68%, 90.10%, and 81.70%, respectively, which are on average 2.39%, 3.17%, 2.36%, 2.30%, and 2.67% higher than those of other mainstream networks. Compared with current advanced network models, the proposed method enhances the model's understanding of complex scenes and improves its ability to express key regions, so that key features are extracted more effectively and the robustness and generalization ability of the model are further improved.
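The abstract describes the BP module only at a high level (two branches for foreground and background, with a dynamically adjusted contribution) and gives no equations. The NumPy sketch below is therefore not the authors' implementation. It is one possible toy reading of the idea, in which a soft mask splits a feature map into foreground and background parts and a channel gate computed from the background decides how much background context is added back. The function name `background_aware_fusion`, the weights `w_mask` and `w_gate`, and the sigmoid gating formula are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def background_aware_fusion(features, w_mask, w_gate):
    """Toy dual-branch fusion (hypothetical, not the paper's BP module).

    features: feature map of shape (C, H, W)
    w_mask:   shape (C,), weights of a 1x1 projection producing a soft mask
    w_gate:   shape (C, C), weights mapping background context to a gate
    """
    # Soft foreground mask in (0, 1), one value per spatial location.
    mask = sigmoid(np.einsum("c,chw->hw", w_mask, features))

    # Branch 1: foreground features. Branch 2: the complementary background.
    foreground = features * mask
    background = features * (1.0 - mask)

    # Summarise the background per channel with global average pooling,
    # then turn that summary into a per-channel gate in (0, 1).
    context = background.mean(axis=(1, 2))
    gate = sigmoid(w_gate @ context)

    # Background is added back as context, scaled by the learned gate.
    return foreground + gate[:, None, None] * background


# Example with random inputs (assumed sizes: 4 channels, 8x8 map).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
out = background_aware_fusion(
    x, rng.standard_normal(4), rng.standard_normal((4, 4))
)
```

Because both the mask and the gate lie strictly between 0 and 1, each output value in this sketch is the input value scaled by a factor between 0 and 1, so the background can only modulate the features and never amplify them. A trained module would learn `w_mask` and `w_gate`, and the paper's actual design may differ substantially.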