

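1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105
2. Key Laboratory of Electro-Optical Information Control and Security Technology, Tianjin 300308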
Received: 13 July 2025
Accepted: 04 January 2026
Published: 25 January 2026
YUAN Heng, WU Jingrui, ZHANG Shengchong. Global Dependency Guided Feature Reconstruction for Image Classification[J]. Acta Electronica Sinica, 2026, 54(01): 291-307. DOI:10.12263/DZXB.20250619
To address the inadequate modelling of long-range dependencies by convolutional neural networks in image classification, this paper proposes a global dependency guided feature reconstruction network for image classification (GDFRNet). GDFRNet builds a cooperative dual-path architecture from a novel feature reconstruction module (FRM) and a feature optimization branch, achieving both long-range dependency modelling and detail feature enhancement. The FRM applies horizontal and vertical global average pooling in parallel, compressing the features along each of the two spatial dimensions to obtain context vectors with a global view, and then remaps these vectors into the two-dimensional feature space to establish long-range feature dependencies spanning the entire image. Combined with operations such as transposed convolution, it reconstructs the feature space, suppressing irrelevant background noise and reinforcing a coherent semantic representation of the target subject. The feature optimization branch refines and fuses detail information through a fine-grained feature capture module (FGCM) and a feature optimization module (FOM), reducing the loss of detail during network abstraction. The FGCM uses Gaussian-Laplacian convolution to capture image details that are easily lost; the FOM adaptively fuses and optimizes, within a high-resolution feature pool, the global semantic feature map produced by the FRM and the rich detail features extracted by the FGCM. The two paths form a complementary “global contour-local detail” mechanism: the global semantic feature map from the FRM guides detail enhancement so that it does not deviate from the overall semantics, while the rich low-level details refined by the feature optimization branch provide fine-grained feedback and task-relevant guidance to the FRM's feature reconstruction, forming a virtuous optimization cycle. This mechanism lets the network fuse reconstructed semantic information with enhanced local detail and generate more discriminative image representations, strengthening the model's understanding of overall image structure and improving the discriminative power of the feature space. The proposed model was compared with state-of-the-art (SOTA) models on five datasets: CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof. GDFRNet performed strongly on all five; compared with other advanced methods, its classification accuracy improved on average by 2.39%, 3.73%, 2.35%, 3.33%, and 2.92%, respectively, demonstrating the effectiveness of GDFRNet.
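The two operations the abstract names most concretely, directional global average pooling in the FRM and Gaussian-Laplacian filtering in the FGCM, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names (`directional_pool`, `remap_to_2d`, `log_kernel`) are hypothetical, the sigmoid-gated outer sum used for remapping is an assumption (the abstract does not specify how the context vectors are mapped back to two dimensions), and "Gaussian-Laplacian convolution" is read here as a Laplacian-of-Gaussian kernel, which is also an assumption.

```python
import math


def directional_pool(fmap):
    """Average a single-channel H x W feature map along each spatial axis.

    Returns (row_ctx, col_ctx): row_ctx[i] is the mean of row i (pooled
    along the width), col_ctx[j] is the mean of column j (pooled along
    the height). Each entry therefore summarises a full row or column.
    """
    h, w = len(fmap), len(fmap[0])
    row_ctx = [sum(row) / w for row in fmap]
    col_ctx = [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]
    return row_ctx, col_ctx


def remap_to_2d(row_ctx, col_ctx):
    """Broadcast the two context vectors back to an H x W map.

    Assumption: outer sum followed by a sigmoid gate, so position (i, j)
    depends on all of row i and all of column j.
    """
    return [[1.0 / (1.0 + math.exp(-(r + c))) for c in col_ctx]
            for r in row_ctx]


def log_kernel(size=5, sigma=1.0):
    """Laplacian-of-Gaussian kernel (up to a constant factor).

    The kernel is shifted to sum to zero so that flat image regions
    produce no response and only edges and fine detail remain.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4
                       * math.exp(-r2 / (2 * sigma ** 2)))
        kernel.append(row)
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]


# Usage on a toy 2 x 3 feature map.
row_ctx, col_ctx = directional_pool([[1, 2, 3], [4, 5, 6]])
gate = remap_to_2d(row_ctx, col_ctx)   # 2 x 3 map of values in (0, 1)
kernel = log_kernel(size=5, sigma=1.0)  # 5 x 5 zero-sum detail filter
```

In a real network these steps would operate per channel on tensors, with learned convolutions between pooling and remapping; the sketch only shows why each output position ends up depending on an entire row and column of the input.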