Cross-Domain Meta Optimization and Dual-Channel Attention for Few-Shot Multi-Source Domain Object Detection

ZHU Song-hao; WANG Shuang-cheng

doi:10.12263/DZXB.20250309

PAPERS | Updated: 2026-02-05

- ACTA ELECTRONICA SINICA Vol. 53, Issue 10, Pages: 3659-3670 (2025)
- Affiliation:
  
  College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
- Funding:
  
  National Natural Science Foundation of China (52405065)
- DOI: 10.12263/DZXB.20250309
  CLC: TP391.4
- Received: 21 April 2025,
  
  Accepted: 24 October 2025,
  
  Published: 25 October 2025

ZHU Song-hao, WANG Shuang-cheng. Cross-Domain Meta Optimization and Dual-Channel Attention for Few-Shot Multi-Source Domain Object Detection[J]. Acta Electronica Sinica, 2025, 53(10): 3659-3670. DOI: 10.12263/DZXB.20250309

Abstract

This paper addresses a new and challenging problem: transferring knowledge from a source domain and an intermediate domain to a single target domain in which each category has only a few labeled samples. Knowledge transfer in this setting faces two difficulties. First, target data are extremely scarce, so the target-domain feature distribution is insufficiently represented. Second, existing few-shot learning methods extract features from every part of the input indiscriminately, which leads to poor few-shot object detection performance. To address these problems, this paper proposes a few-shot multi-source domain object detection method. A new meta optimization mechanism is proposed that aligns the source and target domains through introduced mixed domains, alleviating the scarcity of the target-domain feature distribution. Specifically, image-level mixing first generates mixed images, which together with the corresponding labels form the first mixed domain. Fine-grained features are then generated by a dual-channel attention mechanism, and feature-level mixing produces mixed features, which together with the corresponding labels form the second mixed domain. Finally, region of interest (ROI) features are generated by a region proposal network and an ROI network, and ROI-level mixing produces mixed ROI features, which together with the corresponding labels form the third mixed domain. The three mixed domains are used jointly to compute the loss function and complete the meta optimization process. A dual-channel attention mechanism comprising convolutional layers and feature calibration is also proposed to learn more discriminative deep feature representations: the convolutional layers prevent the loss of key spatial information, and feature calibration selectively enhances important features and weakens unimportant ones. Specifically, the convolutional-layer submodule first generates coarse-grained feature representations; the feature calibration submodule then builds attention weights from the correlations between features and integrates these weights with the original features, selectively strengthening important regions while suppressing unimportant ones. Extensive experiments on the COCO and PASCAL-VOC datasets demonstrate the effectiveness and robustness of the proposed method. It surpasses other methods in the same field in detection performance while maintaining good generalization across datasets, and its parameter count offers a significant advantage over other methods in the same field.
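
The mixing and calibration steps described in the abstract can be illustrated with a minimal sketch. It assumes that mixing at each level is a standard mixup-style convex combination with a coefficient drawn from a Beta distribution, and it uses a simple sigmoid gate on channel means as a stand-in for feature calibration. Neither assumption is confirmed by the abstract, and the function names (`sample_lambda`, `mix`, `calibrate`) and the toy data are hypothetical. Bounding-box handling, the detector, and the loss are omitted.

```python
import math
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing coefficient in [0, 1] from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mix(a, b, lam):
    """Return lam * a + (1 - lam) * b for two equally shaped nested lists."""
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            raise ValueError("inputs must have the same shape")
        return [mix(x, y, lam) for x, y in zip(a, b)]
    return lam * a + (1.0 - lam) * b


def calibrate(channels):
    """Scale each channel by a sigmoid gate computed from its mean activation.

    Assumption: this gate is a simplified stand-in for the feature
    calibration submodule, whose exact form the abstract does not give.
    """
    calibrated = []
    for channel in channels:
        mean = sum(channel) / len(channel)
        gate = 1.0 / (1.0 + math.exp(-mean))
        calibrated.append([gate * value for value in channel])
    return calibrated


# Hypothetical toy data: one source sample and one target sample.
rng = random.Random(0)
lam = sample_lambda(alpha=1.0, rng=rng)

source = {"image": [[0.0, 1.0], [1.0, 0.0]], "label": [1.0, 0.0]}
target = {"image": [[1.0, 1.0], [0.0, 0.0]], "label": [0.0, 1.0]}

# First mixed domain: mixed image plus mixed label.
mixed_image = mix(source["image"], target["image"], lam)
mixed_label = mix(source["label"], target["label"], lam)

# Second mixed domain: mix calibrated features (rows stand in for channels).
mixed_features = mix(calibrate(source["image"]), calibrate(target["image"]), lam)

# Third mixed domain: mix ROI features (here, one made-up ROI vector each).
mixed_roi = mix([0.2, 0.8], [0.6, 0.4], lam)
```

Reusing one coefficient `lam` across the three levels is a design choice of this sketch only. A real implementation would operate on tensors and might sample a separate coefficient per level.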
