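1. School of Artificial Intelligence and Advanced Computing, Hunan University of Technology and Business, Changsha, Hunan 410000, China
2. Xiangjiang Laboratory, Changsha, Hunan 410000, China
3. School of Intelligent Engineering and Manufacturing, Hunan University of Technology and Business, Changsha, Hunan 410000, China
4. School of Mathematics and Statistics, Hunan University of Technology and Business, Changsha, Hunan 410000, China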
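HUANG Yi-sha, female, was born in Pingxiang, Jiangxi Province, in February 2004. She is an undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang College), Hunan University of Technology and Business. Her main research interest is semantic segmentation. E-mail: 2142827479@qq.com
JIANG Lin, male, was born in Changde, Hunan Province, in November 1977. He is a professor at the School of Artificial Intelligence and Advanced Computing (Xiangjiang College), Hunan University of Technology and Business. His main research interests are intelligent speech processing, machine vision, and robot applications. E-mail: jlcdf@163.com
GUAN Ya-fei, female, was born in Yongzhou, Hunan Province, in August 2005. She is an undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang College), Hunan University of Technology and Business. Her main research interest is semantic segmentation. E-mail: 2645740476@qq.com
ZHANG Ya-sha, female, was born in Liuyang, Hunan Province, in May 2004. She is an undergraduate student at the School of Intelligent Engineering and Manufacturing, Hunan University of Technology and Business. Her main research interest is semantic segmentation. E-mail: 2980083632@qq.com
LIANG Xin, female, was born in Chenzhou, Hunan Province, in May 2005. She is an undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang College), Hunan University of Technology and Business. Her main research interest is semantic segmentation. E-mail: 3278028915@qq.com
ZENG Wei-hao, male, was born in Yiyang, Hunan Province, in April 2004. He is an undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang College), Hunan University of Technology and Business. His main research interest is semantic segmentation. E-mail: 2307225478@qq.com
FANG Xiao-ping, female, was born in Loudi, Hunan Province, in December 1984. She is a professor at the School of Mathematics and Statistics, Hunan University of Technology and Business. Her main research interests are machine learning and statistical analysis methods for big data. E-mail: fxp1222@163.com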
Received: 2025-03-12
Revised: 2025-06-10
Published in print: 2025-06-25
HUANG Yi-sha, JIANG Lin, GUAN Ya-fei, et al. DFRNet: A Semantic Segmentation Method Inspired with Physical Mechanism of Diffusion-Focus[J]. Acta Electronica Sinica, 2025, 53(06): 1755-1770. DOI:10.12263/DZXB.20250186 (in Chinese)
To address the information loss caused by downsampling in image semantic segmentation, as well as the limitations common to existing upsampling methods in complex scenes (loss of global information, blurred details, unstable generation, and redundant information), this paper proposes DFRNet, a lightweight semantic segmentation model that incorporates a physics-inspired diffusion-focusing mechanism. Drawing on the surface tension of liquids, the model introduces a diffusion-focusing mechanism and adds a dynamic context window selection (DWS) module as a regulating mechanism, which together form a new upsampling method, physics-inspired energy propagation upsampling (PIEPU). PIEPU comprises three mechanisms: diffusion, focusing, and regulation, which are responsible for expanding global context, enhancing features in key regions, and optimizing information flow, respectively. Working together, they improve fine-grained perception and semantic consistency in complex scenes. Experiments on 14 datasets spanning 7 categories show that DFRNet outperforms other state-of-the-art models in mean intersection over union (mIoU), F1 score, and Accuracy: across datasets, mIoU improves by 0.165% to 4.259%, F1 score by 0.140% to 2.888%, and Accuracy by 0.035% to 1.386%, which supports the robustness and generalization ability of the method across diverse tasks. The model size is only 3.34 MB, making DFRNet suitable for lightweight real-time applications.
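The three reported metrics (mIoU, F1 score, and Accuracy) can all be derived from a per-class confusion matrix. The following is a minimal sketch of that computation, not the authors' evaluation code: the function names `confusion_matrix` and `segmentation_metrics` are hypothetical, and macro averaging over classes is an assumption, since the abstract does not state how the per-class scores are aggregated.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Pixel accuracy, mean IoU and macro-averaged F1 (averaging is an assumption)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # pixels of class c predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        if tp + fp + fn == 0:
            continue  # class absent from both labels and predictions: skip it
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return {
        "accuracy": correct / total if total else 0.0,
        "mIoU": sum(ious) / len(ious) if ious else 0.0,
        "F1": sum(f1s) / len(f1s) if f1s else 0.0,
    }


# Toy example: flattened labels of a 2x3 image with two classes.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, num_classes=2))
print(metrics)
```

For a single class, F1 = 2·IoU / (1 + IoU), so the two scores rank predictions identically per class, while their averages over classes can still differ in how they order models.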