DFRNet: A Semantic Segmentation Method Inspired with Physical Mechanism of Diffusion-Focus

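HUANG Yi-sha (黄依莎); JIANG Lin (姜林); GUAN Ya-fei (管亚菲); ZHANG Ya-sha (张亚莎); LIANG Xin (梁欣); ZENG Wei-hao (曾伟豪); FANG Xiao-ping (方晓萍)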

doi:10.12263/DZXB.20250186


Large Models and the Internet | Updated: 2025-10-16

- Acta Electronica Sinica, 2025, Vol. 53, No. 6, pp. 1755-1770
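- Author affiliations:
  1. School of Artificial Intelligence and Advanced Computing, Hunan University of Technology and Business, Changsha, Hunan 410000, China
  2. Xiangjiang Laboratory, Changsha, Hunan 410000, China
  3. School of Intelligent Engineering and Manufacturing, Hunan University of Technology and Business, Changsha, Hunan 410000, China
  4. School of Mathematics and Statistics, Hunan University of Technology and Business, Changsha, Hunan 410000, China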
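- Author biographies:
  HUANG Yi-sha, female, born in February 2004 in Pingxiang, Jiangxi Province. Undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang Academy), Hunan University of Technology and Business. Main research interest: semantic segmentation. E-mail: 2142827479@qq.com
  JIANG Lin, male, born in November 1977 in Changde, Hunan Province. Professor at the School of Artificial Intelligence and Advanced Computing (Xiangjiang Academy), Hunan University of Technology and Business. Main research interests: intelligent speech processing, machine vision, and robot applications. E-mail: jlcdf@163.com
  GUAN Ya-fei, female, born in August 2005 in Yongzhou, Hunan Province. Undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang Academy), Hunan University of Technology and Business. Main research interest: semantic segmentation. E-mail: 2645740476@qq.com
  ZHANG Ya-sha, female, born in May 2004 in Liuyang, Hunan Province. Undergraduate student at the School of Intelligent Engineering and Manufacturing, Hunan University of Technology and Business. Main research interest: semantic segmentation. E-mail: 2980083632@qq.com
  LIANG Xin, female, born in May 2005 in Chenzhou, Hunan Province. Undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang Academy), Hunan University of Technology and Business. Main research interest: semantic segmentation. E-mail: 3278028915@qq.com
  ZENG Wei-hao, male, born in April 2004 in Yiyang, Hunan Province. Undergraduate student at the School of Artificial Intelligence and Advanced Computing (Xiangjiang Academy), Hunan University of Technology and Business. Main research interest: semantic segmentation. E-mail: 2307225478@qq.com
  FANG Xiao-ping, female, born in December 1984 in Loudi, Hunan Province. Professor at the School of Mathematics and Statistics, Hunan University of Technology and Business. Main research interests: machine learning and statistical analysis methods for big data. E-mail: fxp1222@163.com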
- Funding:
  Major Project of Xiangjiang Laboratory (23XJ01003; 23XJ01009); Key Scientific Research Project of the Education Department of Hunan Province (22A0441)
- DOI: 10.12263/DZXB.20250186
  CLC number: TP391
- Received: 2025-03-12; Revised: 2025-06-10; Published in print: 2025-06-25
HUANG Yi-sha, JIANG Lin, GUAN Ya-fei, et al. DFRNet: A Semantic Segmentation Method Inspired with Physical Mechanism of Diffusion-Focus[J]. Acta Electronica Sinica, 2025, 53(06): 1755-1770. DOI: 10.12263/DZXB.20250186

Abstract

To address the information loss caused by downsampling in image semantic segmentation, as well as the common limitations of existing upsampling methods in complex scenes (loss of global information, blurred details, unstable generation, and redundant information), this paper proposes DFRNet, a lightweight semantic segmentation model that incorporates a physics-inspired diffusion-focusing mechanism. Inspired by the surface tension of liquids, the model introduces a diffusion-focusing mechanism and adds a dynamic context window selection (DWS) module as a regulating and optimizing mechanism, which together yield a new physics-inspired energy propagation upsampling (PIEPU) method. PIEPU comprises three mechanisms: diffusion, focusing, and regulation. They are responsible for expanding global contextual information, enhancing features in critical regions, and optimizing information flow, respectively, and they jointly improve fine-grained perception and semantic consistency in complex scenes. Experiments on 14 datasets covering 7 categories show that DFRNet outperforms other state-of-the-art models in mean intersection over union (mIoU), F1 score, and Accuracy. Across datasets, mIoU improves by 0.165% to 4.259%, F1 score by 0.140% to 2.888%, and Accuracy by 0.035% to 1.386%, which supports the robustness and generalization of the method across diverse tasks. The model size is only 3.34 MB, making it suitable for lightweight real-time applications.
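The three metrics reported in the abstract can be illustrated with a small sketch. It assumes the score reported alongside mIoU and Accuracy is the F1 score, that both mIoU and F1 are averaged over classes, and that Accuracy means overall pixel accuracy; the paper's exact averaging convention is not stated here, so these are assumptions. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical and are not taken from the DFRNet code.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (mIoU, class-averaged F1, pixel accuracy).

    Assumes the matrix contains at least one pixel.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # class c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp     # other pixels predicted as class c
        if tp + fp + fn == 0:
            continue  # class absent from both labels and predictions: skip it
        ious.append(tp / (tp + fp + fn))              # IoU = TP / (TP + FP + FN)
        f1s.append(2 * tp / (2 * tp + fp + fn))       # F1 = 2TP / (2TP + FP + FN)
    accuracy = sum(cm[c][c] for c in range(n)) / total
    return sum(ious) / len(ious), sum(f1s) / len(f1s), accuracy


# Toy example: six flattened pixels, three classes.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
miou, f1, acc = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
print(f"mIoU={miou:.3f}  F1={f1:.3f}  Accuracy={acc:.3f}")
```

In this toy case every class has an IoU of 0.5 and an F1 of 2/3, which shows why F1 is never lower than IoU for the same class and why improvements in the two metrics need not be equal in size.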


