

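1. School of Artificial Intelligence and Advanced Computing, Hunan University of Technology and Business, Changsha, Hunan 410000
2. Xiangjiang Laboratory, Changsha, Hunan 410000
3. School of Advanced Interdisciplinary Studies, Hunan University of Technology and Business, Changsha, Hunan 410000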
Received: 23 May 2025
Revised: 4 August 2025
Published: 25 October 2025
LIANG Xin, JIANG Lin, PENG Chao, et al. DynKANet: Dynamic Fusion Network for Real-Time Semantic Segmentation Based on Kolmogorov-Arnold Representation Theorem[J]. Acta Electronica Sinica, 2025, 53(10): 3671-3691. DOI: 10.12263/DZXB.20250412
Semantic segmentation in complex scenes struggles to balance global semantic context against local fine-grained detail during feature fusion. To address this, we propose the dynamic Kolmogorov-Arnold network (DynKANet), a segmentation framework inspired by the Kolmogorov-Arnold (KA) representation theorem. The architecture comprises a multi-level feature extraction module and a dynamic feature fusion module. In the feature extraction stage, a residual-connected U-shaped context enhancement module captures global semantics, and a difference-map-based feature refinement module captures local detail, which together strengthen feature representation. Building on these features, a KA-inspired dynamic feature fusion module applies nested inner and outer functions to further extract and fit the global and local features produced by the preceding modules, and a self-learned dynamic fusion strategy keeps the two kinds of information balanced and complementary, mitigating the difficulty of fusing them. In addition, we design a dynamic joint loss function, CE/TopK+Dice, governed by a conditional trigger strategy, to strengthen the model's ability to learn from hard samples. Experiments on 10 public datasets spanning 5 domains show that DynKANet performs well on all of them, with an average performance gain of 8% per dataset, indicating strong generalization ability and practical potential.
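To make the CE/TopK+Dice idea concrete, the following is a minimal pure-Python sketch for the binary case, not the authors' implementation. The abstract only names the loss terms and a conditional trigger, so everything else here is an assumption chosen for illustration: the function names, the equal weighting of the two terms, the `k_ratio` value, and in particular the trigger rule (switch from plain cross-entropy to its top-k variant once a validation score has stalled).

```python
import math

def cross_entropy_per_pixel(probs, targets, eps=1e-7):
    """Binary cross-entropy per pixel; probs are foreground probabilities."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return losses

def topk_mean(losses, k_ratio):
    """Mean of the k largest per-pixel losses, i.e. the hardest pixels."""
    k = max(1, int(len(losses) * k_ratio))
    return sum(sorted(losses, reverse=True)[:k]) / k

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss with additive smoothing."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(targets) + smooth)

def should_use_topk(val_history, patience=3, min_delta=1e-3):
    """Hypothetical trigger (assumption): fire when the best validation score
    of the last `patience` epochs has not improved on the earlier best."""
    if len(val_history) <= patience:
        return False
    best_before = max(val_history[:-patience])
    recent_best = max(val_history[-patience:])
    return recent_best - best_before < min_delta

def joint_loss(probs, targets, use_topk, k_ratio=0.25):
    """CE + Dice by default; TopK-CE + Dice once the trigger has fired."""
    per_pixel = cross_entropy_per_pixel(probs, targets)
    if use_topk:
        ce_term = topk_mean(per_pixel, k_ratio)
    else:
        ce_term = sum(per_pixel) / len(per_pixel)
    return ce_term + dice_loss(probs, targets)

# Usage: four pixels, the last one is a confident mistake (a "hard" pixel).
probs = [0.9, 0.8, 0.2, 0.6]
targets = [1, 1, 0, 0]
early = joint_loss(probs, targets, use_topk=should_use_topk([0.5, 0.6, 0.7]))
late = joint_loss(probs, targets, use_topk=should_use_topk([0.5, 0.7, 0.7, 0.7, 0.7]))
```

In this sketch the top-k term averages only the worst-classified pixels, so once the trigger fires the loss concentrates on hard samples while the Dice term continues to reward region-level overlap.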