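1. School of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021
2. Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, Ningxia 750021
3. School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia 750004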
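ZHOU Tao: male, born in 1977 in Wuzhong, Ningxia Hui Autonomous Region. He is a professor at the School of Computer Science and Engineering, North Minzu University. His main research interests are medical image analysis and processing, deep learning, and pattern recognition. E-mail: zhoutaonxmu@126.com
NIU Yu-xia: female, born in 2000 in Changzhi, Shanxi Province. She is a master's student at the School of Computer Science and Engineering, North Minzu University. Her main research interest is intelligent medical image analysis and processing. E-mail: nyx2607133584@163.com
YE Xin-yu: male, born in 1997 in Tianmen, Hubei Province. He was formerly a master's student at the School of Computer Science and Engineering, North Minzu University. His main research interest is intelligent medical image analysis and processing. E-mail: 3303626778@qq.com
LIU Long: male, born in 2000 in Anqing, Anhui Province. He is a master's student at the School of Computer Science and Engineering, North Minzu University. His main research interest is intelligent medical image analysis and processing. E-mail: liulong5254@163.com
LU Hui-ling: female, born in 1976 in Baoding, Hebei Province. She is a professor at the School of Medical Information and Engineering, Ningxia Medical University. Her main research interests include medical image analysis and processing, deep learning, and pattern recognition. E-mail: lu_huiling@163.com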
Received: 2024-07-09
Revised: 2025-02-27
Published in print: 2025-03-25
ZHOU Tao, NIU Yu-xia, YE Xin-yu, et al. Cross-Modal Light-3Dformer Model for Lung Tumor Classification[J]. Acta Electronica Sinica, 2025, 53(03): 951-961. DOI:10.12263/DZXB.20240642
Recognition of lung tumors in three-dimensional (3D) multimodal positron emission tomography/computed tomography (PET/CT) images using deep learning is an important research direction. Lung tumor lesions have irregular spatial shapes and blurred boundaries with the surrounding tissue, which makes it difficult for a model to fully extract tumor features, and 3D tasks impose a high computational cost on the model. To address these problems, this paper proposes a cross-modal Light-3Dformer model for 3D lung tumor recognition. The main contributions are as follows. First, a main-auxiliary network structure is adopted, in which the backbone network extracts PET/CT image features and the auxiliary network extracts PET image features and CT image features; lightweight cross-modal collaborative attention is used to achieve multimodal feature enhancement and interactive learning. Second, a Light-3Dformer module is designed. In this module, the two matrix multiplication operations of the Transformer are replaced by the linear element-wise multiplication operation of Lightformer, a global attention mechanism. A cascaded Lightformer structure is designed whose output feature map is fused with the initial input feature map; by running in parallel and fusing more deep and shallow features, it keeps the model lightweight while extracting rich gradient information. A parameter-free attention mechanism is also designed, which strengthens lung tumor feature extraction along three dimensions: channel, space, and slice. Third, a Light Cross-modal Collaborative Attention Module (LCCAM) is designed, which fully learns the advantageous cross-modal information of 3D multimodal images and performs interactive learning on deep and shallow features. Finally, ablation and comparative experiments are conducted. On a self-built 3D multimodal lung tumor dataset, the proposed model achieves an accuracy of 90.19% and an area under the curve (AUC) of 89.81% while also having the best computational cost and running time. Compared with the 3D-SwinTransformer-S model, the number of parameters is reduced by a factor of 117 and the computational cost by a factor of 400. The experimental results show that the model better extracts the multimodal information of lung tumor lesions, which provides a new approach to lightweight design and multimodal interaction in 3D deep learning models.
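The abstract does not spell out how Lightformer replaces the two matrix multiplications, so the following is only an illustrative sketch of the general idea, not the authors' module. It assumes a simple global-context design in the spirit of separable or additive attention: each token gets one scalar score, the keys are pooled into a single context vector, and that vector is multiplied element-wise into every value vector. The function names, the unlearned scoring rule (mean of the query features), and the feature-map size are all hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul_attention(q, k, v):
    """Standard self-attention: two matrix products, Q @ K^T and A @ V."""
    n, d = len(q), len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i in range(n):
        # Row i of Q @ K^T: n dot products of length d.
        scores = [scale * sum(q[i][c] * k[j][c] for c in range(d)) for j in range(n)]
        weights = softmax(scores)
        # Row i of A @ V: weighted sum of all n value vectors.
        out.append([sum(weights[j] * v[j][c] for j in range(n)) for c in range(d)])
    return out

def elementwise_attention(q, k, v):
    """Hypothetical linear variant (an assumption, not the paper's Lightformer)."""
    n, d = len(q), len(q[0])
    # One scalar score per token; a real model would learn this projection.
    weights = softmax([sum(row) / d for row in q])
    # Global context: weighted sum of the key vectors, about n * d multiplications.
    context = [sum(weights[i] * k[i][c] for i in range(n)) for c in range(d)]
    # Element-wise product of the context with every value vector, n * d multiplications.
    return [[context[c] * v[i][c] for c in range(d)] for i in range(n)]

def multiply_counts(depth, height, width, channels):
    """Rough multiplication counts for one attention layer over a 3D feature map."""
    n = depth * height * width        # number of tokens (voxels)
    quadratic = 2 * n * n * channels  # Q @ K^T plus A @ V
    linear = 2 * n * channels         # context pooling plus element-wise product
    return quadratic, linear

# Hypothetical 8 x 16 x 16 feature map with 32 channels.
quadratic, linear = multiply_counts(depth=8, height=16, width=16, channels=32)
print(quadratic, linear, quadratic // linear)
```

Under this rough cost model the ratio between the two counts equals the number of tokens (2048 for the hypothetical feature map above), which illustrates why attention whose cost is quadratic in the token count becomes expensive on 3D volumes. The figures of 117 and 400 reported in the abstract are measured against 3D-SwinTransformer-S as a whole and cannot be derived from this sketch.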