M<sup>3</sup> Res-Transformer: Chest X-ray Image Recognition Model of COVID-19

ZHOU Tao; LIU Yun-can; HOU Sen-bao; CHANG Xiao-yu; YE Xin-yu; LU Hui-ling

doi:10.12263/DZXB.20220999


PAPERS | Updated: 2025-12-11

- ACTA ELECTRONICA SINICA Vol. 52, Issue 2, Pages: 589-601 (2024)
- Author affiliations:
  1. School of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021
  2. Key Laboratory of Image and Graphics Intelligent Processing of the State Ethnic Affairs Commission, North Minzu University, Yinchuan, Ningxia 750021
  3. School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, Ningxia 750004
- Funding:
  Key Research and Development Plan of Ningxia Autonomous Region (2020BEB04022); Graduate Innovation Project of North Minzu University (YCX22198; YCX22190)
- CLC: TP399
- Received: 31 August 2022, Revised: 3 January 2023, Published: 25 February 2024
ZHOU Tao, LIU Yun-can, HOU Sen-bao, et al. M³ Res-Transformer: Chest X-ray Image Recognition Model of COVID-19[J]. Acta Electronica Sinica, 2024, 52(02): 589-601.

Abstract

COVID-19 has seriously affected human life and health since its outbreak. In recent years, residual neural networks have been widely used in COVID-19 recognition tasks to help doctors diagnose COVID-19 patients quickly. However, lesion regions in COVID-19 images are complex in shape, vary in size, and have blurred boundaries with the surrounding tissue, which makes it difficult for a network to extract effective features. To address these problems, an M³ Res-Transformer model for COVID-19 chest X-ray image recognition is proposed. Res-Transformer serves as the backbone network; it combines ResNet and ViT to integrate local lesion features with global features. A mixed residual attention module (mraM) is designed that considers the interdependence of both channels and spatial positions, enhancing the feature representation ability of the network. To enlarge the receptive field and extract multi-scale features, a multi-scale dilated residual module (mdrM) is constructed by stacking dilated convolutions with different dilation rates, and three mdrMs with gradually shrinking scales are used for multi-scale feature extraction according to the differences in feature scale at different levels. A contextual cross-awareness module (ccaM) is proposed, which uses the semantic information in deep features to guide shallow features, then embeds the spatial information of shallow features into deep features, and uses a cross-weighted attention mechanism to aggregate deep and shallow features efficiently, obtaining richer contextual information. To verify the effectiveness of the proposed model, experiments were conducted on a COVID-19 chest X-ray image dataset, comprising comparisons with advanced CNN classification models, with ResNet50 models that incorporate different attention mechanisms, and with Transformer-based classification models, as well as ablation experiments. The results show that the Acc, Pre, Rec, F1-Score and Spe of the proposed model are 96.33%, 96.36%, 96.33%, 96.35% and 96.26%, respectively. The model improves recognition accuracy in the COVID-19 chest X-ray image recognition task, which is further verified by visualization, and provides a valuable reference for the aided diagnosis of COVID-19.
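
The evaluation indices reported in the abstract (accuracy, precision, recall, F1-Score and specificity) can all be derived from a confusion matrix. The following is a minimal sketch of that calculation, not the authors' evaluation code. The abstract does not state how per-class values are averaged, so macro-averaging is assumed here; the function name `macro_metrics` and the example matrix are hypothetical.

```python
from typing import Dict, List


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged Pre/Rec/F1/Spe.

    cm[i][j] is the number of samples with true class i predicted as class j.
    Macro-averaging (unweighted mean over classes) is an assumption.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    pre, rec, f1, spe = [], [], [], []
    for k in range(n):
        tp = cm[k][k]                                # class k predicted as k
        fn = sum(cm[k]) - tp                         # class k predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp    # other predicted as k
        tn = total - tp - fn - fp                    # everything else
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        pre.append(p)
        rec.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
        spe.append(tn / (tn + fp) if tn + fp else 0.0)
    return {
        "Acc": sum(cm[k][k] for k in range(n)) / total,
        "Pre": sum(pre) / n,
        "Rec": sum(rec) / n,
        "F1": sum(f1) / n,
        "Spe": sum(spe) / n,
    }


# Hypothetical 3-class confusion matrix, for illustration only.
example_cm = [
    [90, 5, 5],
    [4, 92, 4],
    [3, 3, 94],
]
for name, value in macro_metrics(example_cm).items():
    print(f"{name}: {value:.4f}")
```

With balanced classes, macro-averaged recall equals overall accuracy, so matching Acc and Rec values are consistent with, though not proof of, such a setup.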
