

School of Information, North China University of Technology, Beijing 100144, China
Received: 8 December 2021
Revised: 17 May 2022
Published: 25 January 2023
ZHANG Yong-mei, SUN Jie. A Dynamic-Static Dual Input Deep Neural Network Algorithm for Diagnosing COVID-19 by Cough[J]. Acta Electronica Sinica, 2023, 51(01): 202-212. DOI: 10.12263/DZXB.20211630.
Coronavirus disease 2019 (COVID-19) has had a severe impact worldwide, and researchers have carried out extensive work on epidemic prevention and control. Diagnosing COVID-19 from cough sounds is non-contact, low-cost, and easy to obtain; however, such research remains relatively scarce in China. Mel frequency cepstral coefficient (MFCC) features represent only the static characteristics of a sound, whereas first-order differential MFCC features also reflect its dynamic characteristics. To better prevent and treat COVID-19, this paper proposes a dynamic-static dual-input deep neural network algorithm that diagnoses COVID-19 from cough sounds. Based on the Coswara dataset, cough audio is clipped, MFCC and first-order differential MFCC features are extracted, and a dual-input neural network model is trained on the dynamic and static features. The model adopts a statistics pooling layer, so MFCC features of different lengths can be used as input. Experimental results show that, compared with existing models, the proposed algorithm significantly improves recognition accuracy, recall, specificity, and F1-score.
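The two feature-side ideas in the abstract, a first-order differential of the MFCC sequence and a statistics pooling step that accepts inputs of different lengths, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 13-coefficient shape, the simple frame-to-frame difference, and the plain mean-plus-standard-deviation pooling are all assumptions made for illustration, and random arrays stand in for real MFCC matrices.

```python
import numpy as np


def first_order_delta(mfcc: np.ndarray) -> np.ndarray:
    """Frame-to-frame difference along time (hypothetical helper).

    mfcc has shape (n_coeffs, n_frames). The first frame is prepended so the
    output keeps the same number of frames; its first column is therefore zero.
    Assumption: a plain difference, not a regression-window delta.
    """
    return np.diff(mfcc, axis=1, prepend=mfcc[:, :1])


def statistics_pooling(features: np.ndarray) -> np.ndarray:
    """Collapse the time axis into per-coefficient mean and standard deviation.

    Input shape (n_coeffs, n_frames) -> output shape (2 * n_coeffs,), so the
    result has a fixed size regardless of how many frames the clip has.
    """
    return np.concatenate([features.mean(axis=1), features.std(axis=1)])


# Random stand-ins for MFCC matrices of two cough clips with different lengths.
rng = np.random.default_rng(0)
short_clip = rng.normal(size=(13, 40))
long_clip = rng.normal(size=(13, 95))

for clip in (short_clip, long_clip):
    static_vec = statistics_pooling(clip)                      # static branch
    dynamic_vec = statistics_pooling(first_order_delta(clip))  # dynamic branch
    print(clip.shape, static_vec.shape, dynamic_vec.shape)
```

Both clips yield vectors of the same size for each branch, which is what lets a downstream dual-input classifier accept cough recordings of varying duration. In a trained network the pooling would normally sit after learned frame-level layers rather than directly on the raw features.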