

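School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China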
Received: 6 January 2023
Revised: 25 August 2023
Published: 25 November 2023
JIANG Ling-yun, JU Jin-heng, XU Jia, et al. A Lightweight Music Recognition Method Based on Improved CRNN[J]. Acta Electronica Sinica, 2023, 51(11): 3167-3175. DOI: 10.12263/DZXB.20230031.
Deep learning-based music score recognition methods have improved recognition accuracy, but model training suffers from a long time per iteration and a large total number of iterations. This paper proposes CRNN-lite (lightweight Convolutional Recurrent Neural Networks), a lightweight music score recognition method based on an improved convolutional recurrent neural network. In the convolutional layers, CRNN-lite introduces residual depthwise separable convolution, which reduces computation and speeds up feature-map extraction. In the recurrent layers, it uses bidirectional simple recurrent units, whose parallel computation avoids the strong dependence on serial computation. In the transcription layer, it adjusts the parameters of the cross-entropy function so that training targets imbalanced sample data. Experimental results show that the method speeds up training: a single iteration takes 43% of the time required by the baseline network. On distorted image data, the symbol error rate is 1.12% and the sequence error rate is 14.5%, both better than those of the compared schemes.
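The two error metrics quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the definitions commonly used in optical music recognition, which the abstract itself does not spell out: the symbol error rate is the total edit distance between predicted and reference symbol sequences divided by the total number of reference symbols, and the sequence error rate is the fraction of sequences containing at least one error. The function names and the symbol strings in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs: List[Tuple[Sequence[str], Sequence[str]]]) -> Tuple[float, float]:
    """Return (symbol error rate, sequence error rate) over (reference, prediction) pairs.

    Assumed definitions (not taken from the paper):
      symbol error rate   = total edit distance / total reference symbols
      sequence error rate = sequences with at least one error / number of sequences
    """
    total_symbols = sum(len(ref) for ref, _ in pairs)
    if not pairs or total_symbols == 0:
        raise ValueError("need at least one non-empty reference sequence")
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    wrong_sequences = sum(1 for ref, hyp in pairs if list(ref) != list(hyp))
    return total_edits / total_symbols, wrong_sequences / len(pairs)
```

For example, given the references `["clef-G2", "note-C4_quarter", "barline"]` and `["clef-G2", "note-D4_half", "rest-quarter", "barline"]`, a model that gets the first sequence exactly right and mislabels one symbol in the second has a symbol error rate of 1/7 but a sequence error rate of 1/2. Under these definitions a single wrong symbol marks the whole sequence as wrong, which is why a sequence error rate can be much higher than the corresponding symbol error rate.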