[1] KRIZHEVSKY A,SUTSKEVER I,HINTON G E.ImageNet classification with deep convolutional neural networks[A].Advances in Neural Information Processing Systems[C].[S.l.]:NIPS,2012.1097-1105.
[2] CIRESAN D,GIUSTI A,et al.Deep neural networks segment neuronal membranes in electron microscopy images[A].Advances in Neural Information Processing Systems[C].[S.l.]:NIPS,2012.2843-2851.
[3] DAHL G E,YU D,DENG L,et al.Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition[J].IEEE Transactions on Audio,Speech,and Language Processing,2011,20(1):30-42.
[4] DENG L,HINTON G,KINGSBURY B.New types of deep neural network learning for speech recognition and related applications:An overview[A].2013 IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2013.8599-8603.
[5] COLLOBERT R,WESTON J.A unified architecture for natural language processing:Deep neural networks with multitask learning[A].Proceedings of the 25th International Conference on Machine Learning[C].USA:ACM,2008.160-167.
[6] HIRSCHBERG J,et al.Advances in natural language processing[J].Science,2015,349(6245):261-266.
[7] BAHDANAU D,CHOROWSKI J,et al.End-to-end attention-based large vocabulary speech recognition[A].2016 IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2016.4945-4949.
[8] GAO F C,XIN L I,YONG H Y.Using highway connections to enable deep small-footprint LSTM-RNNs for speech recognition[J].Chinese Journal of Electronics,2019,28(1):107-112.
[9] ZEYER A,IRIE K,et al.Improved training of end-to-end attention models for speech recognition[A].Interspeech[C].Hyderabad:[s.n.],2018.1845-1859.
[10] MERBOLDT A,ZEYER A,et al.An analysis of local monotonic attention variants[A].Interspeech[C].Graz:[s.n.],2019.1398-1402.
[11] BAHAR P,ZEYER A,et al.On using 2D sequence-to-sequence models for speech recognition[A].IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2019.5671-5675.
[12] ZWEIG G,et al.Advances in all-neural speech recognition[A].IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2017.4805-4809.
[13] CHOROWSKI J,BAHDANAU D,CHO K,et al.End-to-end continuous speech recognition using attention-based recurrent NN:First results[J].arXiv preprint,2014.
[14] CHAN W,et al.Listen,attend and spell:A neural network for large vocabulary conversational speech recognition[A].IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2016.4960-4964.
[15] BAHDANAU D,CHOROWSKI J,et al.End-to-end attention-based large vocabulary speech recognition[A].IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2016.4945-4949.
[16] MARTINS A,ASTUDILLO R.From softmax to sparsemax:A sparse model of attention and multi-label classification[A].International Conference on Machine Learning[C].[S.l.]:[s.n.],2016.1614-1623.
[17] KIM Y.Convolutional neural networks for sentence classification[A].Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)[C].USA:Association for Computational Linguistics,2014.1746-1751.
[18] SUTSKEVER I,VINYALS O,LE Q V.Sequence to sequence learning with neural networks[A].Advances in Neural Information Processing Systems[C].[S.l.]:NIPS,2014.3104-3112.
[19] CHOROWSKI J K,et al.Attention-based models for speech recognition[A].Advances in Neural Information Processing Systems[C].[S.l.]:NIPS,2015.577-585.
[20] GODFREY J J,HOLLIMAN E C,MCDANIEL J.SWITCHBOARD:Telephone speech corpus for research and development[A].IEEE International Conference on Acoustics,Speech,and Signal Processing[C].USA:IEEE,1992.517-520.
[21] WENBIN J,PEILIN L,FEI W.Speech magnitude spectrum reconstruction from MFCCs using deep neural network[J].Chinese Journal of Electronics,2018,27(2):393-398.
[22] WEN M C,TIAN C H.The multi-weight neuron with geometry algorithm and its application[J].Chinese Journal of Electronics,2008,17(2):261-264.
[23] JI X U,JIE L P,YONG H Y.Agglutinative language speech recognition using automatic allophone deriving[J].Chinese Journal of Electronics,2016,25(2):134-139.
[24] KIM S,HORI T.Joint CTC-attention based end-to-end speech recognition using multi-task learning[A].IEEE International Conference on Acoustics,Speech and Signal Processing[C].USA:IEEE,2017.4835-4839.
[25] WATANABE S,HORI T,KIM S,et al.Hybrid CTC/attention architecture for end-to-end speech recognition[J].IEEE Journal of Selected Topics in Signal Processing,2017,11(8):1240-1253.