
2020, Vol. 39, Issue 5

Research Article

September 2020, pp. 483-489
References
1. C. Weng and D. Yu, "A comparison of lattice-free discriminative training criteria for purely sequence-trained neural network acoustic models," Proc. ICASSP, 6430-6434 (2019). doi:10.1109/ICASSP.2019.8683664
2. W. Michel, R. Schlüter, and H. Ney, "Comparison of lattice-free and lattice-based sequence discriminative training criteria for LVCSR," arXiv:1907.01409 (2019). doi:10.21437/Interspeech.2019-2254
3. J. Jorge, A. Gimenez, J. Iranzo-Sanchez, J. Civera, A. Sanchis, and A. Juan, "Real-time one-pass decoder for speech recognition using LSTM language models," Proc. Interspeech, 3820-3824 (2019). doi:10.21437/Interspeech.2019-2798
4. J. Y. Chung, C. Gulcehre, K. H. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," arXiv:1412.3555 (2014).
5. V. Peddinti, D. Povey, and S. Khudanpur, "A time delay neural network architecture for efficient modeling of long temporal contexts," Proc. Interspeech, 2-6 (2015).
6. B. Christian and T. Griffiths, Algorithms to Live By: The Computer Science of Human Decisions, Chapter 7: Overfitting (William Collins, Hampshire, 2017), pp. 149-168.
7. D. Povey, H. Hadian, P. Ghahremani, K. Li, and S. Khudanpur, "A time-restricted self-attention layer for ASR," Proc. ICASSP, 5874-5878 (2018). doi:10.1109/ICASSP.2018.8462497
8. G. Cheng, D. Povey, L. Huang, J. Xu, S. Khudanpur, and Y. Yan, "Output-gate projected gated recurrent unit for speech recognition," Proc. Interspeech, 1793-1797 (2018). doi:10.21437/Interspeech.2018-1403
9. D. Bahdanau, K. H. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," Proc. ICLR, 1-15 (2015).
10. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," Proc. NIPS, 5999-6009 (2017).
11. M.-T. Luong, H. Pham, and C. D. Manning, "Effective approaches to attention-based neural machine translation," Proc. EMNLP, 1412-1421 (2015). doi:10.18653/v1/D15-1166
12. D. Povey, H. Hadian, P. Ghahremani, K. Li, and S. Khudanpur, "A time-restricted self-attention layer for ASR," Proc. ICASSP, 5874-5878 (2018). doi:10.1109/ICASSP.2018.8462497
13. Zeroth Korean, http://openslr.org/40/ (Last viewed June 4, 2020).
14. A. Stolcke, "SRILM - an extensible language modeling toolkit," Proc. ICSLP, 901-904 (2002).
15. H. Xu, T. Chen, D. Gao, Y. Wang, K. Li, N. Goel, Y. Carmiel, D. Povey, and S. Khudanpur, "A pruned RNNLM lattice-rescoring algorithm for automatic speech recognition," Proc. ICASSP, 5929-5933 (2018). doi:10.1109/ICASSP.2018.8461974
16. D. S. Park, W. Chan, Y. Zhang, C.-C. Chiu, B. Zoph, E. D. Cubuk, and Q. V. Le, "SpecAugment: a simple data augmentation method for automatic speech recognition," Proc. Interspeech, 2613-2617 (2019). doi:10.21437/Interspeech.2019-2680
17. Y.-Y. Wang, A. Acero, and C. Chelba, "Is word error rate a good indicator for spoken language understanding accuracy," Proc. ASRU, 577-582 (2003).
Information
  • Publisher : The Acoustical Society of Korea
  • Publisher (Korean) : 한국음향학회
  • Journal Title : The Journal of the Acoustical Society of Korea
  • Journal Title (Korean) : 한국음향학회지
  • Volume : 39
  • No. : 5
  • Pages : 483-489
  • Received Date : 2020. 08. 07
  • Accepted Date : 2020. 09. 04