- Publisher: The Acoustical Society of Korea
- Publisher (Ko): 한국음향학회
- Journal Title: The Journal of the Acoustical Society of Korea
- Journal Title (Ko): 한국음향학회지
- Volume: 39
- No: 1
- Pages: 24-31
- Received Date: 2019. 11. 13
- Accepted Date: 2019. 12. 13