Speech Emotion Recognition Based on Unsupervised Autoencoder in the Intervention of Autism
Author: Ge Lei, Qiang Yan, Zhao Juanjuan
Fund Project:

National Natural Science Foundation of China (61540007, 61373100); Open Foundation of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (BUAA-VR-15KF02, BUAA-VR-16KF13)

    Abstract:

    Speech emotion recognition is an important research topic in human-computer interaction (HCI), and a speech emotion recognition system used in intervention therapy for autistic children can aid their rehabilitation. However, the emotional features in speech signals are numerous and heterogeneous, and feature extraction is itself a challenging task, which limits the recognition performance of the whole system. To address this problem, this paper proposes a speech emotion feature extraction algorithm that uses an unsupervised auto-encoding network to learn emotional features from speech signals automatically. A 3-layer auto-encoder network is constructed to extract speech emotion features, and the high-level features learned by the multi-layer encoding network are fed into an extreme learning machine (ELM) classifier for final recognition. The system achieves a recognition rate of 84.14%, which is higher than traditional methods based on hand-crafted (human-defined) features.
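
    Read as a pipeline, the abstract describes two stages: unsupervised, greedy layer-wise training of a stacked auto-encoder on acoustic feature vectors, followed by an extreme learning machine (ELM) classifier trained on the learned high-level codes. The sketch below is a minimal NumPy illustration of that pipeline, not the authors' implementation: the layer sizes, learning rate, regularization constant, number of ELM nodes, and the synthetic data standing in for acoustic features are all placeholder assumptions.

```python
# Minimal sketch: stacked auto-encoder feature learning + ELM classification.
# All sizes, rates, and the synthetic data below are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, n_hidden, epochs=100, lr=0.05):
    """Train one tied-weight auto-encoder layer by batch gradient descent."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b_h = np.zeros(n_hidden)   # encoder bias
    b_o = np.zeros(n_in)       # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b_h)        # encode
        R = sigmoid(H @ W.T + b_o)      # decode (tied weights)
        d_out = (R - X) * R * (1.0 - R)         # output-layer delta
        d_hid = (d_out @ W) * H * (1.0 - H)     # hidden-layer delta
        grad_W = X.T @ d_hid + d_out.T @ H      # gradient shared by both uses of W
        W   -= lr * grad_W / X.shape[0]
        b_h -= lr * d_hid.mean(axis=0)
        b_o -= lr * d_out.mean(axis=0)
    return W, b_h

def encode_stack(X, layer_sizes):
    """Greedy layer-wise training of a stacked encoder; returns codes and parameters."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(H, n_hidden)
        H = sigmoid(H @ W + b)          # output of this layer feeds the next
        params.append((W, b))
    return H, params

def elm_train(H, y, n_classes, n_nodes=200, reg=1e-3):
    """ELM: random hidden layer, output weights from regularized least squares."""
    W_r = rng.normal(0.0, 1.0, (H.shape[1], n_nodes))
    b_r = rng.normal(0.0, 1.0, n_nodes)
    G = sigmoid(H @ W_r + b_r)
    T = np.eye(n_classes)[y]            # one-hot targets
    beta = np.linalg.solve(G.T @ G + reg * np.eye(n_nodes), G.T @ T)
    return W_r, b_r, beta

def elm_predict(H, W_r, b_r, beta):
    return np.argmax(sigmoid(H @ W_r + b_r) @ beta, axis=1)

# Toy usage: random vectors stand in for per-utterance acoustic feature vectors.
X_train = rng.random((300, 120))        # 300 samples x 120-dim features
y_train = rng.integers(0, 4, 300)       # 4 emotion classes (placeholder)
codes, ae_params = encode_stack(X_train, layer_sizes=[80, 50, 30])
W_r, b_r, beta = elm_train(codes, y_train, n_classes=4)
train_acc = (elm_predict(codes, W_r, b_r, beta) == y_train).mean()
```

    The point of the ELM stage is that its output weights are obtained in closed form (a regularized least-squares solve) rather than by back-propagating through the whole stack, which keeps classifier training fast once the unsupervised codes are available.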

    References
    [1] Muller CL, Anacker AMJ, Veenstra-VanderWeele J. The serotonin system in autism spectrum disorder: From biomarker to animal models. Neuroscience, 2015.
    [2] Lovell B, Wetherell MA. The psychophysiological impact of childhood autism spectrum disorder on siblings. Research in Developmental Disabilities, 2016,49-50:226-234.
    [3] Szabó MK. Patterns of play activities in autism and typical development. A case study. Procedia-Social and Behavioral Sciences, 2014,140:630-637.
    [4] Rendall AR. Learning delays in a mouse model of autism spectrum disorder. Learning, 2015.
    [5] Muotri AR. The human model: Changing focus on autism research. Biological Psychiatry, 2015.
    [6] Jin Q, Chen SZ, Li XR, Yang G, Xu JP. Speech emotion recognition based on acoustic features. Computer Science, 2015,42(9):24-28 (in Chinese with English abstract).
    [7] Tao HW, Zha C, Liang RY, Zhang XR, Zhao L, Wang QY. Spectrogram feature extraction algorithm for speech emotion recognition. Journal of Southeast University (Natural Science Edition), 2015,45(5):817-821 (in Chinese with English abstract).
    [8] Mao QR, Bai LJ, Wang L, Zhan YZ. Emotion reasoning algorithm based on emotional context of speech. Pattern Recognition and Artificial Intelligence, 2014,27(9):826-834 (in Chinese with English abstract).
    [9] He L, Huang H, Liu XH. Speech emotion detection based on glottal signal features. Computer Engineering and Design, 2013,34(6):2147-2151 (in Chinese with English abstract).
    [10] Ye JX, Hu HX. Application of Hilbert marginal energy spectrum in speech emotion recognition. Computer Engineering and Applications, 2014,50(7):203-207 (in Chinese with English abstract).
    [11] Wu SQ, Falk TH, Chan WY. Automatic speech emotion recognition using modulation spectral features. Speech Communication, 2011,53(5):768-785.
    [12] Wu CH, Liang WB. Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels. IEEE Trans. on Affective Computing, 2011,2(1):10-21.
    [13] Bengio Y. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2009,2(1):1-127.
    [14] Hinton GE, Osindero S, Teh YW. A fast learning algorithm for deep belief nets. Neural Computation, 2006,18(7):1527-1554.
    [15] Tang JX, Deng CW, Huang GB. Extreme learning machine for multilayer perceptron. IEEE Trans. on Neural Networks and Learning Systems, 2015.
    [16] Vincent P, Larochelle H, Bengio Y, Manzagol PA. Extracting and composing robust features with denoising autoencoders. In: Proc. of the 25th Int'l Conf. on Machine Learning. ACM, 2008. 1096-1103.
    [17] Daliri MR. Combining extreme learning machines using support vector machines for breast tissue classification. Computer Methods in Biomechanics and Biomedical Engineering, 2015,18(2):185-191.
Cite this article

Ge L, Qiang Y, Zhao JJ. Speech emotion recognition based on unsupervised autoencoder in the intervention of autism. Ruan Jian Xue Bao/Journal of Software, 2016,27(S2):130-136.

History
  • Received: 2016-05-01
  • Revised: 2016-11-21
  • Published online: 2017-01-10