
Visual-only discrimination between native and non-native speech


Basic information

Published:
2014
Venue:
IEEE International Conference on Acoustics, Speech, and Signal Processing
Corresponding author:
M. Pantic
Authors: Christos Georgakis; Stavros Petridis; M. Pantic

Abstract

Accent is an important biometric characteristic that is defined by the presence of specific traits in the speaking style of an individual. These are identified by patterns in the speech production system, such as those present in the vocal tract or in lip movements. Evidence from linguistics and speech processing research suggests that visual information enhances speech recognition. Intrigued by these findings, along with the assumption that visually perceivable accent-related patterns are transferred from the mother tongue to a foreign language, we investigate the task of discriminating native from non-native speech in English, employing visual features only. Training and evaluation is performed on segments of continuous visual speech, captured by mobile phones, where all speakers read the same text. We apply various appearance descriptors to represent the mouth region at each video frame. Vocabulary-based histograms, being the final representation of dynamic features for all utterances, are used for recognition. Binary classification experiments, discriminating native and non-native speakers, are conducted in a subject-independent manner. Our results show that this task can be addressed by means of an automated approach that uses visual features only.
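
The abstract names two ingredients that are easy to get wrong in practice: turning per-frame mouth-region descriptors into one vocabulary-based histogram per utterance, and evaluating in a subject-independent manner. The following is a minimal sketch of both, not the authors' implementation. It assumes the frame descriptors have already been extracted and that a codebook (the "vocabulary") has already been learned, for example by clustering training frames. The function names, the nearest-codeword assignment, and the leave-one-subject-out protocol are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def utterance_histogram(frame_features, codebook):
    """Quantize per-frame features against a codebook and return a
    normalized histogram of codeword counts for one utterance.

    Assumption: the utterance has at least one frame, and the codebook
    was learned beforehand (e.g. by clustering training frames).
    """
    frame_features = np.asarray(frame_features, dtype=float)  # (n_frames, dim)
    codebook = np.asarray(codebook, dtype=float)              # (n_words, dim)
    # Squared Euclidean distance from every frame to every codeword.
    diffs = frame_features[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(dists, axis=1)
    counts = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return counts / counts.sum()


def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx) so that no speaker appears
    in both the training set and the test set of a fold."""
    subject_ids = np.asarray(subject_ids)
    for held_out in np.unique(subject_ids):
        test_mask = subject_ids == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy data (hypothetical): two codewords in a 2-D feature space and one
# four-frame utterance.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
frames = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.8], [0.8, 0.9]])
hist = utterance_histogram(frames, codebook)

# Four utterances from three hypothetical speakers.
subjects = ["s1", "s1", "s2", "s3"]
folds = list(leave_one_subject_out(subjects))
```

Each utterance histogram would then be fed, together with its native or non-native label, to any binary classifier trained on the `train_idx` utterances of a fold and scored on the `test_idx` utterances. Splitting by speaker rather than by utterance keeps the classifier from simply recognizing individual faces.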
