Unifying audio signal processing and machine learning: a fundamental framework for machine hearing
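Basic information
- Grant number: EP/L000776/1
- Principal investigator:
- Amount: $123,700
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2013
- Funding country: United Kingdom
- Duration: 2013 to (no data)
- Project status: Completed
- Source:
- Keywords: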
Project abstract
Modern technology is leading to a flood of audio data. For example, over seventy-two hours of unstructured and unlabelled sound-tracks are uploaded to internet sites every minute. Automatic systems are urgently needed for recognising audio content so that these sound-tracks can be tagged for categorisation and search. Moreover, an increasing proportion of recordings are made on hand-held devices in challenging environments that contain multiple sound sources and noise. Such uncurated and noisy data necessitate automatic systems for cleaning the audio content and separating sources from mixtures. On a related note, devices for the hearing impaired currently perform poorly in noise. In fact, this is a major reason why six million people in the UK who would benefit from a hearing aid do not use one (a market worth £18 billion p.a.). Patients fitted with cochlear implants suffer from similar limitations, and as the population ages more people are affected. It is clear that audio recognition and enhancement methods are required to stop us drowning in audio data, for processing in hearing devices, and to support new technological innovations.
Current approaches to these problems use a combination of audio signal processing (which places the audio data into a convenient format and reduces the data-rate) and machine learning (which removes noise, separates sources, or classifies the content). It is widely believed that these two fields must become increasingly integrated in the future. However, this union is currently a troubled one, suffering from four problems:
- Inefficiency: The methods are too inefficient when we have vast amounts of data (as is the case for audio-tracks on the web) or for real-time applications (as is necessary in hearing aids).
- Impoverished models: The machine learning modules tend to be statistically limited.
- Unadapted: The signal processing modules are unadapted, despite evidence from other fields, like computer vision, which suggests that automatic tuning leads to significant performance gains.
- Distorted mixtures: The signal processing modules introduce non-linear distortions which are not captured by the machine learning modules.
In this project we address these four limitations by introducing a new theoretical framework which unifies signal processing and machine learning. The key step is to view the signal processing module as solving an inference problem. Since the machine learning modules are often framed in this way, the two modules can be integrated into a single coherent approach, allowing technologies from the two fields to be completely integrated. In the project we will then use the new approach to develop efficient, rich, adaptive, and distortion-free approaches to audio denoising, source separation and recognition. We will evaluate the noise reduction and source separation algorithms on the hearing impaired, and the audio recognition algorithms on audio sound-track data.
We believe this new framework will form a foundation of the emerging field of machine hearing. In the future, machine hearing will be deployed in a vast range of applications from music processing tasks to augmented reality systems (in conjunction with technologies from computer vision). We believe that this project will kick-start this proliferation.
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Infinite-Horizon Gaussian Processes
- DOI:
- Published: 2018-11
- Journal:
- Impact factor: 0
- Authors: A. Solin; J. Hensman; Richard E. Turner
- Corresponding authors: A. Solin; J. Hensman; Richard E. Turner
Sparse Gaussian Process Variational Autoencoders
- DOI: 10.48550/arxiv.2010.10177
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Ashman M
- Corresponding author: Ashman M
On Sparse Variational Methods and the Kullback-Leibler Divergence between Stochastic Processes
- DOI: 10.17863/cam.15597
- Published: 2015-04
- Journal:
- Impact factor: 0
- Authors: A. G. Matthews; J. Hensman; Richard E. Turner; Zoubin Ghahramani
- Corresponding authors: A. G. Matthews; J. Hensman; Richard E. Turner; Zoubin Ghahramani
Deterministic Variational Inference for Robust Bayesian Neural Networks
- DOI:
- Published: 2018-09
- Journal:
- Impact factor: 0
- Authors: Anqi Wu; Sebastian Nowozin; Edward Meeds; Richard E. Turner; José Miguel Hernández-Lobato; Alexander L. Gaunt
- Corresponding authors: Anqi Wu; Sebastian Nowozin; Edward Meeds; Richard E. Turner; José Miguel Hernández-Lobato; Alexander L. Gaunt
The Multivariate Generalised von Mises Distribution: Inference and Applications
- DOI: 10.1609/aaai.v31i1.10943
- Published: 2016-02
- Journal:
- Impact factor: 0
- Authors: Alexandre K. W. Navarro; J. Frellsen; Richard E. Turner
- Corresponding authors: Alexandre K. W. Navarro; J. Frellsen; Richard E. Turner
Other publications by Richard Turner
Minority opinion: CT screening for lung cancer.
- DOI: 10.1097/01.rti.0000189989.65271.79
- Published: 2005
- Journal:
- Impact factor: 3.3
- Authors: C. Henschke; J. Austin; Nathaniel Berlin; T. Bauer; S. Giunta; Fred Gannis; M. Kalafer; S. Kopel; Albert Miller; H. Pass; H. Roberts; R. Shah; D. Shaham; Michael John Smith; S. Sone; Richard Turner; D. Yankelevitz; J. Zulueta
- Corresponding author: J. Zulueta
The importance of psychological flow in a creative, embodied and enactive psychological therapy approach (Arts for the Blues)
- DOI: 10.1080/17432979.2022.2130431
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Ailsa Parsons; Linda Dubrow‐Marshall; Richard Turner; S. Thurston; Jennifer S. Starkey; Joanna Omylinska‐Thurston; V. Karkou
- Corresponding author: V. Karkou
Comprehensive studies on building a scalable downstream process for mRNAs to enable mRNA therapeutics
- DOI: 10.1002/btpr.3301
- Published: 2022
- Journal:
- Impact factor: 2.9
- Authors: Tingting Cui; Kareem Fakhfakh; Hannah Turney; Gülin Güler; A. Tołoczko; Martyn Hulley; Richard Turner
- Corresponding author: Richard Turner
Extracting Lineage Information from Hand-Drawn Ancient Maps
- DOI: 10.1007/978-3-319-41501-7_30
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Ehab Essa; Xianghua Xie; Richard Turner; Matthew Stevens; D. Power
- Corresponding author: D. Power
The New Zealand Reanalysis (NZRA)
- DOI: 10.2307/27226715
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Amir Pirooz; S. Moore; T. Carey; Richard Turner; Chun
- Corresponding author: Chun
Other grants by Richard Turner
Machine Learning for Tomorrow: Efficient, Flexible, Robust and Automated
- Grant number: EP/T005637/1
- Fiscal year: 2020
- Funding amount: $123,700
- Project category: Research Grant
Nanoporous polymer particles and gels containing functionalized semi-rigid copolymer structures
- Grant number: 1609379
- Fiscal year: 2016
- Funding amount: $123,700
- Project category: Standard Grant
Machine Learning for Hearing Aids: Intelligent Processing and Fitting
- Grant number: EP/M026957/1
- Fiscal year: 2015
- Funding amount: $123,700
- Project category: Research Grant
Sterically Congested and Stiffened Alternating Copolymers: Synthesis, Solution and Solid-State Properties
- Grant number: 1206409
- Fiscal year: 2012
- Funding amount: $123,700
- Project category: Standard Grant
Probabilistic Auditory Scene Analysis
- Grant number: EP/G050821/1
- Fiscal year: 2010
- Funding amount: $123,700
- Project category: Fellowship
Precisely Functionalized Alternating Copolymers Based on Substituted Stilbene Monomers
- Grant number: 0905231
- Fiscal year: 2009
- Funding amount: $123,700
- Project category: Standard Grant
Improvement of Instruction in Marine Ecology
- Grant number: 7814013
- Fiscal year: 1978
- Funding amount: $123,700
- Project category: Standard Grant
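Similar NSFC (National Natural Science Foundation of China) grants
Research on identifying bolt-loosening damage in truss-type cableway towers in mountainous areas based on sound signals
- Grant number: 52268049
- Approval year: 2022
- Funding amount: 330,000 CNY
- Project category: Regional Science Fund Project
Quantitative study of the sound signals of hard-rock fracture and instability
- Grant number: 52169021
- Approval year: 2021
- Funding amount: 350,000 CNY
- Project category: Regional Science Fund Project
Neural mechanisms by which auditory attention enhances the temporal precision of the central representation of target sound signals
- Grant number:
- Approval year: 2020
- Funding amount: 580,000 CNY
- Project category: General Program
Research on indoor localization algorithms for non-smart devices based on unmodulated sound signals
- Grant number: 62002104
- Approval year: 2020
- Funding amount: 240,000 CNY
- Project category: Young Scientists Fund Project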
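The role of acoustic-chemical multimodal signals in the reproductive activity of the serrate-legged small treefrog (Kurixalus odontotarsus)
- Grant number:
- Approval year: 2020
- Funding amount: 240,000 CNY
- Project category: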
Similar overseas grants
Design and systematization of next-generation pin-spot audio system infrastructure
- Grant number: 23H03425
- Fiscal year: 2023
- Funding amount: $123,700
- Project category: Grant-in-Aid for Scientific Research (B)
I-Corps: Using a digital audio signal to measure heart rate with a smartphone
- Grant number: 2417580
- Fiscal year: 2023
- Funding amount: $123,700
- Project category: Standard Grant
I-Corps: Using a digital audio signal to measure heart rate with a smartphone
- Grant number: 2304562
- Fiscal year: 2023
- Funding amount: $123,700
- Project category: Standard Grant
Dynamic control of auxiliary losses in multi-task deep learning and its application to speech communication
- Grant number: 22K12105
- Fiscal year: 2022
- Funding amount: $123,700
- Project category: Grant-in-Aid for Scientific Research (C)
Data-driven sound field measurement for high-resolution spatial audio analysis and its applications
- Grant number: 22H03608
- Fiscal year: 2022
- Funding amount: $123,700
- Project category: Grant-in-Aid for Scientific Research (B)