Approximations of computationally intensive statistical learning algorithms: theory and methods
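Basic information
- Grant number: RGPIN-2019-06487
- Principal investigator:
- Amount: $13,600
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Project period: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract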
Practitioners of quantitative sciences (statisticians, engineers, physicists, etc.) often face intractable problems, such as computing an integral that has no analytical solution or optimizing a function that is not known explicitly. In both cases, the quantity of interest is called intractable because its exact (mathematical) value is out of reach. Such problems are usually addressed with stochastic numerical algorithms: iterative methods, implemented on a computer, that use sequences of random numbers and return a numerical value approximating the exact solution of the intractable problem. Based on this numerical output, the practitioner can classify data, assess a model, interpret an experiment, etc. The convergence of the algorithm must therefore be well understood, and the value it returns should be accurate in a probabilistic sense.

Since the 1950s, a lot of research in statistics and machine learning has been devoted to designing algorithms with a solid theoretical foundation. Such algorithms include Markov chain Monte Carlo methods, the Expectation-Maximization algorithm, the gradient algorithm, etc. They are referred to as standard algorithms because they are known and used by most applied scientists: in regular situations, the algorithm, seen as a black box, converges and returns a trustworthy solution to the problem as long as it runs for sufficiently many iterations.

Paradoxically, the increasing computational capacity of today's computers challenges the efficiency of standard algorithms. We outline two situations where they scale poorly with the dimension of the problem.
- Big data: improved storage capacity and better data-acquisition devices mean that algorithms can be supplied with more (and more accurate) data. Standard algorithms become computationally slow and in fact unusable in practice.
- High dimensionality: novel computer architectures (parallel computing, GPU, etc.) allow scientists to attempt more complex problems, such as integrating functions of several hundred variables. Standard algorithms become statistically slow, i.e. they need many more iterations to achieve a given accuracy than for lower-dimensional problems.

The main research line of this proposal deals with the approximation of some standard algorithms. Expected outputs include statistical methods that are computationally and statistically more efficient than standard algorithms while still retaining, in some capacity, their black-box aspect. It is essential to design an approximation framework which guarantees that most theoretical properties of the standard algorithm are preserved in its noisy, approximate version. Promising results have already been obtained for some algorithms and have been successfully applied to social network analysis and computer vision. Current research aims at making those approximate methods more generic and at unifying the theoretical frameworks for analyzing and designing new approximate algorithms.
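As a rough illustration of the kind of stochastic numerical algorithm the abstract describes, the sketch below estimates an integral with no elementary closed form by plain Monte Carlo averaging and reports a standard error as a probabilistic accuracy measure. It is not taken from the project itself: the integrand exp(-x^2) on [0, 1], the function names `mc_integrate` and `integrand`, the sample size, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def mc_integrate(f, n_samples, seed=0):
    """Estimate the integral of f over [0, 1] by averaging f at uniform random points.

    Returns the estimate and its standard error (hypothetical helper, for illustration).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = [f(rng.random()) for _ in range(n_samples)]
    mean = sum(values) / n_samples
    # Unbiased sample variance of the integrand values.
    var = sum((v - mean) ** 2 for v in values) / (n_samples - 1)
    std_err = math.sqrt(var / n_samples)
    return mean, std_err


def integrand(x):
    # exp(-x^2) has no elementary antiderivative (assumed example integrand).
    return math.exp(-x * x)


estimate, std_err = mc_integrate(integrand, n_samples=100_000)

# Reference value via the error function, used only to check the estimate.
exact = math.sqrt(math.pi) / 2 * math.erf(1.0)

print(f"estimate = {estimate:.4f} +/- {1.96 * std_err:.4f} (approx. 95% interval)")
print(f"reference = {exact:.4f}")
```

The standard error shrinks like one over the square root of the number of samples, so each additional digit of accuracy costs roughly a hundred times more iterations. This is one simple way to see why the cost of standard algorithms becomes a concern in the big-data and high-dimensional settings the abstract mentions.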
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Maire, Florian
Identification of Ion Series Using Ion Mobility Mass Spectrometry: The Example of Alkyl-Benzothiophene and Alkyl-Dibenzothiophene Ions in Diesel Fuels
- DOI: 10.1021/ac400731d
- Published: 2013-06-04
- Journal:
- Impact factor: 7.4
- Authors: Maire, Florian; Neeson, Kieran; Giusti, Pierre
- Corresponding author: Giusti, Pierre
Efficient Bayesian inference for exponential random graph models by correcting the pseudo-posterior distribution
- DOI: 10.1016/j.socnet.2017.03.013
- Published: 2017-07-01
- Journal:
- Impact factor: 3.1
- Authors: Bouranis, Lampros; Friel, Nial; Maire, Florian
- Corresponding author: Maire, Florian
Traveling Wave Ion Mobility Mass Spectrometry Study of Low Generation Polyamidoamine Dendrimers
- DOI: 10.1007/s13361-012-0527-3
- Published: 2013-02-01
- Journal:
- Impact factor: 3.2
- Authors: Maire, Florian; Coadou, Gael; Lange, Catherine M.
- Corresponding author: Lange, Catherine M.
A Mutasynthesis Approach with a Penicillium chrysogenum ΔroqA Strain Yields New Roquefortine D Analogues
- DOI: 10.1002/cbic.201402686
- Published: 2015-04-13
- Journal:
- Impact factor: 3.2
- Authors: Ouchaou, Kahina; Maire, Florian; Overkleeft, Herman S.
- Corresponding author: Overkleeft, Herman S.
Atmospheric Solid Analysis Probe-Ion Mobility Mass Spectrometry of Polypropylene
- DOI: 10.1021/ac302109q
- Published: 2012-11-06
- Journal:
- Impact factor: 7.4
- Authors: Barrere, Caroline; Maire, Florian; Giusti, Pierre
- Corresponding author: Giusti, Pierre
Other grants held by Maire, Florian
Approximations of computationally intensive statistical learning algorithms: theory and methods
- Grant number: RGPIN-2019-06487
- Fiscal year: 2021
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
- Grant number: RGPIN-2019-06487
- Fiscal year: 2020
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
- Grant number: RGPIN-2019-06487
- Fiscal year: 2019
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
- Grant number: DGECR-2019-00269
- Fiscal year: 2019
- Funding amount: $13,600
- Program: Discovery Launch Supplement
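Similar National Natural Science Foundation of China (NSFC) grants
New computing paradigms for computation-intensive applications and key technologies for their accelerators
- Grant number: 62374108
- Approval year: 2023
- Funding amount: CNY 480,000
- Program: General Program
Research on key technologies for resource allocation and scheduling in SDN-based ultra-dense edge computing networks
- Grant number: 62202237
- Approval year: 2022
- Funding amount: CNY 300,000
- Program: Young Scientists Fund
Research on secure and efficient cooperative multi-task offloading schemes and resource optimization in ultra-dense mobile edge computing networks
- Grant number: 62261020
- Approval year: 2022
- Funding amount: CNY 350,000
- Program: Regional Science Fund
Research on key technologies for resource allocation and scheduling in SDN-based ultra-dense edge computing networks
- Grant number:
- Approval year: 2022
- Funding amount: CNY 300,000
- Program: Young Scientists Fund
Research on intelligent computation offloading mechanisms in 5G ultra-dense edge computing
- Grant number: 62002397
- Approval year: 2020
- Funding amount: CNY 240,000
- Program: Young Scientists Fund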
Similar overseas grants
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
- Grant number: RGPIN-2018-04604
- Fiscal year: 2022
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
- Grant number: RGPIN-2018-04604
- Fiscal year: 2021
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
- Grant number: RGPIN-2019-06487
- Fiscal year: 2021
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Computationally Intensive Methods for Large Spatio-Temporal Data Sets
- Grant number: RGPIN-2018-04604
- Fiscal year: 2020
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual
Approximations of computationally intensive statistical learning algorithms: theory and methods
- Grant number: RGPIN-2019-06487
- Fiscal year: 2020
- Funding amount: $13,600
- Program: Discovery Grants Program - Individual