CAREER: Information-Theoretic Foundations of Fairness in Machine Learning
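Basic information
- Award number: 1845852
- Principal investigator:
- Amount: $547,900
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-02-01 to 2024-07-31
- Project status: Completed
- Source:
- Keywords: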
Project abstract
Machine learning algorithms can identify complex patterns in very large datasets. These algorithms are increasingly used in applications of significant social consequence, such as loan approval, hiring, and bail and sentencing decisions. However, real-world data may reflect discrimination patterns that exist in society at large. Consequently, decisions based on algorithms that learn from data are at risk of inheriting and, ultimately, reinforcing discriminatory and unfair social biases. This project aims to precisely characterize the operational limits of discrimination discovery and control in machine learning by combining legal and social science definitions of fairness with powerful mathematical tools from information theory, statistics, and optimization. This cross-disciplinary effort aims to provide fundamental theory and design guidelines for data scientists and engineers who will create the next generation of fair data-driven algorithms and applications. The technical results of this project will also inform the debate surrounding the social impact of machine learning. Moreover, this research will be used as a vessel for engaging students and researchers from diverse backgrounds in the applicability of information theory, machine learning, optimization, and, more broadly, math and engineering to social challenges.
Automated methods for discovering and controlling discrimination in machine learning inherently face a trade-off between fairness and accuracy, and are limited by the dimensionality of the underlying data. This project creates a comprehensive information-theoretic framework that captures the limits of discrimination control by determining (i) how to systematically identify data features that may lead to discrimination; (ii) how to ensure fairness by producing new, information-theoretically grounded data representations; (iii) the fundamental information-theoretic trade-offs between fairness, distortion, and accuracy; and (iv) the impact of finite samples in discrimination detection and mitigation. The key advantage of the information-theoretic methodology adopted in this project is that it captures fundamental, algorithm-independent properties of discrimination, while being fertile ground for the development of novel mathematical tools and models relevant to both data scientists and information theorists. The theoretical component of this research weaves new connections between information theory and robust statistics by analyzing the impact of local perturbations of probability distributions on discrimination metrics, and creates new information-theoretic models useful in discrimination control, privacy, and representation learning. The applied component of this research develops robust, data-driven methods for measuring and mitigating discrimination that are immediately relevant for fair algorithmic decision-making in applications of consequence.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (33)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Bottleneck Problems: An Information and Estimation-Theoretic View
- DOI: 10.3390/e22111325
- Publication date: 2020-11
- Journal:
- Impact factor: 2.7
- Authors: Asoodeh, Shahab; Calmon, Flavio P.
- Corresponding author: Calmon, Flavio P.
Local Differential Privacy Is Equivalent to Contraction of an $f$-Divergence
- DOI: 10.1109/isit45174.2021.9517999
- Publication date: 2021-07
- Journal:
- Impact factor: 0
- Authors: Asoodeh, Shahab; Aliakbarpour, Maryam; Calmon, Flavio P.
- Corresponding author: Calmon, Flavio P.
Privacy Amplification of Iterative Algorithms via Contraction Coefficients
- DOI: 10.1109/isit44484.2020.9174133
- Publication date: 2020-06
- Journal:
- Impact factor: 0
- Authors: Asoodeh, Shahab; Diaz, Mario; Calmon, Flavio P.
- Corresponding author: Calmon, Flavio P.
A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via f-Divergences
- DOI: 10.1109/isit44484.2020.9174015
- Publication date: 2020-06
- Journal:
- Impact factor: 0
- Authors: Asoodeh, Shahab; Liao, Jiachun; Calmon, Flavio P.; Kosut, Oliver; Sankar, Lalitha
- Corresponding author: Sankar, Lalitha
Arbitrary Decisions are a Hidden Cost of Differentially Private Training
- DOI: 10.1145/3593013.3594103
- Publication date: 2023-02-28
- Journal:
- Impact factor: 0
- Authors: B. Kulynych; Hsiang Hsu; C. Troncoso; F. Calmon
- Corresponding author: F. Calmon
Other grants by Flavio Calmon
Collaborative Research: CIF: Medium: Fundamental Limits of Privacy-Enhancing Technologies
- Award number: 2312667
- Fiscal year: 2023
- Funding amount: $547,900
- Project type: Continuing Grant
Collaborative Research: CIF: Small: Approximate Coded Computing - Fundamental Limits of Precision, Fault-tolerance and Privacy
- Award number: 2231707
- Fiscal year: 2023
- Funding amount: $547,900
- Project type: Standard Grant
FAI: Foundations of Fair AI in Medicine: Ensuring the Fair Use of Patient Attributes
- Award number: 2040880
- Fiscal year: 2021
- Funding amount: $547,900
- Project type: Standard Grant
EAGER: AI-DCL: Collaborative Research: Understanding and Overcoming Biases in STEM Education using Machine Learning
- Award number: 1926925
- Fiscal year: 2019
- Funding amount: $547,900
- Project type: Standard Grant
CIF: Medium: Collaborative Research: Information-theoretic Guarantees on Privacy in the Age of Learning
- Award number: 1900750
- Fiscal year: 2019
- Funding amount: $547,900
- Project type: Continuing Grant
Similar grants from the National Natural Science Foundation of China (NSFC)
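Research on channel state information acquisition and wireless transmission theory for extremely large-scale MIMO systems
- Award number: 62371180
- Approval year: 2023
- Funding amount: 490,000 CNY
- Project type: General Program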
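Research on multi-source information fusion based on evidence theory and quantum decision-making
- Award number: 62303382
- Approval year: 2023
- Funding amount: 300,000 CNY
- Project type: Young Scientists Fund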
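Research on theory-guided self-supervised learning that incorporates auxiliary information
- Award number: 62376010
- Approval year: 2023
- Funding amount: 510,000 CNY
- Project type: General Program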
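Theory of holographic placement of wayfinding information in integrated passenger transport hubs based on network coordination
- Award number: 52372295
- Approval year: 2023
- Funding amount: 490,000 CNY
- Project type: General Program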
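Theory and methods for resilient adaptation of mobile information networks for emergency communications
- Award number: 62341103
- Approval year: 2023
- Funding amount: 1,500,000 CNY
- Project type: Special Fund Project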
Similar overseas grants
CAREER: Towards Trustworthy Machine Learning via Learning Trustworthy Representations: An Information-Theoretic Framework
- Award number: 2339686
- Fiscal year: 2024
- Funding amount: $547,900
- Project type: Continuing Grant
CAREER: Information-Theoretic Measures for Fairness and Explainability in High-Stakes Applications
- Award number: 2340006
- Fiscal year: 2024
- Funding amount: $547,900
- Project type: Continuing Grant
CAREER: Optimism in Causal Reasoning via Information-theoretic Methods
- Award number: 2239375
- Fiscal year: 2023
- Funding amount: $547,900
- Project type: Continuing Grant
CAREER: Information-Theoretic Approach to Turbulence: Causality, Modeling & Control
- Award number: 2140775
- Fiscal year: 2021
- Funding amount: $547,900
- Project type: Continuing Grant
CAREER: Information-Theoretic and Statistical Foundations of Generative Models
- Award number: 1942230
- Fiscal year: 2020
- Funding amount: $547,900
- Project type: Continuing Grant