CAREER: Exploring and Exploiting Data-Centric Modeling for Fairness in Machine Learning

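Basic Information

Award number:
2239257
Principal investigator:
Na Zou
Amount:
$547.7K
Institution:
Texas A&M Engineering Experiment Station
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2023
Funding country:
United States
Start and end dates:
2023-05-01 to 2028-04-30
Project status:
Ongoing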

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2239257&HistoricalAwards=false
Keywords:
CAREER Exploring Exploiting Data Centric

Project Abstract

This project will lead to advances in dealing with data challenges to facilitate fairness in machine learning, promote broad utilization of machine-learning algorithms in high-stakes applications, and ensure a fair and transparent decision-making process for future information systems. While machine-learning methods have achieved success in real-world applications, they often suffer from biases and show discrimination towards certain demographics, especially in high-stakes applications, which risks significant harm to both society and individuals. Existing work focuses on “model-centric” computational approaches that build models while overlooking the importance of data quality. To tackle the challenges raised by the lack of high-quality data and the lack of a comprehensive understanding of fairness in all its respects, this project will integrate model-centric with “data-centric” modeling, which systematically engineers the data needed for a fair decision-making process. The successful outcome of this multidisciplinary research will lead to effective and efficient algorithms that enhance the generalizability and trustworthiness of learned models, and improve the fairness of algorithms deployed in real-world systems in health informatics and disaster resilience. The education programs of this project will play an integral part in training the next generation of the U.S. workforce with critical Responsible Artificial Intelligence (RAI) technologies and attract and retain diverse members of the future workforce in STEM.

The research goal of this project is to develop a computational framework for tackling data challenges in fairness through data-centric fairness mitigation solutions that explore and exploit data and prior knowledge. Complementing existing studies focusing on model-centric or data-driven approaches, this project investigates a novel research direction that systematically explores a data-centric fairness mitigation framework. Specifically, the research objectives include: (1) to explore and extract data characteristics on instances, features and a representative subset of examples in terms of fairness, allowing that fairness definitions and metrics may vary across real-world applications; (2) to expand and refine prior knowledge to guide the discrimination-mitigation process via instance augmentation, feature set expansion, and measurement redefinition perspectives; (3) to leverage interpretable and interactive data and prior knowledge as a key element for further improving fairness modeling; and (4) to demonstrate effectiveness on real-world applications including healthcare informatics and disaster resilience. The educational objectives are: (1) to incorporate responsible artificial intelligence (RAI) into curriculum design by integrating research findings and case studies into current and new courses; (2) to enhance public interest in and awareness of RAI by organizing data challenges and broadcasting information on social media platforms; and (3) to attract and retain women and underrepresented minorities to ensure a diverse future STEM workforce.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Na Zou

PolyJet 3D Printing: Predicting Color by Multilayer Perceptron Neural Network

DOI:
10.1016/j.stlm.2022.100049
Publication date:
2022
Journal:
Annals of 3D Printed Medicine
Impact factor:
0
Authors:
Xingjian Wei;Na Zou;Li Zeng;Zhijian Pei
Corresponding author:
Zhijian Pei

A Data Adaptive Biological Sequence Representation for Supervised Learning

DOI:
10.1007/s41666-018-0038-5
Publication date:
2018
Journal:
Journal of Healthcare Informatics Research
Impact factor:
5.9
Authors:
Hande Cakin;Berk Gorgulu;M. Baydogan;Na Zou;Jing Li
Corresponding author:
Jing Li

Retiring $Δ$DP: New Distribution-Level Metrics for Demographic Parity

DOI:
Publication date:
2023
Journal:
arXiv.org
Impact factor:
0
Authors:
Xiaotian Han;Zhimeng Jiang;Hongye Jin;Zirui Liu;Na Zou;Qifan Wang;Xia Hu
Corresponding author:
Xia Hu

The $p$th moment stability of stochastic functional differential equations with mixed time-varying delay and its applications

DOI:
10.1016/j.jfranklin.2024.107425
Publication date:
2025-01-01
Journal:
Journal of the Franklin Institute
Impact factor:
Authors:
Na Zou;Hongfeng Guo;Chuan Zhang;Jianting Fu;Yingxin Guo
Corresponding author:
Yingxin Guo

Identification of the hybrids between Lilium brownii and L. davidii using fluorescence in situ hybridization (FISH)

DOI:
10.17660/actahortic.2019.1237.13
Publication date:
2019-04
Journal:
Acta Horticulturae
Impact factor:
0
Authors:
Like Wu;Wei Zheng;Kongzhong Xiao;Jie Zeng;Luomin Cui;Hui Li;Yanmei Liu;Na Zou;Junhuo Cai;Shujun Zhou
Corresponding author:
Shujun Zhou