Developing unbiased AI/Deep learning pipelines to strengthen lung cancer health disparities research
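Basic information
- Grant number: 10841956
- Principal investigator:
- Amount: $305,500
- Host institution:
- Host institution country: United States
- Project type:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-04-10 to 2028-03-31
- Project status: Ongoing
- Source: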
- Keywords: Administrative Supplement; Age; Algorithms; Area; Baptist Church; Bioinformatics; Biomedical Engineering; Black Populations; Black race; Cancer Biology; Cancer Center; Cancer Patient; Cancer health equity; Catchment Area; Cell Communication; Cells; Clinical Trials; Collaborations; Combined Modality Therapy; Comprehensive Cancer Center; Computing Methodologies; Data; Data Analyses; Data Set; Deposition; Development; Disease; Disparity population; Ensure; Entropy; Equity; Ethnic Population; Event; Excision; Female; Funding; Future; Gender; Gene Expression; Genes; Genomics; Goals; Health Disparities Research; Health Promotion; Incidence; Inequality; Interdisciplinary Study; Investigation; Label; Malignant neoplasm of lung; Methodology; Mission; Modeling; Molecular; Non-Small-Cell Lung Carcinoma; Oxidation-Reduction; Paper; Parents; Pathway interactions; Patients; Performance; Phase; Process; Race; Readiness; Reduce health disparities; Research; Research Personnel; Resolution; Resource Development; Sampling; Sampling Biases; Sex Bias; Standardization; Testing; Therapeutic; Training; United States National Institutes of Health; Work; black lung; black patient; cancer health disparity; cancer type; data sharing; deep learning; deep learning model; epidemiologic data; ethnic bias; forest; gene regulatory network; health disparity; improved; innovation; insight; lipid metabolism; male; mortality; novel; novel therapeutics; parent grant; predictive modeling; racial bias; racial health disparity; racial population; response; single cell sequencing; single-cell RNA sequencing; skills; success; targeted treatment; therapeutically effective; tumor; tumor microenvironment
Project abstract
SUMMARY
Our funded R01, entitled “Overcoming racial health disparities in lung cancer through innovative mechanism-based
therapeutic strategies,” proposes to generate high-resolution spatial gene expression and single-cell
sequencing (scRNA-Seq) profiles of tumors in Black and White patients with NSCLC. In addition to the data
analysis proposed in the R01 to depict the differences between White and Black patients, these data provide a
comprehensive resource for the development of AI-ML models that can help us to gain further insight into the
cellular landscapes of lung cancer and its tumor microenvironment, thus informing us on novel combination
therapies that would overcome health disparities and achieve equity among different racial/ethnic groups. In
response to NOT-OD-23-082, entitled “Administrative Supplements to Support Collaborations to Improve the
AI/ML-Readiness of NIH-Supported Data,” we seek to develop pipelines that transform the data into an AI/DL-ready
format, enabling AI/deep learning research on high-resolution single-cell sequencing data from this
R01 and other studies. Moreover, we will explore computational methodologies to overcome a common
issue, sampling bias (such as unbalanced representation of races or genders), and ensure that the data are ready for
fair AI/DL model training and prediction. We will achieve these goals through two Specific Aims: 1) to
develop a “Fairness” pipeline to mitigate the effects of sampling bias due to population inequalities in Black and
White, Male and Female patients and to transform scRNA-Seq data into AI/deep learning model-ready format,
and 2) to validate the preprocessed data from Specific Aim 1 on a scDL pipeline with additional scRNA-Seq data
collected from the parent R01 and publicly available unbalanced scRNA-Seq datasets. The Fairness pipeline
will take advantage of the Gerchberg-Saxton (GS) algorithm, which can transform raw data into an AI/DL-ready
format that is more suitable for AI investigations by correcting bias caused by sampling inequalities in Black and
White, Male and Female patients. We hypothesize that once the GS transformation is completed, the remaining
features of the scRNA-Seq dataset will contribute more uniformly in the DL model training phase, easing the
understanding and investigation of gene regulatory networks, cell-to-cell interactions, and therapeutically relevant
pathways, and the identification of potential targets. Applying the proposed pipeline will overcome the bias issue
in publicly available datasets and the data generated in the future. All data will be well documented in a CSV
format with unique column labels including cell labels and patient ID. Other information such as sample ID, race,
gender, cancer type/subtype, age, etc., will also be documented and available for public use through data sharing.
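
As a rough illustration of the CSV layout described above, the following sketch writes per-cell records with metadata columns first and checks that column labels are unique. The column names (`cell_label`, `patient_id`, `sample_id`, and so on), the gene columns `GENE_A` and `GENE_B`, and all values are hypothetical; the abstract names the fields but does not define a schema.

```python
import csv
import io

# Hypothetical column labels; the abstract lists the fields but not their names.
METADATA_COLUMNS = [
    "cell_label", "patient_id", "sample_id", "race", "gender", "cancer_type", "age",
]


def write_cells(rows, gene_columns):
    """Serialize per-cell records to CSV text, metadata columns first."""
    fieldnames = METADATA_COLUMNS + list(gene_columns)
    if len(set(fieldnames)) != len(fieldnames):
        raise ValueError("column labels must be unique")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_cells(text):
    """Parse CSV text back into a list of dicts (all values come back as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up example records, for illustration only.
example_rows = [
    {"cell_label": "T cell", "patient_id": "P01", "sample_id": "S01", "race": "Black",
     "gender": "Female", "cancer_type": "NSCLC", "age": 63, "GENE_A": 0.0, "GENE_B": 2.5},
    {"cell_label": "B cell", "patient_id": "P02", "sample_id": "S02", "race": "White",
     "gender": "Male", "cancer_type": "NSCLC", "age": 58, "GENE_A": 1.0, "GENE_B": 0.0},
]

csv_text = write_cells(example_rows, ["GENE_A", "GENE_B"])
parsed = read_cells(csv_text)
```

Keeping patient-level metadata in the same table as the expression values is one possible design; it makes group counts (for example by race and gender) easy to compute before any model training.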
We believe that this effort can enable biologically meaningful discoveries regarding cancer disparities without
the impact from data bias, which aligns well with the NIH (National Institutes of Health) mission to promote health
and reduce health disparities.
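
The abstract does not describe how the GS transformation is applied to scRNA-Seq data, so the sketch below does not attempt it. Instead it illustrates the sampling-bias problem with a different and much simpler technique, inverse-frequency sample weighting, in which each (race, gender) group receives the same total weight regardless of how many cells it contributes. The function name `group_balance_weights` and the example cohort are assumptions made for illustration.

```python
from collections import Counter


def group_balance_weights(groups):
    """Return one weight per sample so that every group has equal total weight.

    weight_i = n_samples / (n_groups * count(group_i)); the weights sum to n_samples.
    """
    counts = Counter(groups)
    n_samples = len(groups)
    n_groups = len(counts)
    return [n_samples / (n_groups * counts[g]) for g in groups]


# Made-up, unbalanced cohort: three cells from one group, one from each of the others.
cells = (
    [("White", "Male")] * 3
    + [("White", "Female")]
    + [("Black", "Male")]
    + [("Black", "Female")]
)
weights = group_balance_weights(cells)

# Total weight per group after reweighting.
totals = Counter()
for group, weight in zip(cells, weights):
    totals[group] += weight
```

In this example the over-represented group gets a per-cell weight of 0.5 and every other group gets 1.5, so each group's total weight is 1.5. Such weights could be passed to a loss function during training; whether that is adequate for single-cell data is an open question that the proposed pipeline is meant to address.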
Project outcomes
Journal articles: 0
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
Other grants by Liang Liu
A novel role of mutant p53 in intronic polyadenylation and impairment of DNA repair
- Grant number: 10535488
- Fiscal year: 2021
- Amount: $305,500
- Project type:
A novel role of mutant p53 in intronic polyadenylation and impairment of DNA repair
- Grant number: 10358369
- Fiscal year: 2021
- Amount: $305,500
- Project type:
Tumor microenvironment at single cell level in black and white NSCLC patients
- Grant number: 10057810
- Fiscal year: 2020
- Amount: $305,500
- Project type:
Epigenetic control of target gene activity by hairless in skin homeostasis
- Grant number: 8827678
- Fiscal year: 2014
- Amount: $305,500
- Project type:
Epigenetic control of target gene activity by hairless in skin homeostasis
- Grant number: 8700942
- Fiscal year: 2014
- Amount: $305,500
- Project type:
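Similar grants from the National Natural Science Foundation of China
Research on age-of-information-based joint energy and data scheduling algorithms in wireless-powered edge networks
- Grant number: 62372118
- Approval year: 2023
- Amount: 500,000 CNY
- Project type: General Program
Research on transfer learning algorithms for the diagnosis of age-related macular degeneration
- Grant number: 62371328
- Approval year: 2023
- Amount: 530,000 CNY
- Project type: General Program
Research on age-of-information-based distributed timely information scheduling algorithms for ad hoc networks
- Grant number: 62102232
- Approval year: 2021
- Amount: 300,000 CNY
- Project type: Young Scientists Fund
Research on age-structured epidemic models and algorithms on heterogeneous dynamic networks
- Grant number: 11701348
- Approval year: 2017
- Amount: 250,000 CNY
- Project type: Young Scientists Fund
Research on 3D segmentation algorithms for OCT images of retinal age-related macular degeneration
- Grant number: 61401294
- Approval year: 2014
- Amount: 240,000 CNY
- Project type: Young Scientists Fund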
Similar overseas grants
Secondary Analysis and Integration of Existing Data Related to Chronic Orofacial Pain and Placebo Effects - Administrative Supplement
- Grant number: 10741330
- Fiscal year: 2023
- Amount: $305,500
- Project type:
Implementing best practices in software design for Network Level Analysis
- Grant number: 10839638
- Fiscal year: 2022
- Amount: $305,500
- Project type:
Machine learning approaches for improving EEG data utility in SUDEP research
- Grant number: 10593406
- Fiscal year: 2021
- Amount: $305,500
- Project type:
Harnessing multimodal data to enhance machine learning of children’s vocalizations
- Grant number: 10411575
- Fiscal year: 2021
- Amount: $305,500
- Project type: