Enabling AI-based Mouse Genetic Discovery
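Basic information
- Grant number: 10724522
- Principal investigator:
- Amount: $779,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-08-01 to 2027-07-31
- Project status: Ongoing
- Source: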
- Keywords: Acceleration; Affect; Alleles; Amino Acid Sequence; Artificial Intelligence; Candidate Disease Gene; Clustered Regularly Interspaced Short Palindromic Repeats; Communities; DNA Sequence Alteration; Data; Data Analyses; Data Set; Data Sources; Databases; Development; Diabesity; Diabetes Mellitus; Disease; Disease model; Engineering; Evaluation; Exhibits; Exons; Foundations; Generations; Genes; Genetic; Genetic Enhancement; Genetic Models; Genetic Variation; Genome engineering; Genomics; Health; Healthcare; Hodgkin Disease; Homologous Gene; Human; Inbred Strain; Individual; Knock-in Mouse; Knock-out; Laboratory mice; Lymphoma; Machine Learning; Malignant Neoplasms; Maps; Measures; Methods; Mus; Network-based; Obesity; Paper; Pattern; Phenotype; Public Health; Publishing; Research; Tandem Repeat Sequences; Training; Validation; Variant; candidate identification; causal variant; computational pipelines; computerized tools; disease phenotype; genetic variant; genome wide association study; graph neural network; human disease; human model; improved; innovation; knockout gene; model organism; mouse genetics; mouse genome; mouse model; novel; protein protein interaction; public database; response; trait
Project abstract
No model organism has contributed more than the laboratory mouse to improving human health. Many genetic
factors and therapies for human diseases were initially discovered or characterized in mice, before they were
transitioned to human use.
Large-scale efforts are underway to integrate recent advances in artificial
intelligence (AI) into human healthcare, but very few AI advances have been used for analysis of the data
produced using the model organism that has formed the foundation for many healthcare innovations. We
recently developed an AI-based computational pipeline that could identify causative genetic factors for murine
genetic models of human biomedical traits and diseases. After assessing the strength of allelic associations
with the phenotypic response pattern exhibited by the inbred strains, this AI pipeline uses a machine-learning
trained method to analyze 29M published papers and assess candidate gene-phenotype relationships. Information
obtained from assessment of the protein-protein interaction network and protein sequence features of the
candidate genes is also incorporated into the graph neural network-based analysis.
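The allelic association step described above can be illustrated with a minimal sketch. The abstract does not say which statistic the pipeline uses, so the one-way ANOVA F statistic here is an assumption. The function names (`anova_f`, `rank_genes`), the placeholder genes (`GeneA`, `GeneB`), and all phenotype values are hypothetical and are not taken from the project.

```python
from collections import defaultdict
from statistics import mean


def anova_f(phenotype, allele):
    """One-way ANOVA F statistic for strain phenotypes grouped by allele.

    phenotype: dict mapping strain name -> measured response (made-up here)
    allele:    dict mapping strain name -> allele label at one candidate gene
    """
    groups = defaultdict(list)
    for strain, value in phenotype.items():
        if strain in allele:
            groups[allele[strain]].append(value)

    k = len(groups)                              # number of allele groups
    n = sum(len(g) for g in groups.values())     # number of strains used
    if k < 2 or n <= k:
        return 0.0                               # not enough data to compare

    grand = mean(v for g in groups.values() for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_between == 0:
        return 0.0
    if ss_within == 0:
        return float("inf")                      # perfect separation by allele
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_genes(phenotype, alleles):
    """Sort candidate genes by decreasing association strength."""
    scored = [(gene, anova_f(phenotype, a)) for gene, a in alleles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative values only: real strain names, invented measurements.
phenotype = {
    "C57BL/6J": 10.0, "DBA/2J": 11.0, "BALB/cJ": 10.5,
    "A/J": 20.0, "AKR/J": 21.0, "129S1/SvImJ": 20.5,
}
alleles = {
    "GeneA": {"C57BL/6J": "ref", "DBA/2J": "ref", "BALB/cJ": "ref",
              "A/J": "alt", "AKR/J": "alt", "129S1/SvImJ": "alt"},
    "GeneB": {"C57BL/6J": "ref", "DBA/2J": "alt", "BALB/cJ": "ref",
              "A/J": "alt", "AKR/J": "ref", "129S1/SvImJ": "alt"},
}

ranking = rank_genes(phenotype, alleles)
for gene, f_stat in ranking:
    print(gene, round(f_stat, 2))
```

With these invented values, the `GeneA` alleles split the strains cleanly into low and high responders, so it ranks ahead of `GeneB`.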
This project will produce a markedly enhanced AI pipeline (AIv2) that will greatly accelerate the pace of genetic
discovery using murine genetic models. First, long-read genomic sequencing (LRS) and computational tools
will be used to produce a more complete map of the pattern of genetic variation among the inbred strains. The map
will include alleles for two major types of genetic variation (structural variants and tandem repeats) that are
poorly characterized by conventional sequencing methods. Second, we will develop two additional
computational tools for the AI, which facilitate candidate gene prioritization through the evaluation of: (i) the
phenotypes exhibited by 8200 mouse lines with individual gene knockouts (KOs); and (ii) the results of 5700
human GWAS covering many biomedical phenotypes to determine if alleles within the human homologues of
candidate murine genes affect an analyzed trait. The ability of AIv2 to accelerate genetic discovery will be
demonstrated by using it to identify new genetic factors through analysis of a public database with >10,307
datasets, which measure biomedical or disease-related responses in panels of inbred strains. Since it is critical
to experimentally confirm some of the computational findings, genetic factors for two murine models of human
diseases that are major public health problems (cancer, diabetes/obesity), which were identified by the AI
pipeline, will be experimentally validated. CRISPR engineering will be used to revert the causative mutation(s) to
wild type on the genetic background of the strain exhibiting the disease phenotype, and the genome-engineered
mice will be analyzed to assess the contribution of the genetic factor to the disease phenotype.
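The candidate gene prioritization described in the abstract (knockout phenotypes plus human GWAS evidence) can be sketched as follows. The abstract states that the pipeline uses a graph neural network-based analysis; the weighted sum below is a deliberately simpler stand-in, not the project's method. The `Candidate` fields, the weights, and the gene names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    association: float          # strain-association strength scaled to 0..1 (assumed)
    ko_phenotype_match: bool    # a knockout line shows a related phenotype (assumed flag)
    human_gwas_hit: bool        # human homologue has a GWAS signal for the trait (assumed flag)


def priority(c, w_assoc=1.0, w_ko=0.5, w_gwas=0.5):
    """Hypothetical weighted evidence score; the weights are illustrative only."""
    return (w_assoc * c.association
            + w_ko * c.ko_phenotype_match
            + w_gwas * c.human_gwas_hit)


def prioritize(candidates):
    """Return candidates ordered from highest to lowest evidence score."""
    return sorted(candidates, key=priority, reverse=True)


# Invented example candidates.
candidates = [
    Candidate("GeneA", association=0.9, ko_phenotype_match=False, human_gwas_hit=False),
    Candidate("GeneB", association=0.6, ko_phenotype_match=True, human_gwas_hit=True),
    Candidate("GeneC", association=0.7, ko_phenotype_match=True, human_gwas_hit=False),
]

ranked = prioritize(candidates)
for c in ranked:
    print(c.gene, round(priority(c), 2))
```

In this toy example a gene with weaker strain association (`GeneB`) can still outrank a stronger one (`GeneA`) once supporting knockout and GWAS evidence is counted, which is the intuition behind adding those two evidence sources.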
Project outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Genetic Discovery Enabled by A Large Language Model.
- DOI:
- Published: 2023-11-12
- Journal:
- Impact factor: 0
- Authors: Tu, Tao; Fang, Zhouqing; Cheng, Zhuanfen; Spasic, Svetolik; Palepu, Anil; Stankovic, Konstantina M; Natarajan, Vivek; Peltz, Gary
- Corresponding author: Peltz, Gary
Other grants by GARY A PELTZ
Computational Methods for Identification of Genetic Factors Affecting the Response to Drug Abuse
- Grant number: 10198889
- Fiscal year: 2017
- Funding amount: $779,700
- Project category:
Computational Methods for Identification of Genetic Factors Affecting the Response to Drug Abuse
- Grant number: 9926473
- Fiscal year: 2017
- Funding amount: $779,700
- Project category:
Computational Methods for Identification of Genetic Factors Affecting the Response to Drug Abuse
- Grant number: 10515960
- Fiscal year: 2017
- Funding amount: $779,700
- Project category:
Computational Methods for Identification of Genetic Factors Affecting the Response to Drug Abuse
- Grant number: 10406825
- Fiscal year: 2017
- Funding amount: $779,700
- Project category:
Computational Methods for Identification of Genetic Factors Affecting the Response to Drug Abuse
- Grant number: 10075085
- Fiscal year: 2017
- Funding amount: $779,700
- Project category:
Stem Cell-Based In vivo Models of Human Genetic Liver Diseases
- Grant number: 8812710
- Fiscal year: 2015
- Funding amount: $779,700
- Project category:
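Similar NSFC (National Natural Science Foundation of China) grants
Molecular mechanisms by which promoter sequence variation in KIR3DL1 alleles affects their differential expression
- Grant number:
- Approval year: 2022
- Funding amount: 300,000 yuan
- Project category: Young Scientists Fund
Mechanism by which biallelic NUP205 mutations impair ciliogenesis and cause situs inversus with congenital heart disease
- Grant number:
- Approval year: 2021
- Funding amount: 540,000 yuan
- Project category: General Program
Genome-wide analysis of how allele-specific expression patterns in hybrid meat rabbits affect the genetic basis of heterosis
- Grant number: 32102530
- Approval year: 2021
- Funding amount: 300,000 yuan
- Project category: Young Scientists Fund
Effects of allelic expression imbalance on postharvest ripening and quality formation in banana fruit
- Grant number: 31972471
- Approval year: 2019
- Funding amount: 570,000 yuan
- Project category: General Program
Molecular mechanisms by which high temperature affects the expression of different Wx alleles and amylose content in rice
- Grant number: 31500972
- Approval year: 2015
- Funding amount: 200,000 yuan
- Project category: Young Scientists Fund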
Similar overseas grants
Integrative genomic and functional genomic studies to connect variant to function for CAD GWAS loci
- Grant number: 10639274
- Fiscal year: 2023
- Funding amount: $779,700
- Project category:
Project 2: Therapeutic Gene Editing for Friedreich's Ataxia
- Grant number: 10668768
- Fiscal year: 2023
- Funding amount: $779,700
- Project category:
The role of USP27X-Cyclin D1 axis in HER2 Therapy Resistant Breast Cancer
- Grant number: 10658373
- Fiscal year: 2023
- Funding amount: $779,700
- Project category:
Virus and olfactory system interactions accelerate Alzheimer's disease pathology
- Grant number: 10669880
- Fiscal year: 2023
- Funding amount: $779,700
- Project category:
A genomics-based strategy to precision phenotyping and drug repositioning in cardiometabolic diseases
- Grant number: 10564666
- Fiscal year: 2023
- Funding amount: $779,700
- Project category: