Secure Outsourcing of Genotype Imputation for Privacy-aware Genomic Analysis (RO1HE21)
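Basic Information
- Grant number: 10587347
- Principal investigator:
- Amount: $583,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-05-17 to 2027-02-28
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract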
Project Summary/Abstract
Population-scale genome sequencing projects such as the 1000 Genomes Project, TOPMed, and the All of Us Program
will generate genotype data for millions of individuals. This number increases substantially if the recreational
use of genetic data from genealogy companies, such as 23andMe, is accounted for. Sharing and analyzing
this data create monumental challenges for the privacy of participants. Recently, hackers have begun targeting
genealogy databases, as in the hacking of GEDmatch in 2020. Due to the large scale and high dimensionality of
genomic data, analysis workflows require large computational resources. This incentivizes companies, hospitals,
and research labs to outsource the analysis and interpretation of genomic data to third parties, with the result that
the genomic data is stored on untrusted third-party servers.
In this proposal, we focus on the secure outsourcing of genotype imputation, which is a computationally intensive
and central task in large-scale genotype analysis. Genotype imputation is the prediction of missing or low-quality
variant genotypes using a small set of variant genotypes that are measured using, for example, genotyping
arrays, low-coverage sequencing, or targeted sequencing. It is a vital step for analyzing raw genomic data for quality control,
predicting missing genotypes, variant phasing, and fine mapping of associations to identify causal variants. When
combined with sparse arrays, imputation can greatly reduce the cost of population-scale and family-based
genotyping. For example, the All of Us Project will rely on a custom genotyping array, Infinium Global Diversity
Panel, to decrease the cost of genotyping millions of individuals. Imputation methods will be of vital importance
for this task. To perform these enormous tasks, the imputation methods require large computational resources
and are often outsourced to third-party “imputation servers”. These servers will soon process thousands, if not
millions, of genomes and store sensitive genomic data. Unfortunately, these services are strictly secure
neither against unauthorized hackers nor against curious users who have authorized access to the servers. There is
an urgent need for privacy-aware imputation methods that can be deployed even on untrusted third-party services
such as high-performance cloud platforms so that outsourcing can be safely performed at population scale.
Our proposed methods use state-of-the-art homomorphic encryption that provides perfect genomic data security
while in transit, at rest, and even while imputation is being performed. We design new and efficient
“encryption-amenable” methods and frameworks for protecting the study participants and their families, and for protecting
the population panels, i.e., underrepresented populations. Our benchmarks show that the secure methods achieve
high imputation accuracy even on commodity hardware, in time comparable to state-of-the-art non-secure
methods. The proposed methods can provide practical population-scale genomic privacy and security for imputation
and association studies.
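To make the idea of imputing on encrypted data concrete, the following is a minimal sketch and not the proposal's actual method. It assumes a toy Paillier cryptosystem (additively homomorphic only) and a hypothetical linear imputation model with integer, fixed-point weights; the abstract names neither the encryption scheme nor the model. The key size, the names `encrypt`, `decrypt`, `impute_encrypted`, and `SCALE`, and all weight values are illustrative assumptions. The client encrypts its measured tag genotypes (0, 1, or 2), the untrusted server combines the ciphertexts with its plaintext weights without ever decrypting, and only the client can read the imputed dosage.

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic). ASSUMPTION: the proposal
# does not specify a scheme; these primes are far too small for real security.
P, Q = 2**31 - 1, 2**61 - 1                        # two known Mersenne primes
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)                               # valid for generator g = n + 1

SCALE = 1000  # hypothetical fixed-point scale for the model weights


def encrypt(m: int) -> int:
    """Client side: encrypt an integer (negative values are encoded mod N)."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + (m % N) * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Client side: decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def impute_encrypted(enc_tags, weights, bias):
    """Server side: compute Enc(bias + sum(w_i * x_i)) without decrypting.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to an
    integer power multiplies its plaintext by that integer.
    """
    acc = encrypt(bias)
    for c, w in zip(enc_tags, weights):
        acc = acc * pow(c, w % N, N2) % N2
    return acc


# Hypothetical example: four tag genotypes and made-up scaled weights.
tags = [0, 1, 2, 1]
weights = [120, -45, 300, 80]   # real-valued weights multiplied by SCALE
bias = 50
enc_score = impute_encrypted([encrypt(g) for g in tags], weights, bias)
dosage = decrypt(enc_score) / SCALE
print(dosage)  # 0.685
```

The server in this sketch sees only ciphertexts, which is the property that lets the computation run on an untrusted host. A practical system would need a vetted library and standard key sizes, and may use a different scheme entirely.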
Project Outcomes
Journal papers (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

No data available
Data updated: 2024-06-01
Similar Overseas Grants
Demographic Patterns of Eugenic Sterilization in Five U.S. States: Mixed Methods Investigation of Reproductive Control of the 'Unfit'
- Grant number: 10640886
- Fiscal year: 2023
- Funding amount: $583,200
- Project category:
UEM Collaborative Research Ethics Education Program
- Grant number: 10755523
- Fiscal year: 2023
- Funding amount: $583,200
- Project category:
Assessing risk for firearm injury and attitudes about new gun violence prevention laws in Michigan to enhance policy implementation
- Grant number: 10811214
- Fiscal year: 2023
- Funding amount: $583,200
- Project category:
Flexible Control Authority With a Robotic Arm: Facilitating Seamless Transitions Between User and Robot Control in Multi-Action Manipulation Tasks.
- Grant number: 10637707
- Fiscal year: 2023
- Funding amount: $583,200
- Project category: