Genomics, EHRs, GPUs, and Next Generation Computational Statistics
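Basic information
- Grant number: 10264804
- Principal investigator:
- Amount: $644,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2011
- Funding country: United States
- Period: 2011-08-26 to 2024-06-30
- Project status: Completed
- Source: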
- Keywords: Admixture; Algorithmic Analysis; Algorithmic Software; Algorithms; Area; Attention; Big Data Methods; Cloud Computing; Code; Communication; Computer software; Computers; Computing Methodologies; Data; Data Set; Disease; Documentation; Dose; Educational workshop; Electronic Health Record; Environment; Evolution; Fostering; Future; Genes; Genetic; Genetic Programming; Genetic study; Genomics; Genotype; Goals; Haplotypes; Human Genome Project; Language; Lead; Maps; Medicine; Methods; Minor; Mission; Modeling; Modernization; Nature; Precision Health; Reproducibility; Research Personnel; Scientist; Software Validation; Statistical Algorithm; Statistical Data Interpretation; Systems Analysis; Techniques; Testing; Training; United States Department of Veterans Affairs; Update; Variant; Veterans; algorithm development; biobank; cluster computing; community building; design; design and construction; electronic data; flexibility; genetic analysis; genetic information; genetic pedigree; genome wide association study; handheld mobile device; health data; high dimensionality; human disease; improved; innovation; insurance claims; mathematical model; next generation; open source; parallel computer; programs; sex; social; software development; statistics; theories; tool; trait; wearable device; webinar
Project abstract
The future challenges of statistical genetics are enormous. Data sets continue to grow; studies with 10^6 cases
and 10^7 markers have become feasible, but current algorithms and software do not scale to this size. We need to
rethink and rebuild many of our statistical analysis techniques and tools to scale effectively. In addition, health data
will soon be commonly collected from mobile and wearable devices, dramatically increasing its volume and utility.
Precision health and predictive medicine raise the stakes even further. Concurrently, the nature of computing
is rapidly changing. To take advantage of hardware advances, particularly ubiquitous parallel computing, new
statistical approaches and algorithms and new programming paradigms must be brought online.
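
As a rough sense of scale for the study sizes mentioned above, the following minimal Python sketch estimates the raw storage of a genotype matrix with 10^6 samples and 10^7 markers. The two encodings compared (2 bits per genotype, as in packed binary genotype formats, and 64 bits per genotype, as in a dense double-precision matrix) and the helper name `genotype_matrix_bytes` are illustrative assumptions, not details taken from the proposal.

```python
def genotype_matrix_bytes(n_samples: int, n_markers: int, bits_per_genotype: int) -> int:
    """Bytes needed to store one genotype per sample per marker (hypothetical helper)."""
    total_bits = n_samples * n_markers * bits_per_genotype
    return (total_bits + 7) // 8  # round up to whole bytes


N_SAMPLES = 10**6  # cases, as stated in the abstract
N_MARKERS = 10**7  # markers, as stated in the abstract

# Assumption: 2 bits per genotype (packed encoding).
packed = genotype_matrix_bytes(N_SAMPLES, N_MARKERS, 2)
# Assumption: 64 bits per genotype (dense float64 matrix).
dense = genotype_matrix_bytes(N_SAMPLES, N_MARKERS, 64)

print(f"2-bit packed:  {packed / 1e12:.1f} TB")   # 2.5 TB
print(f"float64 dense: {dense / 1e12:.1f} TB")    # 80.0 TB
```

Under these assumptions even the packed form is in the terabyte range, which is one way to see why methods that expect the full data set to fit in the memory of a single machine struggle at this size.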
This renewal proposal targets the application of state-of-the-art statistical techniques and tools to develop
genetic analysis algorithms that can scale to studies with millions of subjects, such as the US Department of
Veterans Affairs' Million Veteran Program (MVP) and the UK Biobank. Biobank-scale data sets have many benefits,
particularly the potential power to detect the subtle effects of each of the many genes involved in common
diseases. Another benefit is that these data sets can be more representative of the populace by including large
numbers of people from multiple ancestries, different social strata, and all sexes. Analyzing these massive data sets
effectively and efficiently requires advances in the current statistical genetics tools. Effective statistical
analysis takes many forms: algorithms that converge in fewer iterations, powerful statistics that accommodate all
available data, and computational methods that take advantage of massively parallel computing hardware such
as graphics processing units (GPUs) and other coprocessors. We will deliver algorithms that can directly handle
biobank-scale data sets for many computationally challenging statistical genetics tasks, including genome-wide
association studies (GWAS) with trait data from electronic health records (EHRs). More generally, our algorithm
focus will benefit all scientific fields driven by computational statistics and high-dimensional optimization.
Of course, for statistical algorithm development to be immediately useful it must be accompanied by fast,
easy-to-use software. We will promptly deliver open-source software that (1) enables interactive and reproducible
analyses with informative intermediate results, (2) provides quality graphics, (3) scales to big data analytics, (4)
embraces parallel and distributed computing, (5) adapts to rapid hardware evolution, (6) allows cloud computing,
and (7) fosters easy communication between clinicians, geneticists, statisticians, and computer scientists. Recent
breakthroughs in computer languages bring all these goals within reach.
Our overall objective is the design and construction of state-of-the-art statistical genetics algorithms and
software for modern, massive genetic and EHR data. Numerical accuracy, computational efficiency, and software
sustainability are our priorities. We will deliver a unified, cross-platform, high-level, reproducible, interactive
analysis environment that is fast and efficient even for biobank-scale data sets.
Project outcomes
Journal articles: 0
Monographs: 0
Research awards: 0
Conference papers: 0
Patents: 0
Other grants by Eric Sobel
Genomics GPUs and next generation computational statistics
- Grant number: 8539067
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Genomics, EHRs, GPUs, and Next Generation Computational Statistics
- Grant number: 10450816
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Genomics GPUs and next generation computational statistics
- Grant number: 8324508
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Genomics, EHRs, GPUs, and Next Generation Computational Statistics
- Grant number: 10672959
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Genomics GPUs and next generation computational statistics
- Grant number: 8085977
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Genomics, GPUs, and Next Generation Computational Statistics
- Grant number: 9100873
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Genomics, GPUs, and Next Generation Computational Statistics
- Grant number: 8888381
- Fiscal year: 2011
- Funding amount: $644,300
- Project category:
Computer Cluster and Storage to Support Whole Genome Sequencing and Analysis
- Grant number: 7595696
- Fiscal year: 2009
- Funding amount: $644,300
- Project category:
COMPILING AND TESTING STATISTICAL GENETICS APPLICATIONS
- Grant number: 7627683
- Fiscal year: 2007
- Funding amount: $644,300
- Project category:
COMPILING AND TESTING STATISTICAL GENETICS APPLICATIONS
- Grant number: 7369416
- Fiscal year: 2006
- Funding amount: $644,300
- Project category:
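Similar grants from the National Natural Science Foundation of China (NSFC)
Locating and analyzing proprietary cryptosystems in executable programs
- Grant number: 61872237
- Approval year: 2018
- Funding amount: CNY 160,000
- Project category: General Program
Fault localization in large-scale software evolution based on evolutionary optimization
- Grant number: 61673384
- Approval year: 2016
- Funding amount: CNY 600,000
- Project category: General Program
Study of white matter network development in the rat brain and establishment of DTI data analysis algorithms and a software platform
- Grant number: 81671770
- Approval year: 2016
- Funding amount: CNY 560,000
- Project category: General Program
Research on data-driven software process mining
- Grant number: 61662085
- Approval year: 2016
- Funding amount: CNY 400,000
- Project category: Regional Science Fund Project
Research on metric effectiveness and modeling algorithms for software defect prediction
- Grant number: 61602534
- Approval year: 2016
- Funding amount: CNY 200,000
- Project category: Young Scientists Fund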
Similar overseas grants
A Mobile Health Application to Detect Absence Seizures using Hyperventilation and Eye-Movement Recordings
- Grant number: 10696649
- Fiscal year: 2023
- Funding amount: $644,300
- Project category:
A software tool to facilitate variable-level equivalency and harmonization in research data: Leveraging the NIH Common Data Elements Repository to link concepts and measures in an open format
- Grant number: 10821517
- Fiscal year: 2023
- Funding amount: $644,300
- Project category:
Brain Digital Slide Archive: An Open Source Platform for data sharing and analysis of digital neuropathology
- Grant number: 10735564
- Fiscal year: 2023
- Funding amount: $644,300
- Project category:
Multi-modal Tracking of In Vivo Skeletal Structures and Implants
- Grant number: 10839518
- Fiscal year: 2023
- Funding amount: $644,300
- Project category:
Development and evaluation of a combined X-ray transmission and diffraction imaging system for pathology
- Grant number: 10699271
- Fiscal year: 2023
- Funding amount: $644,300
- Project category: