Analysis Tools and Software for Second Generation Sequencing Data
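Basic Information
- Grant number: 8280415
- Principal investigator:
- Amount: $322,100
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2010
- Funding country: United States
- Project period: 2010-08-11 to 2013-05-17
- Project status: Completed
- Source: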
- Keywords: Accounting; Algorithms; Attention; Biological; Complex; Computer software; Computing Methodologies; DNA; Data; Data Analyses; Data Set; Detection; Fluorescence; Generations; Genetic Transcription; Genome; Genomics; Genotype; Goals; International; Maps; Measurement; Measures; Methodology; Methods; Metric; MicroRNAs; Modeling; Nucleotides; Play; Population; Positioning Attribute; Price; Probability; Procedures; Process; RNA; Reading; Reporting; Resolution; Role; Sampling; Shapes; Software Tools; Source; Statistical Methods; Statistical Models; Stream; Systematic Bias; Technology; Time; Uncertainty; Variant; Weight; base; chromatin immunoprecipitation; comparative; data management; genome sequencing; improved; meetings; new technology; prototype; public health relevance; sound; tool; user-friendly
Project Abstract
DESCRIPTION (provided by applicant): Second-generation sequencing (sec-gen) technology is poised to radically change how genomic data is obtained and used. Capable of sequencing millions of short strands of DNA in parallel, this technology can be used to assemble complex genomes for a small fraction of the price and time of previous technologies. In fact, a recently formed international consortium, the 1000 Genomes Project, plans to sequence the genomes of approximately 1,200 people. Comparative analysis at the sequence level of a large number of samples across multiple populations may be achievable within the next five years. These datasets also present unprecedented challenges in statistical analysis and data management. For example, a central goal of the 1000 Genomes Project is to quantify across-sample variation at the single-nucleotide level. At this resolution, small error rates in sequencing prove significant, especially for rare variants. Furthermore, sec-gen sequencing is a relatively new technology for which potential biases and sources of obscuring variation are not yet fully understood. Therefore, modeling and quantifying the uncertainty inherent in the generation of sequencing reads is of utmost importance. Properly relating this uncertainty to the true underlying variation in the genome, especially variation between and among populations, will be essential for projects that use sec-gen sequencing data to meet their scientific goals. Although genome sequencing is the application that has received the most attention, sec-gen technology is also being used to produce quantitative measurements related to applications previously associated with microarrays. Of these, chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has been the most successful. Existing tools have been developed for analyzing one sample at a time. Methodology for drawing inference from multiple samples has not yet been developed.
The demand for such methods will increase rapidly as the technology becomes more economical and multiple samples become standard. Other applications for which statistical methodology is needed are RNA and microRNA transcription analysis. In all these sequencing applications, a number of critical steps are required to convert raw intensity measures into the sequence reads that will be used in downstream analysis. Ad hoc approaches that assign weights to each base call are unsuitable. Our goal is to create a sound and unified statistical and computational methodology for representing and managing uncertainty throughout the sec-gen sequencing data analysis pipeline, built on a robust, modular, and extensible software platform.
PUBLIC HEALTH RELEVANCE: Second-generation sequencing technology is poised to radically change how genomic data is obtained and used. These datasets also present unprecedented challenges in statistical analysis, and modeling and quantifying the uncertainty inherent in the generation of sequencing reads is of utmost importance. We will develop data analysis tools for widely used applications using statistical methods that account for this uncertainty.
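The abstract's remark that small sequencing error rates become significant at single-nucleotide resolution can be illustrated with a short calculation. The sketch below is not the applicants' methodology. It assumes the widely used Phred convention for base-call quality (Q = -10 log10 P), which the abstract does not mention, and it makes the simplifying assumption that base-call errors at a site are independent with a common error rate, so that the number of miscalled reads follows a binomial distribution. The function names are hypothetical.

```python
from math import comb


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def prob_at_least_k_errors(depth: int, k: int, error_rate: float) -> float:
    """Probability that at least k of `depth` reads covering one site are miscalled.

    Assumes independent errors with the same rate for every read (a binomial
    model); real sequencing errors are often correlated and position-dependent.
    """
    # P(fewer than k errors) is the sum of the binomial terms for 0 .. k-1 errors.
    p_fewer = sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k)
    )
    return 1 - p_fewer


def expected_spurious_sites(n_sites: int, depth: int, k: int, error_rate: float) -> float:
    """Expected number of sites showing at least k miscalled reads by chance alone."""
    return n_sites * prob_at_least_k_errors(depth, k, error_rate)


if __name__ == "__main__":
    rate = phred_to_error_prob(20)  # Q20 corresponds to a 1% per-base error rate
    for k in (1, 2, 3):
        p = prob_at_least_k_errors(depth=30, k=k, error_rate=rate)
        print(f"at least {k} miscalled reads at 30x depth: {p:.4g}")
    # Scale the per-site probability to a hypothetical one million sites.
    print(expected_spurious_sites(1_000_000, depth=30, k=2, error_rate=rate))
```

Under this toy model, even a per-site probability that looks negligible turns into many spurious candidate variants once it is multiplied across a genome's worth of sites, which is why a rare variant supported by only a few reads is hard to distinguish from error without an explicit model of uncertainty.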
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Rafael Angel Irizarry
Other grants by Rafael Angel Irizarry
Next Generation Computational Tools for Functional Genomics
- Grant number: 9979396
- Fiscal year: 2020
- Amount: $322,100
- Project category:
Next Generation Computational Tools for Functional Genomics
- Grant number: 10666501
- Fiscal year: 2020
- Amount: $322,100
- Project category:
Next Generation Computational Tools for Functional Genomics
- Grant number: 10267687
- Fiscal year: 2020
- Amount: $322,100
- Project category:
Next Generation Computational Tools for Functional Genomics
- Grant number: 10448436
- Fiscal year: 2020
- Amount: $322,100
- Project category:
Data Analysis Tools for Emerging High-Throughput Technologies
- Grant number: 10461727
- Fiscal year: 2019
- Amount: $322,100
- Project category:
Data Analysis Tools for Emerging High-Throughput Technologies
- Grant number: 9922327
- Fiscal year: 2019
- Amount: $322,100
- Project category:
Data Analysis Tools for Emerging High-Throughput Technologies
- Grant number: 10159937
- Fiscal year: 2019
- Amount: $322,100
- Project category:
Data Analysis Tools for Emerging High-Throughput Technologies
- Grant number: 10612937
- Fiscal year: 2019
- Amount: $322,100
- Project category:
Biomedical Data Science Online Curriculum on HarvardX
- Grant number: 8829975
- Fiscal year: 2014
- Amount: $322,100
- Project category:
Biomedical Data Science Online Curriculum on HarvardX
- Grant number: 9130901
- Fiscal year: 2014
- Amount: $322,100
- Project category:
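Similar NSFC (National Natural Science Foundation of China) grants
Research on Disease-Associated Microbe Prediction Models and Algorithms Based on an Adaptive Hierarchical Multi-Layer Graph Attention Mechanism
- Grant number:
- Approval year: 2022
- Amount: CNY 540,000
- Project category: General Program
Research on Disease-Associated Microbe Prediction Models and Algorithms Based on an Adaptive Hierarchical Multi-Layer Graph Attention Mechanism
- Grant number: 62272064
- Approval year: 2022
- Amount: CNY 540,000
- Project category: General Program
Research on Key Algorithms for Attention-Guided Precise Indoor Positioning in Complex Scenes
- Grant number: 62102459
- Approval year: 2021
- Amount: CNY 240,000
- Project category: Young Scientists Fund
Research on Key Algorithms for Attention-Guided Precise Indoor Positioning in Complex Scenes
- Grant number:
- Approval year: 2021
- Amount: CNY 300,000
- Project category: Young Scientists Fund
Research and Application of Deep Sparse Graph Attention Convolutional Ensemble Models and Incremental Learning Algorithms for Power System Security Assessment
- Grant number:
- Approval year: 2020
- Amount: CNY 550,000
- Project category: General Program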
Similar overseas grants
Deep Learning Based Natural Language Processing Markers of Anxiety and Depression
- Grant number: 10723819
- Fiscal year: 2023
- Amount: $322,100
- Project category:
Personalized Profiles of Pathology in Pediatric Traumatic Brain Injury
- Grant number: 10542834
- Fiscal year: 2022
- Amount: $322,100
- Project category:
Personalized Profiles of Pathology in Pediatric Traumatic Brain Injury
- Grant number: 10377732
- Fiscal year: 2022
- Amount: $322,100
- Project category:
A Mobile Game for Domain Adaptation and Deep Learning in Autism Healthcare
- Grant number: 10596139
- Fiscal year: 2021
- Amount: $322,100
- Project category:
A Mobile Game for Domain Adaptation and Deep Learning in Autism Healthcare
- Grant number: 10443542
- Fiscal year: 2021
- Amount: $322,100
- Project category: