IIBR: Informatics: Toward an Automated RNA-seq Bioinformatician
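Basic Information
- Award number: 1937540
- Principal investigator:
- Amount: $546,100
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Period: 2020-06-01 to 2023-05-31
- Status: Completed
- Source:
- Keywords:
Project Abstract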
Measurement of gene expression (which genes are active in which conditions) is an indispensable tool for understanding biological systems. Analysis of gene expression from modern genomic sequencing technologies requires the use of sophisticated software such as read mappers, transcript assemblers, and expression abundance estimators. A software program implementing one of these steps typically has a large number of user-settable parameters that influence how the analysis algorithm performs. Scientists, biologists, and clinical researchers must often tune these parameters by hand or through other ad hoc means. The goal of this project is to automate this process by designing and implementing a framework for automatically learning high-performing parameters for gene expression analysis software. This project also aims to develop algorithms, software, and methodology to make this framework practical and useful. This will allow more researchers to obtain high-quality gene expression analyses with significantly less effort and will also enable improved analysis of large data sets where per-sample parameter tuning by hand is impractical. Reproducibility of biological results will also be enhanced since the choice of parameters is explicitly ceded to an automated, repeatable process. This research will make biological studies involving gene expression more accurate and less costly. A number of educational and outreach activities for various levels of students (elementary through undergraduate) are planned to enhance community understanding of gene expression and its analysis.

The developed processes will be implemented in several wrapper tools for parameter optimization that can be dropped into existing RNA-seq analysis pipelines to improve accuracy at each step. The research to design these tools will be broken down into several more tractable steps. The first step will be learning, for each tool, a collection of representative parameter vectors by analyzing large collections of existing RNA-seq samples. In the second step, machine learning methods, based on a combination of techniques such as Bayesian optimization, genetic algorithms, and classification approaches, will be used to design techniques to select parameter vectors from these sets that are predicted to offer high performance. In the third step, techniques for providing human-interpretable rationales for the automatic parameter choices will be designed and implemented. The design of this system will also enhance our practical knowledge of techniques for such parameter optimization in other application domains within biology. Results from the project can be found [...]

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (12)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Revisiting the complexity of and algorithms for the graph traversal edit distance and its variants
- DOI: 10.1186/s13015-024-00262-6
- Published: 2024-04-29
- Journal:
- Impact factor: 1
- Authors: Qiu, Yutong; Shen, Yihang; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Creating and Using Minimizer Sketches in Computational Genomics
- DOI: 10.1089/cmb.2023.0094
- Published: 2023-08-30
- Journal:
- Impact factor: 1.7
- Authors: Zheng, Hongyu; Marcais, Guillaume; Kingsford, Carl
- Corresponding author: Kingsford, Carl
How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design
- DOI: 10.1145/3406325.3451036
- Published: 2021-06
- Journal:
- Impact factor: 0
- Authors: Maria-Florina Balcan; Dan F. DeBlasio; Travis Dick; Carl Kingsford; T. Sandholm; Ellen Vitercik
- Corresponding authors: Maria-Florina Balcan; Dan F. DeBlasio; Travis Dick; Carl Kingsford; T. Sandholm; Ellen Vitercik
Reinforcement Learning for Robotic Liquid Handler Planning
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Ferdosi, Mohsen; Ge, Yuejun; Kingsford, Carl
- Corresponding author: Kingsford, Carl
Personalized Neural Architecture Search for Federated Learning
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Minh Hoang; Carleton Kingsford
- Corresponding authors: Minh Hoang; Carleton Kingsford
Other Grants by Carleton Kingsford
Conference: NSF-NIH Joint Workshop on Foundational AI in Biology
- Award number: 2325301
- Fiscal year: 2023
- Amount: $546,100
- Award type: Standard Grant
III: Small: Expressiveness of Genome Graphs: Construction, Comparison, and Heterogeneity
- Award number: 2232121
- Fiscal year: 2023
- Amount: $546,100
- Award type: Standard Grant
Workshop on Future Directions for Algorithms in Biology
- Award number: 1748493
- Fiscal year: 2017
- Amount: $546,100
- Award type: Standard Grant
AF: Small: Multiscale Spectral Signatures for Local and Multi-objective Biological Network Alignment
- Award number: 1319998
- Fiscal year: 2013
- Amount: $546,100
- Award type: Standard Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
- Award number: 1256087
- Fiscal year: 2012
- Amount: $546,100
- Award type: Continuing Grant
CAREER: Model-based Reconstruction of Ancient Biological Networks
- Award number: 1053918
- Fiscal year: 2011
- Amount: $546,100
- Award type: Continuing Grant
Similar National Natural Science Foundation of China (NSFC) Grants
2023 (4th) International Symposium on Biomathematics and Medical Applications
- Award number: 12342004
- Approval year: 2023
- Amount: CNY 80,000
- Award type: Special Project
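Bioinformatics Study of How Mutations and Modifications Reshape Protein Subcellular Localization
- Award number: 32370698
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program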
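Knowledge-Guided and Data-Driven Informatics Models and Applications for Identifying Key Regulatory Signaling Pathways in Intrahepatic Cholangiocarcinoma
- Award number: 32370694
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program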
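Construction and Validation of a Bioinformatics-Based Frailty Prediction Model for Patients with Rheumatoid Arthritis
- Award number: 82301786
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund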
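Bioinformatics Methods for Predicting Protein and Long Non-coding RNA Interactions Based on Structural Representations
- Award number: 62373216
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program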
Similar Overseas Grants
TRiPOD: Toward Reusable Phenotypes in Observational Data for AD/ADRD - managing definitions and correcting bias
- Award number: 10642888
- Fiscal year: 2021
- Amount: $546,100
- Award type:
CRII: HCC: Practical Steps Toward Integrating the Tools of Emergency Management with Crisis Informatics Techniques
- Award number: 2105069
- Fiscal year: 2021
- Amount: $546,100
- Award type: Standard Grant
TRiPOD: Toward Reusable Phenotypes in Observational Data for AD/ADRD - managing definitions and correcting bias
- Award number: 10279554
- Fiscal year: 2021
- Amount: $546,100
- Award type:
Developing a kit-based research use only (RUO) translocation assay for deployment as a lab developed test (LDT) toward changing outcomes for patients with driver-negative tumors
- Award number: 10678597
- Fiscal year: 2020
- Amount: $546,100
- Award type:
Toward improved understanding of sex differences in drug response: developing gene and pathway-based informatics methods to examine sex-differential genetic effects
- Award number: 10181072
- Fiscal year: 2019
- Amount: $546,100
- Award type: