Maintenance and Development of RepeatMasker and RepeatModeler
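Basic Information
- Grant number: 8235956
- Principal investigator:
- Amount: $397,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2003
- Funding country: United States
- Project period: 2003-08-15 to 2014-01-31
- Project status: Completed
- Source: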
- Keywords: Algorithms; Amino Acid Sequence; Animals; Bioinformatics; Birds; Classification; Code; Complex; Computer software; Consensus; Consensus Sequence; DNA; DNA Maintenance; DNA Transposable Elements; Databases; Detection; Development; Elements; Evolution; Family; Genbank; Genome; Genomics; Grant; Hereditary Disease; Human Genome; Label; Length; Libraries; Link; Maintenance; Masks; Metadata; Mutation; Organism; Output; Peptide Sequence Determination; Performance; Phylogenetic Analysis; Phylogeny; Population Study; Process; Property; Protein Databases; Quality Control; Research; Resources; Sequence Analysis; Series; Source Code; Speed; Time; United States National Institutes of Health; Use of New Techniques; Variant; base; computerized tools; design; genome sequencing; homologous recombination; improved; mammalian genome; open source; programs; public health relevance; tool; web page; web site
Project Abstract
DESCRIPTION (provided by applicant): Most eukaryotic genomes include vast numbers of interspersed repeats (IRs), which are the remnants of mostly selfishly amplified transposable elements. Transposable elements have an exceptionally wide-ranging mutagenic effect on genomes, while recognition of IRs provides unparalleled information on genome evolution and is crucial in many aspects of bioinformatics. This grant would continue support for the maintenance and further development of RepeatMasker, a computational tool that has become the de facto standard for identification and characterization of IRs, and support the development of RepeatModeler, a program designed to derive RepeatMasker-grade databases of IR consensus sequences. The source code for these tools is freely available to the public. Development will emphasize the following:
a) As sequencing of new vertebrate species continues to accelerate, further development of the de novo repeat identification program RepeatModeler is a priority. Unlike other such programs in development, RepeatModeler is specifically geared toward the analysis of mammalian and bird genomes.
b) Now that the RepeatMasker code has been completely refactored, the emphasis of its development shifts toward increasing its sensitivity and accuracy and adding options such as a mode for analyzing low-coverage assemblies and recognition of chimeric elements created by homologous recombination.
c) Maintaining the DNA consensus sequence database with its many RepeatMasker-specific metadata, the Transposable Element protein database, and the website, which hosts among other things a growing number of pre-annotated genomes, will take an effort that is more likely to grow than to shrink.
d) We aim to further automate and refine the process of "phylogenetic labeling" of consensus sequences in the library, and to expand the databases with refined sets of subfamily sequences, which will make it possible to predict potentially polymorphic elements and to date older insertions precisely.
As part of our efforts to increase the sensitivity and speed of the RepeatMasker program, we propose to develop an improved search engine, starting from the open source BLASTZ code.
PUBLIC HEALTH RELEVANCE: RepeatMasker is the default tool to annotate the repetitive portion of complex genomes like the human genome, which is an essential and standard process in any genomic sequence analysis. In recent years it has become clear that interspersed repeats are responsible for a large fraction of "copy number" or structural allelic variations like large deletions, duplications and insertion of foreign DNA, which are far more common than previously assumed and are much more likely than small mutations to be associated with phenotypic differences and genetic diseases.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Arian Fredericus Anthonius Smit
Further development of the FEAST software, and its use for novel gene predictions
- Grant number: 7287974
- Fiscal year: 2007
- Funding amount: $397,700
- Project category:
Further development of the FEAST software, and its use for novel gene predictions
- Grant number: 7668499
- Fiscal year: 2007
- Funding amount: $397,700
- Project category:
Further development of the FEAST software, and its use for novel gene predictions
- Grant number: 7473268
- Fiscal year: 2007
- Funding amount: $397,700
- Project category:
Development and Maintenance of RepeatMasker
- Grant number: 9905539
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Maintenance and Development of RepeatMasker
- Grant number: 7288796
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Maintenance and Development of RepeatMasker and GESTALT
- Grant number: 6912723
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Development and Maintenance of RepeatMasker
- Grant number: 8697975
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Maintenance and Development of RepeatMasker and GESTALT
- Grant number: 6788158
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Maintenance and Development of RepeatMasker and RepeatModeler
- Grant number: 7785285
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Maintenance and Development of RepeatMasker and GESTALT
- Grant number: 6676869
- Fiscal year: 2003
- Funding amount: $397,700
- Project category:
Similar grants from the National Natural Science Foundation of China (NSFC)
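Novel enzyme design and molecular evolution of D-amino acid ammonia-lyase based on ancestral sequence reconstruction
- Grant number: 32271536
- Approval year: 2022
- Funding amount: CNY 540,000
- Project category: General Program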
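Synthesis of high-molecular-weight sequence-defined poly(amino acid)s by templated cocrystal polymerization
- Grant number:
- Approval year: 2022
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund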
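Synthesis of high-molecular-weight sequence-defined poly(amino acid)s by templated cocrystal polymerization
- Grant number: 22201105
- Approval year: 2022
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund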
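Novel enzyme design and molecular evolution of D-amino acid ammonia-lyase based on ancestral sequence reconstruction
- Grant number:
- Approval year: 2022
- Funding amount: CNY 540,000
- Project category: General Program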
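Mechanism by which a C-terminal 40-amino-acid insertion sequence enhances the transcriptional efficiency of the bacterial fatty acid metabolism regulator FadR
- Grant number: 82003257
- Approval year: 2020
- Funding amount: CNY 240,000
- Project category: Young Scientists Fund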
Similar overseas grants
Valens-Poly Sequencing Polyclonal Antibodies for Drug Discovery
- Grant number: 8981571
- Fiscal year: 2012
- Funding amount: $397,700
- Project category:
Structural bioinformatics software for epitope selection and antibody engineering
- Grant number: 9009304
- Fiscal year: 2012
- Funding amount: $397,700
- Project category:
Structural bioinformatics software for epitope selection and antibody engineering
- Grant number: 8781169
- Fiscal year: 2012
- Funding amount: $397,700
- Project category:
Valens-Poly Sequencing Polyclonal Antibodies for Drug Discovery
- Grant number: 9150642
- Fiscal year: 2012
- Funding amount: $397,700
- Project category:
Metagenomic detection of emerging viruses in the blood supply
- Grant number: 8207225
- Fiscal year: 2011
- Funding amount: $397,700
- Project category: