Modeling gene expression in yeast using large degenerate libraries
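Basic information
- Grant number: 10172925
- Principal investigator:
- Amount: $350,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2018
- Funding country: United States
- Duration: 2018-08-01 to 2023-05-31
- Project status: Completed
- Source: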
- Keywords: 3' Untranslated Regions; 5' Untranslated Regions; Affect; Alternative Splicing; Binding Sites; Biological; Biological Assay; Biology; Cells; Chemicals; Complex; Computer Models; DNA; DNA Binding; Data; Data Set; Disease; Elements; Engineering; Ensure; Eukaryota; Gene Expression; Gene Expression Process; Gene Library; Genes; Genetic Transcription; Genetic Variation; Genome; Genome engineering; Genotype; Growth; Human; Human Genetics; Human Genome; Individual; Introns; Investigation; Knowledge; Lead; Learning; Libraries; Messenger RNA; Metabolic Pathway; Modeling; Mutation; Nucleic Acid Regulatory Sequences; Nucleotides; Organism; Pharmaceutical Preparations; Phenotype; Process; Property; Proteins; RNA; RNA Binding; RNA Splicing; RNA-Binding Proteins; Regulation; Regulatory Element; Reporter Genes; Research; Saccharomyces cerevisiae; Source; Specific qualifier value; Sum; Synthetic Genes; Testing; Training; Translating; Translations; Untranslated RNA; Untranslated Regions; Validation; Variant; Work; Yeasts; base; combinatorial; convolutional neural network; deep learning; design; fitness; genetic regulatory protein; human disease; improved; member; metabolic engineering; next generation sequencing; novel; novel sequencing technology; predictive modeling; promoter; protein expression; scale up; synthetic biology; tool
PROJECT SUMMARY
Short sequence elements in DNA and RNA determine the levels and composition of mRNAs and proteins,
making it critical that we can accurately model how any given sequence will affect transcription, splicing or
translation. Such models of cis-regulation will fill in gaps in our knowledge of these core gene expression
processes. Additionally, as large numbers of human genomes are sequenced, the ability to predict the effects
of sequence variation on the ultimate levels of proteins will be integral to the interpretation of variation in
regulatory sequences. Similarly, the construction of metabolic pathways with defined levels of expression and
the engineering of synthetic gene networks require accurate knowledge of how regulatory sequences affect
expression. This application seeks to use the yeast Saccharomyces cerevisiae as a test case for learning how
any short regulatory sequence affects protein levels. A predictive model will be trained on a set of libraries two
orders of magnitude more complex than have been characterized to date. Libraries will be generated of a
growth reporter gene with a million random sequences of 50 nucleotides that comprise either a DNA element
that regulates transcription or an RNA element that regulates splicing or translation. The libraries will be
transformed into yeast, and the yeast will be placed under selection such that they grow according to the ability
of each random sequence to contribute to protein expression. A convolutional neural network approach will be
used to learn the relationship between these “fitness” phenotypes and their associated genotypes. Although
yeast is a single-celled eukaryote, it has been the source of most of the original findings on gene expression,
and these findings form the basis for much of our knowledge of more complex eukaryotes. Furthermore, the
short sequences in yeast that comprise the DNA- and RNA-binding sites of regulatory proteins tend to be
comparable in size to those of other organisms. Yeast is used often in synthetic biology and metabolic
engineering, and the work proposed here will result in novel tools for quantitatively controlling its gene
expression. Initial results with a library of 5' untranslated regions (UTRs) indicate that we can construct a
model to account for a large fraction of the observed variability in expression, and that the model extends to
native sequence elements. The model allowed us to forward engineer 5' UTRs to have increased activity.
Specific aims of this application are to assess the effects of random sequences targeted to upstream
regulatory elements, core promoter elements, 5' UTRs, introns and 3' UTRs; to learn predictive and
interpretable models using convolutional neural networks and to identify novel functional cis-regulatory
elements; and to validate our models on native sequences and combinatorial libraries, and by engineering
synthetic sequence elements with user-specified properties. In sum, the proposal seeks to construct a
comprehensive and predictive model of regulatory sequence–function relationships for a well-studied single-
celled eukaryote, providing a basis for similar studies on other organisms.
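
The assay logic described in the summary (random 50-nucleotide variants, growth under selection, and a model trained on sequence and "fitness") can be illustrated with a minimal sketch. It is not the project's actual pipeline: the function names, the pseudocount, the read counts, and the choice of log2 enrichment as a fitness proxy are all assumptions made for illustration.

```python
import math
import random

ALPHABET = "ACGT"

def random_library(n_variants, length=50, seed=0):
    """Draw n_variants random DNA sequences of the given length (toy stand-in for a degenerate library)."""
    rng = random.Random(seed)
    return ["".join(rng.choice(ALPHABET) for _ in range(length))
            for _ in range(n_variants)]

def one_hot(seq):
    """Encode a sequence as rows of 4 values (A, C, G, T), a common input format for a CNN."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq]

def log2_enrichment(pre_counts, post_counts, pseudocount=0.5):
    """Hypothetical fitness proxy: log2 ratio of post- to pre-selection read frequency."""
    n = len(pre_counts)
    pre_total = sum(pre_counts.values()) + pseudocount * n
    post_total = sum(post_counts.get(s, 0) for s in pre_counts) + pseudocount * n
    scores = {}
    for seq, pre in pre_counts.items():
        pre_freq = (pre + pseudocount) / pre_total
        post_freq = (post_counts.get(seq, 0) + pseudocount) / post_total
        scores[seq] = math.log2(post_freq / pre_freq)
    return scores

# Made-up read counts: every variant starts equal, then selection shifts abundance.
library = random_library(4)
pre = {seq: 100 for seq in library}
post = {library[0]: 400, library[1]: 100, library[2]: 100, library[3]: 25}

fitness = log2_enrichment(pre, post)   # training labels (one score per variant)
features = one_hot(library[0])         # training input for one variant (50 x 4)
```

In such a setup, the one-hot matrices would serve as model inputs and the enrichment scores as regression targets; the network architecture itself is left out because the summary does not specify it.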
Project outcomes
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Effects of sequence motifs in the yeast 3' untranslated region determined from massively parallel assays of random sequences.
- DOI: 10.1186/s13059-021-02509-6
- Published: 2021-10-18
- Journal:
- Impact factor: 12.3
- Authors: Savinov A; Brandsen BM; Angell BE; Cuperus JT; Fields S
- Corresponding author: Fields S
Other grants by STANLEY FIELDS
INTERROGATION OF E3 UBIQUITIN LIGASE CATALYSIS BY DEEP MUTATIONAL SCANNING
- Grant number: 8365800
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
GENOME-WIDE ANALYSIS OF NASCENT TRANSCRIPTION IN SACCHAROMYCES CEREVISIAE
- Grant number: 8365819
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
MASSIVELY PARALLEL MEASUREMENT OF SRC KINASE ACTIVITY AND DRUG RESISTANCE IN VIV
- Grant number: 8365921
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
UNDERSTANDING THE MOLECULAR BASIS OF SELECTIVITY IN AKAP
- Grant number: 8365785
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
HIGH-RESOLUTION MAPPING OF PROTEIN SEQUENCE-FUNCTION RELATIONSHIPS
- Grant number: 8365920
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
LARGE SCALE MEASUREMENT OF EPISTASIS TO IDENTIFY MUTATIONS THAT STABILIZE PROTEI
- Grant number: 8365793
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
WIDE VARIATION IN ANTIBIOTIC RESISTANCE PROTEINS IDENTIFIED BY FUNCTIONAL METAGE
- Grant number: 8365808
- Fiscal year: 2011
- Funding amount: $350,900
- Project category:
Similar overseas grants
Emerging mechanisms of viral gene regulation from battles between host and SARS-CoV-2
- Grant number: 10725416
- Fiscal year: 2023
- Funding amount: $350,900
- Project category:
Regulation of RNA sensing and viral restriction by RNA structures
- Grant number: 10667802
- Fiscal year: 2023
- Funding amount: $350,900
- Project category:
Mechanisms of viral RNA maturation by co-opting cellular exonucleases
- Grant number: 10814079
- Fiscal year: 2023
- Funding amount: $350,900
- Project category: