Text Analytics, Machine Learning & Biomedical Data Science
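Basic Information
- Grant number: 8941588
- Principal investigator:
- Amount: $2.4103 million
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Ongoing
- Source: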
- Keywords: 3-Dimensional; Address; Aging; Algorithms; Alzheimer's Disease; Annual Reports; Applications Grants; Archives; Big Data; Biological; Biological Assay; Biological Markers; Biomedical Research; Cataloging; Catalogs; Categories; Cells; Classification; Clinical; Clinical Data; Clinical Informatics; Code; Collaborations; Collection; Communication; Communities; Complex; Computational Linguistics; Computers; Computing Methodologies; Coupled; Critiques; Data; Data Analyses; Data Set; Databases; Development; Diagnosis; Discipline; Disease Association; Drosophila genome; Engineering; Epidemiologic Studies; Epidemiology; Evaluation; Evidence Based Medicine; Extramural Activities; Feedback; Fostering; Funding; Funding Agency; Gene Expression Profiling; Genes; Genomics; Goals; Grant Review Process; Guidelines; High-Throughput Nucleotide Sequencing; Human; Image; Image Analysis; Imagery; Immunoglobulin Genes; Individual; Informatics; Information Resources; Information Sciences; Investigation; Job Description; Knowledge; Leadership; Learning; Location; Lysosomes; Machine Learning; Management Information Systems; Measures; Medical; Melissa; Methodology; Methods; Metric; Modeling; Molecular; Molecular Bank; Mouth Diseases; National Institute of Allergy and Infectious Disease; National Institute of Dental and Craniofacial Research; National Institute of Diabetes and Digestive and Kidney Diseases; National Institute of Drug Abuse; National Institute of Neurological Disorders and Stroke; Natural Language Processing; Occupational; Online Systems; Pattern; Peer Review; Perception; Pilot Projects; Plasma; Plasmids; Production; Proteins; Proteomics; Protocols documentation; Reporting; Research; Research Infrastructure; Research Personnel; Research Project Grants; Resource Sharing; Retrieval; Running; Saliva; Salivary Proteins; Sampling; Science; Scientist; Semantics; Social Network; Software Engineering; Stimulus; Study Section; Surveys; System; Systemic disease; Systems Biology; Techniques; Technology; Text; Training; Transcript; Transcription Initiation Site; Transgenic Organisms; Translational Research; United States National Institutes of Health; V(D)J Recombination; Work; base; behavioral/social science; biological systems; biomedical informatics; cancer epidemiology; cluster computing; computer science; computing resources; data management; data mining; data sharing; design; evidence base; experience; fluorescence imaging; high throughput screening; improved; information organization; innovation; insight; interoperability; knowledge base; macrophage; neuroimaging; novel; optogenetics; peer; programs; prototype; repository; research study; screening; social science research; text searching; tool
Project Abstract
The Text Analytics, Machine Learning, and Biomedical Data Science program, which operates within the Collaborative Research Office in Computer and Information Science (CROCIS), Division of Computational Bioscience of CIT, is collaborating with NIH investigators to build a critical mass in text and numerical analytics. The program is envisioned to encompass a number of related disciplines in biomedical research, including semantic interoperability, knowledge engineering, computational linguistics, text and data mining, natural language processing, machine learning, and visualization. It is intended to foster advances in critical domains at NIH, including biomedical and clinical informatics, translational research, genomics, proteomics, systems biology, "big data" analysis, and portfolio analysis. In 2013, collaborative efforts in support of these goals included the following.
- In collaboration with NIAID, CROCIS is developing a new algorithm capable of analyzing V(D)J recombination in thousands of immunoglobulin gene sequences produced by high throughput sequencing.
- CROCIS is working with Melissa Friesen of NCI to develop methodologies to improve exposure classification in occupational epidemiologic studies. The initial effort in this collaboration involves a tool that helps experts classify free-text job descriptions into standard occupational codes. Machine-learning-based classification methods will also be used to help evaluate exposure-disease associations.
- In collaboration with NINDS, CROCIS has implemented and compared several methods to locate and characterize lysosomes in 3-D fluorescence images. The goal is to be able to calculate the pH of each lysosome in the image, for which the ability to resolve their locations is an important step.
- In collaboration with NIA, we are applying machine learning and visualization techniques to large biological datasets to discover novel patterns of functional gene or protein interactions related to aging. Omnimorph, a graphic data analysis tool, is being developed for multidimensional data visualization. In this collaboration, we are also developing a model to predict the progression of Alzheimer's disease using plasma proteomic biomarker data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).
- Machine-learning methods have been devised and implemented to identify and refine transcription start sites in the fruit fly genome found using cap analysis gene expression (CAGE). This effort is in collaboration with Brian Oliver of NIDDK.
- CROCIS is collaborating with NIAID in developing an image analysis pipeline to quantify individual transcript molecules in macrophage cells to help understand the molecular mechanism of macrophage adaptation to various stimuli at the single-cell level.
- A freely available plasmid database that is interoperable with popular freeware is currently being developed for the NIDA Optogenetics and Transgenic Technology Core. The plasmid database offers a versatile yet simple platform for scientists to store and analyze their plasmid data. Motivated by the need for a more comprehensive approach to archiving plasmid data, the database platform is enriched with numerous components beyond the repository, serving as an informatics platform designed to enhance the efficiency and analytic capabilities of scientists.
- In collaboration with CSR, CROCIS is applying text analytics to provide CSR leadership with evidence-based decision support in evaluating the grant review process. The effort so far has concentrated on exploratory analysis of the NIH portfolio to evaluate clustering methods and assess intrinsic measures of cluster quality. Content-based application referral tools are being developed to help evaluate the merit of PIs' study section requests, and to recommend the most suitable study section for an application if no request is made. In addition, CROCIS is analyzing text from quick feedback surveys on peer review. This effort includes a pilot study to evaluate the feasibility of analyzing free text from peer reviewers on their perception of study section quality. If successful, the pilot results will be used as initial input for a full-scale implementation.
- CROCIS has been collaborating with the Molecular Libraries Program (MLP), part of the NIH Common Fund, to develop the Common Assay Reporting System (CARS). CARS is an integrated system for managing bioassay information and facilitating communication between all the high-throughput screening centers within the Molecular Libraries Probe Production Centers Network (MLPCN). Goals for this collaboration include: 1) Track project status and related issues at each of the screening centers within the MLPCN, and provide the means for information collection, sharing and retrieval among the centers and the program office at NIH. 2) Establish a standardized protocol to describe raw data from the experiments and report screening data to the scientific community.
- The human salivary protein catalog has been made available online on a community-based Web portal developed by CROCIS, in collaboration with NIDCR, to enable scientists to add their own research data, share results, and discover new knowledge. This is a major step towards the discovery and use of saliva biomarkers to diagnose oral and systemic diseases.
- CROCIS investigators worked with the Office of Extramural Research (OER) on applying machine-learning methods to identify important terms that peer reviewers use to describe innovative applications. The goal of the effort was to develop a lexicon of terms that can help estimate the innovation level of a grant application based on the peer review critiques in the application's NIH Summary Statement.
- Although the scientific impact of NCI consortia on the advancement of cancer epidemiology research is understood to be significant, program leadership needs accurate quantitative metrics of this impact. We are developing methods to track citations to clinical guidelines in the context of evidence-based medicine, which could give funding agencies and program directors insight into individual consortia's contributions to advancing medical knowledge. This work is being conducted in collaboration with the Epidemiology and Genomics Research Program (EGRP), NCI.
- Based on its experience in building novel models for classifying research grants and projects, CROCIS is collaborating with DPCPSI/OD and other ICs to develop the Portfolio Learning Tool, a comprehensive classification workflow system that will allow users to select from multiple classification algorithms, feature spaces, and training regimes, to build and run their own classifiers. A particular prototype of this system is being tailored to assist NCI Intramural investigators in reporting their research to the Annual Report system. CROCIS has been developing an augmented support vector machine (SVM) that augments a training set by sampling from a corpus of unknowns and runs a large ensemble on various samples of this augmented space. The results obtained from this classifier suggest that, when coupled with an effective annotation strategy, such a classifier can be quite effective at categorizing a research portfolio.
- The Office of Behavioral and Social Sciences Research (OBSSR) is conducting a pilot investigation in collaboration with CROCIS to evaluate the efficacy of machine learning models for the classification of five BSSR-relevant research categories.
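The CSR collaboration above mentions assessing intrinsic measures of cluster quality. As an illustration only, the sketch below computes one common intrinsic measure, the mean silhouette coefficient, in plain Python. The abstract does not say which measures CROCIS used; the function name `silhouette_score`, the choice of Euclidean distance, and the toy points are assumptions made for this example.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Hypothetical helper for illustration; not taken from the project.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Toy data (assumed): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
tight = silhouette_score(points, [0, 0, 1, 1])  # labels follow the two groups
mixed = silhouette_score(points, [0, 1, 0, 1])  # labels cut across the groups
print(f"tight={tight:.3f} mixed={mixed:.3f}")
```

Scores near 1 indicate compact, well-separated clusters, and scores below 0 indicate points that sit closer to another cluster than to their own. Because the measure needs no reference labels, it can be used to compare clustering methods on an unlabeled portfolio.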
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Calvin A. Johnson
Text Analytics, Knowledge Engineering, & High Performance Computing
- Grant number: 8565613
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
High Performance Biomedical Computing And Informatics
- Grant number: 7593224
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
Text Analytics, Machine Learning & High Performance Computing
- Grant number: 8746900
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
High Performance Biomedical Computing And Informatics
- Grant number: 7145121
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
Informatics, Machine Learning & Biomedical Data Science
- Grant number: 9146134
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
High Performance Biomedical Computing And Informatics
- Grant number: 6988044
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
High Performance Biomedical Computing And Informatics
- Grant number: 7733758
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
High Performance Biomedical Computing And Informatics
- Grant number: 6988075
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
Collective Intelligence, Knowledge Infrastructure, & High Performance Computing
- Grant number: 7970400
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
High Performance Biomedical Computing And Informatics
- Grant number: 7296863
- Fiscal year:
- Funding amount: $2.4103 million
- Project category:
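Similar grants from the National Natural Science Foundation of China
Research on neuromorphic vision object recognition algorithms driven by spatiotemporal sequences
- Grant number: 61906126
- Approval year: 2019
- Funding amount: CNY 240,000
- Project category: Young Scientists Fund
Ontology-driven spatial semantic modeling of address data and address matching methods
- Grant number: 41901325
- Approval year: 2019
- Funding amount: CNY 220,000
- Project category: Young Scientists Fund
Optimized design of address mapping tables and memory access optimization for large-capacity solid-state drives
- Grant number: 61802133
- Approval year: 2018
- Funding amount: CNY 230,000
- Project category: Young Scientists Fund
Research on IP-address-driven multipath routing and traffic transmission control
- Grant number: 61872252
- Approval year: 2018
- Funding amount: CNY 640,000
- Project category: General Program
Research on memory safety defense techniques for objects targeted by memory attacks
- Grant number: 61802432
- Approval year: 2018
- Funding amount: CNY 250,000
- Project category: Young Scientists Fund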
Similar overseas grants
Fluency from Flesh to Filament: Collation, Representation, and Analysis of Multi-Scale Neuroimaging data to Characterize and Diagnose Alzheimer's Disease
- Grant number: 10462257
- Fiscal year: 2023
- Funding amount: $2.4103 million
- Project category:
Core D: Integrated Computational Analysis Core
- Grant number: 10555896
- Fiscal year: 2023
- Funding amount: $2.4103 million
- Project category:
The contribution of air pollution to racial and ethnic disparities in Alzheimer’s disease and related dementias: An application of causal inference methods
- Grant number: 10642607
- Fiscal year: 2023
- Funding amount: $2.4103 million
- Project category:
p16INK4a+ fibroblasts regulate epithelial regeneration after injury in lung alveoli through the SASP
- Grant number: 10643269
- Fiscal year: 2023
- Funding amount: $2.4103 million
- Project category:
Commercial translation of high-density carbon fiber electrode arrays for multi-modal analysis of neural microcircuits
- Grant number: 10761217
- Fiscal year: 2023
- Funding amount: $2.4103 million
- Project category: