CAREER: A Theoretical Exploration of Efficient and Accurate Clustering Algorithms
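Basic Information
- Award number: 2337832
- Principal investigator:
- Amount: $647,700
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Period: 2024-05-01 to 2029-04-30
- Status: Ongoing
- Source:
- Keywords:
Project Abstract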
Clustering, a fundamental technique in data analysis and machine learning, involves grouping data points so that similarities within a group (cluster) are higher than similarities across clusters. Since its inception in the early 20th century, clustering has proven highly valuable in diverse fields such as biology, economics, marketing, statistics, computer science, and social network analysis. This CAREER project aims to create a unified framework, along with various tools and techniques, for designing optimal approximation algorithms for a broad range of NP-hard clustering problems. Despite significant progress in developing efficient approximations for computationally hard clustering problems, the analysis of large-scale and complex datasets remains challenging, resulting in suboptimal accuracy and limiting practical applications in science and engineering. The algorithmic development and analysis are expected to have a direct impact on the fields of data science and bioinformatics. The educational components of the project include a 3-day summer workshop for high school students, training and mentoring of graduate students, development of new courses, and the fostering of student participation from underrepresented groups in theoretical computer science.
The project comprises two primary thrusts. The first thrust focuses on addressing clustering challenges in strings, with special emphasis on centroid-based clustering problems such as k-median and k-center. This study will encompass various metric spaces, including edit distance, Ulam, and Kendall tau. The objective is to formulate methodologies that transcend the limitations set by the triangle inequality, yielding polynomial-time algorithms with approximations arbitrarily close to one. The second thrust focuses on issues related to hierarchical clustering. Here the objective is to evaluate how well the techniques and frameworks originally devised for addressing string clustering problems can be adapted for aggregating hierarchical clusters. Additionally, the project will develop new methods for constructing hierarchical clusters specifically tailored to address challenges related to large datasets. The new tools and combinatorial methods developed in this project are expected to enhance algorithms for clustering problems and should find applicability in various other domains, including communication complexity, streaming algorithms, and distributed systems.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
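As a point of reference for the first thrust, the sketch below shows the kind of baseline that follows from the triangle inequality alone: restricting the median to the input points costs at most twice the optimum in any metric space, and the farthest-first greedy rule gives a factor-2 guarantee for k-center. It is a minimal illustration and not the project's method; the function names (`edit_distance`, `kendall_tau`, `best_input_median`, `farthest_first_centers`) are hypothetical and chosen only for this example.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def kendall_tau(p, q) -> int:
    """Number of item pairs that the two rankings order differently."""
    pos = {item: idx for idx, item in enumerate(q)}
    return sum(1 for x, y in combinations(p, 2) if pos[x] > pos[y])


def best_input_median(points, dist):
    """Input point with the smallest total distance to all inputs.

    By the triangle inequality its cost is at most twice the optimal
    median cost, which is the baseline factor the abstract refers to.
    """
    return min(points, key=lambda c: sum(dist(c, x) for x in points))


def farthest_first_centers(points, k, dist):
    """Greedy k-center: start anywhere, then repeatedly add the point
    farthest from the centers chosen so far (a 2-approximation)."""
    centers = [points[0]]
    while len(centers) < min(k, len(points)):
        centers.append(
            max(points, key=lambda x: min(dist(x, c) for c in centers))
        )
    return centers


if __name__ == "__main__":
    strings = ["abc", "abd", "xbc"]
    print(best_input_median(strings, edit_distance))
    print(farthest_first_centers(["aaaa", "aaab", "zzzz", "zzzy"], 2, edit_distance))
    print(kendall_tau([1, 2, 3], [3, 2, 1]))
```

Both routines take the distance as a parameter, so the same code runs under edit distance on strings or Kendall tau on rankings. Approximations arbitrarily close to one, as targeted in the abstract, require ideas beyond this baseline.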
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Debarati Das
Crystal engineering with isosteric triether and triamine linked aromatic tri-carboxylic acids: iso-structurality and synthon interplay in their co-crystals and salts with bis(pyridyl) derivatives
- DOI: 10.1039/c8nj04600j
- Published: 2018-12-03
- Journal:
- Impact factor: 3.3
- Authors: Debarati Das;S;ipan Roy;ipan;K. Biradha
- Corresponding author: K. Biradha
Approximate Trace Reconstruction via Median String (in Average-Case)
- DOI: 10.4230/lipics.fsttcs.2021.11
- Published: 2021-07-20
- Journal:
- Impact factor: 0
- Authors: Diptarka Chakraborty;Debarati Das;Robert Krauthgamer
- Corresponding author: Robert Krauthgamer
Modulating the effective ionic radii of trivalent dopants in ceria using a combination of dopants to improve catalytic efficiency for the oxygen evolution reaction
- DOI: 10.1039/d4ra03360d
- Published: 2024-05-28
- Journal:
- Impact factor: 3.9
- Authors: Debarati Das;Jyoti Prakash;Anisha Bandyopadhyay;Annu Balhara;U. K. Goutam;R. Acharya;S. K. Gupta;K. Sudarshan
- Corresponding author: K. Sudarshan
A single hydrogen bond that tunes flavin redox reactivity and activates it for modification
- DOI: 10.1039/d4sc01642d
- Published: 2024-04-24
- Journal:
- Impact factor: 8.4
- Authors: Debarati Das;Anne
- Corresponding author: Anne
Influence of Selenium on Growth, Antioxidants Production and Physiological Parameters of Rice (Oryza sativa L.) Seedlings and Its Possible Reversal by Coapplication of Sulphate
- DOI: 10.4236/ajps.2019.1012158
- Published: 2019-12-05
- Journal:
- Impact factor: 0
- Authors: Debarati Das;P. Seal;A. K. Biswas
- Corresponding author: A. K. Biswas
Other grants by Debarati Das
Travel: NSF Student Travel Grant for TCS for All Meeting at STOC 2023 and Professional Mentoring Panel at FOCS 2023
- Award number: 2326395
- Fiscal year: 2023
- Funding amount: $647,700
- Project type: Standard Grant
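Similar NSFC (National Natural Science Foundation of China) grants
Theory and methods of seismic exploration for complex oil and gas reservoirs based on new-generation information technology
- Award number: 42330801
- Approval year: 2023
- Funding amount: CNY 2.31 million
- Project type: Key Program
Research on theory and methods of multi-source seismic exploration jointly driven by active-source and passive-source data
- Award number: 42130805
- Approval year: 2021
- Funding amount: CNY 2.90 million
- Project type: Key Program
Research on a rock-mechanics stratigraphy theory and method system for efficient exploration and development of coal-measure gas
- Award number:
- Approval year: 2020
- Funding amount: CNY 3.00 million
- Project type: Key Program
Research on morphological feature attributes of seismic waves and intelligent identification methods
- Award number: 41904098
- Approval year: 2019
- Funding amount: CNY 250,000
- Project type: Young Scientists Fund
Feature analysis of depth-migrated seismic data and direct inversion methods in the depth domain
- Award number: 41874153
- Approval year: 2018
- Funding amount: CNY 630,000
- Project type: General Program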
Similar overseas grants
Sibling-Support for Adolescent Girls (SSAGE): A whole-family, gender-transformative approach to preventing mental illness among forcibly displaced adolescent girls
- Award number: 10730656
- Fiscal year: 2023
- Funding amount: $647,700
- Project type:
Experimental and theoretical studies of trace species in Earth materials, from mantle geodynamics to environmental applications and mineral exploration
- Award number: RGPIN-2018-04106
- Fiscal year: 2022
- Funding amount: $647,700
- Project type: Discovery Grants Program - Individual
Experimental and theoretical studies of trace species in Earth materials, from mantle geodynamics to environmental applications and mineral exploration
- Award number: RGPIN-2018-04106
- Fiscal year: 2021
- Funding amount: $647,700
- Project type: Discovery Grants Program - Individual
A theoretical exploration of the evolution of wind pollination
- Award number: 566053-2021
- Fiscal year: 2021
- Funding amount: $647,700
- Project type: Alexander Graham Bell Canada Graduate Scholarships - Master's