EAGER: Learning to Efficiently Rank with Cascades

Basic Information

Award number:
1144034
Principal investigator:
Jimmy Lin
Amount:
$150,000
Institution:
University of Maryland, College Park
Institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2011
Funding country:
United States
Period:
2011-09-01 to 2014-08-31
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1144034&HistoricalAwards=false
Keywords:
EAGER Learning Efficiently Rank Cascades

Project Abstract

Text search is undeniably vital to today's information-based societies, helping users locate relevant information in web pages, journal articles, news stories, blogs, emails, tweets, and a myriad of other sources. Naturally, users desire results that are not only good but also fast. Learning to rank, the dominant approach to information retrieval (IR) today, focuses almost exclusively on effectiveness, often neglecting the runtime speed (i.e., efficiency) of the ranking functions. This project contributes to the emerging research area of learning to efficiently rank, which aims to let algorithm designers capture, model, and reason about tradeoffs between effectiveness and efficiency in a unified framework. Specifically, this project explores a novel cascade model for retrieval, where ranking is broken into a finite number of distinct stages. Each stage considers successively richer and more complex features, but over successively smaller candidate document sets. The intuition is that although complex features are more time-consuming to compute, examining fewer documents offsets the additional overhead. In other words, the cascade model views retrieval as a multi-stage progressive refinement problem. Based on a survey of the current state of the art, this is the first project to explore this approach to the ranking problem, marking a substantial departure from previous "monolithic" ranking functions. Although exploration in this uncharted area carries some risk, this research promises to open up a new frontier in IR research. This project aims to narrow the chasm between academic and industrial IR research by bringing together theoretical IR research and practical considerations in "real-world" search. It is expected that the cascade model will be of interest to web search engine companies, thus providing a path from the exploratory research results to significant impact in production systems.
Furthermore, this work dovetails with the emerging area of green computing: more efficient algorithms use less energy, hence help reduce the environmental footprint of web-scale services. The project web site (http://www.umiacs.umd.edu/~jimmylin/projects/) includes more information about this project and will be used for the release of a prototype as part of the Ivory open-source retrieval toolkit.
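
The multi-stage progressive refinement described in the abstract can be illustrated with a minimal Python sketch. This is not the project's implementation or the Ivory toolkit's API; the names `Stage`, `cascade_rank`, `term_overlap`, and `phrase_bonus`, along with the per-document cost figures, are hypothetical and chosen only for illustration. Each stage scores the surviving candidates with its own feature function, keeps the top few, and hands them to the next, more expensive stage.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A scorer maps (query, document) to a relevance score.
Scorer = Callable[[str, str], float]


@dataclass
class Stage:
    scorer: Scorer  # feature function used at this stage (hypothetical)
    keep: int       # number of candidates passed on to the next stage
    cost: float     # assumed per-document cost, used only for accounting


def cascade_rank(
    query: str, docs: Sequence[str], stages: Sequence[Stage]
) -> Tuple[List[str], float]:
    """Rank docs through successive stages; return survivors and total cost."""
    candidates = list(docs)
    total_cost = 0.0
    for stage in stages:
        # Every surviving candidate is scored once at this stage.
        total_cost += stage.cost * len(candidates)
        ranked = sorted(
            candidates, key=lambda d: stage.scorer(query, d), reverse=True
        )
        # Prune: only the top `keep` documents reach the next stage.
        candidates = ranked[: stage.keep]
    return candidates, total_cost


def term_overlap(query: str, doc: str) -> float:
    """Cheap feature: number of distinct query terms found in the document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def phrase_bonus(query: str, doc: str) -> float:
    """Stand-in for a richer feature: overlap plus an exact-phrase bonus."""
    bonus = 2.0 if query.lower() in doc.lower() else 0.0
    return term_overlap(query, doc) + bonus
```

With a cheap first stage that keeps three of four documents (assumed cost 1 per document) and an expensive second stage that keeps one (assumed cost 10 per document), the total cost is 1 × 4 + 10 × 3 = 34, compared with 10 × 4 = 40 for applying the expensive scorer to every document. The gap widens as the first stage prunes more aggressively, which is the tradeoff the abstract describes.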

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Jimmy Lin

Sensornet

DOI:
Publication year:
2009
Journal:
Encyclopedia of Database Systems
Impact factor:
0
Authors:
Rodney Topor;Kenneth Salem;Amarnath Gupta;K. Goda;John F. Gehrke;N. Palmer;Mohamed Sharaf;Alexandros Labrinidis;J. Roddick;Ariel Fuxman;Renée J. Miller;Wang;Anastasios Kementsietsidis;Philippe Bonnet;D. Shasha;Ronald Peikert;Bertram Ludäscher;S. Bowers;T. McPhillips;Harald Naumann;K. Voruganti;J. Domingo;Ben Carterette;Panagiotis G. Ipeirotis;Marcelo Arenas;Y. Manolopoulos;Y. Theodoridis;V. Tsotras;B. Carminati;Jan Jurjens;Eduardo B. Fernandez;Murat Kantarcıoǧlu;Jaideep Vaidya;Indrakshi Ray;Athena Vakali;Cristina Sirangelo;E. Pitoura;Himanshu Gupta;Surajit Chaudhuri;G. Weikum;Ulf Leser;David W. Embley;Fausto Giunchiglia;P. Shvaiko;Mikalai Yatskevich;Edward Y. Chang;Christine Parent;S. Spaccapietra;E. Zimányi;G. Anadiotis;S. Kotoulas;Ronny Siebes;Grigoris Antoniou;D. Plexousakis;J. Bailey;François Bry;Tim Furche;Sebastian Schaffert;David Martin;Gregory D. Speegle;Krithi Ramamritham;P. Chrysanthis;Kai;Stéphane Bressan;S. Abiteboul;D. Suciu;G. Dobbie;Tok Wang Ling;Sugato Basu;Ramesh Govindan;Michael H. Böhlen;C. S. Jensen;Jianyong Wang;K. Vidyasankar;A. Chan;Serge Mankovski;S. Elnikety;P. Valduriez;Yannis Velegrakis;Mario A. Nascimento;Michael Huggett;Andrew U. Frank;Yanchun Zhang;Guandong Xu;R. Snodgrass;Alan Fekete;Marcus Herzog;Konstantinos Morfonios;Y. Ioannidis;E. Wohlstadter;M. Matera;F. Schwagereit;Steffen Staab;Keir Fraser;Jingren Zhou;M. Mokbel;Walid G. Aref;Mirella M. Moro;Markus Schneider;Panos Kalnis;Gabriel Ghinita;Michael F. Goodchild;Shashi Shekhar;James Kang;Vijayaprasath Gandhi;Nikos Mamoulis;Betsy George;Michel Scholl;Agnès Voisard;Ralf Hartmut Güting;Yufei Tao;Dimitris Papadias;Peter Revesz;G. Kollios;E. Frentzos;Apostolos N. Papadopoulos;Bernhard Thalheim;Jovan Pehcevski;Benjamin Piwowarski;S. Theodoridis;Konstantinos Koutroumbas;George Karabatis;Don Chamberlin;Philip A. Bernstein;Michael H. Böhlen;J. Gamper;Ping Li;Kazimierz Subieta;S. Harizopoulos;Ethan Zhang;Yi Zhang;Theodore Johnson;Hans;S. Fienberg;Jiashun Jin;Radu Sion;C. Paice;Nikos Hardavellas;Ippokratis Pandis;Edie M. Rasmussen;Hiroshi Yoshida;G. Graefe;Bernd Reiner;Karl Hahn;K. Wada;T. Risch;Jiawei Han;Bolin Ding;Lukasz Golab;Michael Stonebraker;Bibudh Lahiri;Srikanta Tirthapura;Erik Vee;Yanif Ahmad;U. Çetintemel;Mitch Cherniack;S. Zdonik;Mariano P. Consens;M. Lalmas;R. Baeza;D. Hiemstra;Peer Krögerand;Arthur Zimek;Nick Craswell;Carson Kai;Maxime Crochemore;Thierry Lecroq;Arie Shoshani;Jimmy Lin;Hwanjo Yu;David B. Lomet;H. Hinterberger;Ninghui Li;Phillip B. Gibbons;Mouna Kacimi;Thomas Neumann
Corresponding author:
Thomas Neumann

PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information.

DOI:
Publication year:
2001
Journal:
Nucleic Acids Research
Impact factor:
14.9
Authors:
Jiang Qian;Brad Stenger;Cyrus A. Wilson;Jimmy Lin;R. Jansen;S. Teichmann;Jong H. Park;W. G. Krebs;Haiyuan Yu;Vadim Alexandrov;N. Echols;M. Gerstein
Corresponding author:
M. Gerstein