Regularization Methods for Online Learning

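Basic Information

Award number:
0830410
Principal investigator:
Peter Bartlett
Amount:
$300,000
Host institution:
University of California-Berkeley
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2008
Funding country:
United States
Project period:
2008-09-01 to 2011-08-31
Project status:
Completed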

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0830410&HistoricalAwards=false
Keywords:
Regularization Methods Online Learning

Project Abstract

Many sequential decision problems can be appropriately modeled as a repeated game in which the decision-maker competes with an adversary. For instance, in the problem of virus detection in a computer network, the aim is to label incoming packets as either clean or infected, while a hacker aims to design infected packets that escape detection. Similar problems arise in other areas of computer security (including spam filtering and detection of denial-of-service attacks), in internet search (such as deciding whether a highly linked web page is genuinely authoritative and should have high page rank), and in financial applications (such as portfolio optimization). In these problems, the decision-maker aims to perform almost as well as the best element of some comparison class. Even for decision problems that are not inherently adversarial, it is often appealing to model them in this way: the assumptions are weak enough that effective learning algorithms for these adversarial settings are very widely applicable. Many of the key algorithmic approaches to online learning problems can be viewed as methods involving regularization, an idea that has its origins in the solution of ill-posed problems, such as statistical estimation problems. This project aims to exploit this regularization viewpoint in the analysis and design of methods for complex online learning problems. In particular, its aims are (1) to develop techniques for decision problems with limited feedback; (2) to develop techniques for decision problems with complex losses that cannot be simply decomposed into a sum across trials; (3) to develop efficient learning algorithms that can simultaneously compete effectively with a variety of rich comparison classes and a variety of constraints on the adversary; and (4) to improve our understanding of the relationships between online decision problems (in adversarial settings) and statistical decision problems (in probabilistic settings).
Successful research outcomes of this project are likely to increase our understanding of complex sequential decision problems and to provide design methodologies for effective learning algorithms for these problems, and hence have significant potential for practical impact in many application areas, including computer security and computational finance.
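
As a rough illustration of the regularization viewpoint mentioned in the abstract (this is not code from the project), the sketch below implements the exponential weights (Hedge) forecaster, which can be read as follow-the-regularized-leader with an entropy regularizer over a finite set of experts. The function name `hedge`, the synthetic loss sequence, and the choice of learning rate are illustrative assumptions; losses are assumed to lie in [0, 1].

```python
import math


def hedge(loss_rounds, eta):
    """Exponential weights over a finite comparison class of experts.

    loss_rounds: list of per-round loss lists, each loss assumed in [0, 1].
    eta: learning rate (the inverse weight on the entropy regularizer).
    Returns (learner's expected cumulative loss, best expert's cumulative loss).
    """
    n = len(loss_rounds[0])
    cumulative = [0.0] * n  # cumulative loss of each expert so far
    learner_loss = 0.0
    for losses in loss_rounds:
        # Regularized-leader step: weights proportional to
        # exp(-eta * cumulative loss); subtract the minimum for stability.
        shift = min(cumulative)
        weights = [math.exp(-eta * (c - shift)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        # The learner plays the mixture and suffers its expected loss.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return learner_loss, min(cumulative)


# Hypothetical, deterministic loss sequence for three experts.
T, n = 200, 3
rounds = [[((t * (i + 2)) % 5) / 4.0 for i in range(n)] for t in range(T)]
eta = math.sqrt(8.0 * math.log(n) / T)
learner, best = hedge(rounds, eta)
regret = learner - best
# Standard guarantee for this tuning with losses in [0, 1].
bound = math.sqrt(T * math.log(n) / 2.0)
print(f"regret = {regret:.3f}, bound = {bound:.3f}")
```

Here "performing almost as well as the best element of the comparison class" corresponds to the regret, the gap between the learner's cumulative loss and that of the best single expert in hindsight, staying below a bound that grows only like the square root of the number of rounds.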