Center for Automated Learning and Discovery
School of Computer Science, Carnegie Mellon University
No-Regret Algorithms for Structured Prediction Problems
Geoffrey J. Gordon
Keywords: Lagrangian Hedging algorithms, online learning,
multiagent learning, online convex programming, structured
prediction problems, extensive-form games, poker, Blackwell
No-regret algorithms are a popular class of online learning rules.
Unfortunately, most no-regret algorithms assume that the set γ
of allowable hypotheses is small and discrete. We consider instead
prediction problems where γ has internal structure: γ
might be the set of strategies in a game like poker, the set of
paths in a graph, or the set of configurations of a data structure
like a rebalancing binary search tree; or γ might be a given
convex set (the "online convex programming" problem) or, in general, an
arbitrary bounded set. We derive a family of no-regret learning rules,
called Lagrangian Hedging algorithms, to take advantage of this structure.
Our algorithms are a direct generalization of known no-regret learning
rules like weighted majority and external-regret matching. In addition to
proving regret bounds, we demonstrate one
of our algorithms learning to play one-card poker.
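For readers unfamiliar with the discrete setting these algorithms generalize, the following is a minimal sketch of the weighted-majority (Hedge) update mentioned in the abstract, applied to a small finite set of experts. The function name `hedge`, the learning-rate parameter `eta`, and the loss encoding are illustrative choices, not notation from the paper.

```python
import math

def hedge(losses, eta=0.5):
    """Weighted-majority (Hedge) sketch over a finite expert set.

    losses: list of rounds, each a list of per-expert losses in [0, 1].
    Returns (algorithm's total expected loss, best single expert's loss).
    """
    n = len(losses[0])
    weights = [1.0] * n
    total = 0.0
    for round_losses in losses:
        z = sum(weights)
        probs = [w / z for w in weights]
        # Expected loss of sampling an expert from the current distribution.
        total += sum(p * l for p, l in zip(probs, round_losses))
        # Multiplicative update: exponentially downweight lossy experts.
        weights = [w * math.exp(-eta * l)
                   for w, l in zip(weights, round_losses)]
    best = min(sum(l[i] for l in losses) for i in range(n))
    return total, best
```

On a long loss sequence, the gap between the algorithm's total loss and the best expert's loss stays bounded (sublinear regret); the Lagrangian Hedging algorithms extend this guarantee from finite expert sets to the structured hypothesis sets described above.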