CMU-CS-04-161Computer Science Department School of Computer Science, Carnegie Mellon University
CMU-CS-04-161
Martin Zinkevich September 2004 Ph.D. Thesis
CMU-CS-04-161.ps
Keywords: Multi-agent, regret minimization, competitive analysis,
mechanism design, repeated bimatrix games, stochastic games
In the area of auction design, we consider online double auctions (exchanges) for commodities, where there are buyers and sellers trading indistinguishable goods who arrive and depart over time. We consider this from the perspective of the broker, who decides what trades occur and what payments are made. The broker may be interested in maximizing social welfare, maximizing its own profit, or maximizing market liquidity. We devise an algorithm for a broker who does not know what agents will arrive in the future that is within a logarithmic factor of the optimal offline algorithm, if the agents are truthful (although the algorithm does not motivate them to bid truthfully). This is the best competitive ratio that can be achieved. In the simpler bilateral trading problem with one buyer and one seller, we consider strategic agents and study direct mechanisms where both agents are motivated to bid truthfully, partially direct mechanisms where one agent is motivated to bid truthfully and the other agent will strategize, and general mechanisms where both agents will strategize. Agents which have to strategize must have knowledge about the typical values that buyers and sellers have in order to perform well. However, we show that higher efficiency for a broker who does not have knowledge of the domain requires more strategizing on the behalf of the buyer and seller. For instance, the broker can achieve a logarithmic factor of the optimal broker who knows the distributions if the buyer is motivated to be truthful and the seller strategizes. We also consider combinatorial auctions. We show a connection between query learning in machine learning theory and preference elicitation. We show how certain natural hypotheses spaces that can be learned with membership queries correspond to natural classes of preferences that can be learned with value queries. Finally, we consider repeated bimatrix games. One would like two reasonable agents who encounter each other many times in the same setting (e.g. a bimatrix game) to eventually perform well together. We show how a simple gradient ascent technique performs well in a bimatrix game, as well as in an arbitrary online convex programming domain. We introduce the concept of response regret, an extension of traditional concepts of regret, to account for the short-term effects that one agent's actions have on the other agent, which also can be extended to stochastic games and partially observable stochastic games. Two behaviors that minimize internal response regret in a stochastic game or a partially observable stochastic game will approach the set of correlated equilibria. One of the hardest parts of working in a domain with other intelligent agents is defining what it means to perform well and understanding what are the right assumptions to be made about the other agent. It is well known that there exists an algorithm that has no regret against an arbitrary algorithm. Furthermore, it is not hard to show that there exists an algorithm that achieves the minimum Nash equilibrium value against a no-external-regret algorithm. However, we show here that no algorithm can achieve both these guarantees. 156 pages
| |

Return to:
SCS Technical Report Collection This page maintained by reports@cs.cmu.edu |