Computer Science Department
School of Computer Science, Carnegie Mellon University
Coaching: Learning and Using Environment
This thesis is inspired by a simulated robot soccer environment with a coach agent who can provide advice to a team in a standard language. This author developed, in collaboration with others, this coach environment and standard language as the thesis progressed. The experiments in this thesis represent the largest known empirical study in the simulated robot soccer environment. A predator-prey domain and and a moving maze environment are used for additional experimentation. All algorithms are implemented in at least one of these environments and empirical validation is performed.
In addition to the coach problem formulation and decompositions, the thesis makes several main technical contributions: (i) Several opponent model representations with associated learning algorithms, whose effectiveness in the robot soccer domain is demonstrated. (ii) A study of the effects and need for coach learning under various limitations of the advice receiver and communication bandwidth. (iii) The Multi-Agent Simple Temporal Network, a multi-agent plan representation which is refinement of a Simple Temporal Network, with an associated distributed plan execution algorithm. (iv) Algorithms for learning an abstract Markov Decision Process from external observations, a given state abstraction, and partial abstract action templates. The use of the learned MDP for advice is explored in various scenarios.