Computer Science Department
School of Computer Science, Carnegie Mellon University


Tree Based Hierarchical Reinforcement Learning

William T. B. Uther

August 2002

Ph.D. Thesis

Keywords: Reinforcement learning, regression trees, Markov Decision Processes, Semi-Markov Decision Processes, constructive induction, hierarchical reinforcement learning

In this thesis we investigate methods for speeding up automatic control algorithms. Specifically, we provide new abstraction techniques for Reinforcement Learning and Semi-Markov Decision Processes (SMDPs). We introduce the use of policies as temporally abstract actions. This is different from previous definitions of temporally abstract actions as we do not have termination criteria. We provide an approach for processing previously solved problems to extract these policies. We also contribute a method for using supplied or extracted policies to guide and speed up problem solving of new problems. We treat extracting policies as a supervised learning task and introduce the Lumberjack algorithm that extracts repeated sub-structure within a decision tree. We then introduce the TTree algorithm that combines state and temporal abstraction to increase problem solving speed on new problems. TTree solves SMDPs by using both user and machine supplied policies as temporally abstract actions while generating its own tree based abstract state representation. By combining state and temporal abstraction in this way, TTree is the only known SMDP algorithm that is able to ignore irrelevant or harmful subregions within a supplied abstract action while still making use of other parts of the abstract action.
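The central idea of the abstract, a temporally abstract action that is a complete policy with no termination criterion, selected per region of an abstract state space, can be illustrated with a minimal sketch. All names and the toy corridor MDP below are hypothetical, for illustration only; they are not the thesis's TTree implementation, and the coarse state partition here is a hand-built stand-in for the tree-based representation TTree learns.

```python
# Illustrative sketch (names are hypothetical, not from the thesis):
# a temporally abstract action is a full policy with no termination
# condition -- the agent chooses which policy to follow based on the
# abstract state, and simply executes that policy's primitive action.

# Toy 1-D corridor MDP: states 0..9, goal at state 9.
N_STATES = 10
GOAL = 9

def step(state, action):
    """Primitive transition: action is -1 (left) or +1 (right)."""
    return max(0, min(N_STATES - 1, state + action))

# Two "temporally abstract actions": complete policies over all states.
go_right = {s: +1 for s in range(N_STATES)}
go_left = {s: -1 for s in range(N_STATES)}

# A coarse abstract state partition (a hand-built stand-in for the
# tree-based abstract representation TTree would learn on its own).
def abstract_state(s):
    return "low" if s < 5 else "high"

# Each abstract region is assigned one abstract action to follow; a
# learner could assign a different policy to each region, ignoring a
# supplied policy in regions where it is irrelevant or harmful.
region_policy = {"low": go_right, "high": go_right}

def run(start, max_steps=50):
    s, trajectory = start, [start]
    for _ in range(max_steps):
        if s == GOAL:
            break
        policy = region_policy[abstract_state(s)]  # pick abstract action
        s = step(s, policy[s])                     # one primitive step of it
        trajectory.append(s)
    return trajectory

print(run(0))
```

Because the abstract actions never self-terminate, control only switches policies when the agent crosses into a different abstract region, which is what lets a region-level learner use part of a supplied policy while overriding it elsewhere.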

163 pages
