CMU-ML-18-101
Machine Learning Department
School of Computer Science, Carnegie Mellon University




Efficient Methods for Prediction and Control
in Partially Observable Environments

Ahmed Hefny

April 2018

Ph.D. Thesis



Keywords: Dynamical Systems, Recursive Filters, Predictive State, Method of Moments, Reinforcement Learning


State estimation and tracking (also known as filtering) is an integral part of any system performing inference in a partially observable environment, whether it is a robot that is gauging an environment through noisy sensors or a natural language processing system that is trying to model a sequence of characters without full knowledge of the syntactic or semantic state of the text.

In this work, we develop a framework for constructing state estimators. The framework consists of a model class, referred to as predictive state models, and a learning algorithm, referred to as two-stage regression. It is based on two key concepts: (1) predictive state, where our belief about the latent state of the environment is represented as a prediction of future observation features, and (2) instrumental regression, where features of previous observations are used to remove sampling noise from future observation statistics, allowing for unbiased estimation of system dynamics.

These two concepts allow us to develop efficient and tractable learning methods that reduce the unsupervised problem of learning an environment model to a supervised regression problem. First, a regressor removes noise from future observation statistics; then, a second regressor uses the denoised observation features to estimate the dynamics of the environment.
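
The two regression stages described above can be illustrated with a minimal sketch. It is not the implementation from the thesis: it assumes linear ridge regressors, raw observation windows as features, and a toy noisy linear system, and it omits the filtering step that conditions on new observations. All names (`ridge`, `windows`, `simulate`, `two_stage_regression`) and parameter values are hypothetical.

```python
import numpy as np

def ridge(X, Y, lam=1e-3):
    """Return W minimizing ||X W - Y||^2 + lam ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def windows(obs, start, k, n):
    """Stack n windows of length k; row i is obs[start+i : start+i+k]."""
    return np.stack([obs[start + i : start + i + k] for i in range(n)])

def simulate(T, seed=0):
    """Toy system (an assumption): noisy 2-D rotation observed through one coordinate."""
    rng = np.random.default_rng(seed)
    theta = 0.3
    A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    x = np.array([1.0, 0.0])
    obs = np.empty(T)
    for t in range(T):
        obs[t] = x[0] + 0.1 * rng.normal()      # noisy observation
        x = A @ x + 0.1 * rng.normal(size=2)    # noisy latent transition
    return obs

def two_stage_regression(obs, k=3, lam=1e-3):
    """Estimate a linear map from denoised futures to denoised shifted futures."""
    n = len(obs) - 2 * k                 # number of usable time steps
    H = windows(obs, 0, k, n)            # history features: obs[t-k : t]
    F = windows(obs, k, k, n)            # future features: obs[t : t+k]
    S = windows(obs, k + 1, k, n)        # shifted future: obs[t+1 : t+k+1]

    # Stage 1: regress future statistics on history to remove sampling noise.
    F_hat = H @ ridge(H, F, lam)
    S_hat = H @ ridge(H, S, lam)

    # Stage 2: regress the denoised shifted future on the denoised future.
    return ridge(F_hat, S_hat, lam)

obs = simulate(2000)
W = two_stage_regression(obs, k=3)
print(W)  # estimated dynamics in the space of predicted future windows
```

Here the history acts as the instrument: it is correlated with the latent state but not with the noise in the future observations, which is why regressing the futures directly on each other in stage 2 would be biased while regressing their stage-1 predictions is not.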

We show that our proposed framework enjoys a number of theoretical and practical advantages over existing methods, and we demonstrate its efficacy in a prediction setting, where the task is to predict future observations, as well as a control setting, where the task is to optimize a control policy via reinforcement learning.

173 pages

Thesis Committee:
Geoffrey Gordon (Chair)
Martial Hebert
Eric Xing
Byron Boots (Georgia Institute of Technology)

Manuela M. Veloso, Head, Machine Learning Department
Andrew W. Moore, Dean, School of Computer Science

