CMU-CS-24-118
Computer Science Department
School of Computer Science, Carnegie Mellon University



Enhancing Policy Transfer in Action Advising
for Reinforcement Learning

Yue (Sophie) Guo

Ph.D. Thesis

May 2024



Keywords: Reinforcement Learning, Multi-agent System, Transfer Learning, Knowledge Transfer, Robotics, Deep Learning

Just as human students benefit from teachers' advice to accelerate their learning, so can learning agents. Agents can learn not only from humans, e.g., via expert demonstrations, but also from other agents. This thesis focuses on action advising, a knowledge transfer technique built on the teacher-student paradigm in reinforcement learning: the teacher agent provides action advice computed from its own policy, given the student's observations.
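To make the paradigm concrete, here is a minimal sketch of a teacher-student action-advising loop on a toy chain MDP. The function names, the fixed advice budget, and the tabular Q-learning student are illustrative assumptions for this sketch, not methods from the thesis itself.

```python
import random

def teacher_advice(q_teacher, state):
    """The teacher's advice is the greedy action of its own policy,
    evaluated at the student's current observation."""
    return max(q_teacher[state], key=q_teacher[state].get)

def run_student(q_teacher, n_states=5, n_steps=200, budget=50, seed=0):
    """Student learns on a toy chain MDP (reward at the rightmost state);
    for its first `budget` steps it follows the teacher's advice, then
    acts greedily on its own Q-table."""
    rng = random.Random(seed)
    actions = [0, 1]  # 0 = move left, 1 = move right
    q_student = {s: {a: 0.0 for a in actions} for s in range(n_states)}
    state, advice_used = 0, 0
    for _ in range(n_steps):
        if advice_used < budget:
            action = teacher_advice(q_teacher, state)  # follow advice
            advice_used += 1
        else:
            action = max(q_student[state], key=q_student[state].get)
        next_state = (min(state + 1, n_states - 1) if action == 1
                      else max(state - 1, 0))
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # one-step Q-learning update on the student's own experience
        q_student[state][action] += 0.5 * (
            reward + 0.9 * max(q_student[next_state].values())
            - q_student[state][action]
        )
        state = 0 if reward > 0 else next_state  # reset after reaching goal
    return q_student
```

With an optimal teacher whose Q-table always prefers "right", the advised episodes seed positive values along the chain, so the student keeps solving the task after the advice budget runs out. The budget parameter mirrors the "when to advise" question that prior work focuses on.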

Although action advising has been studied over the past decade, related work has focused primarily on when to advise. We extend the current state of the art by studying the following additional challenges: (1) In existing work, advice is given without explaining the rationale behind it, so the student can hardly understand the teacher's decisions or internalize the knowledge needed to generalize the teacher's advice; (2) In many situations, the teacher may be suboptimal in a new environment, yet no current approach enables the student to discern when particular pieces of advice might not be applicable; (3) No present techniques enable the teacher to evaluate the quality of its advice before giving it to the student; (4) If the student interacts in a new environment, the teacher has limited knowledge of it, as it does not collect the student's data; (5) A teacher with a fixed pre-trained policy might not be able to provide flexible advice; (6) Action advising has rarely been applied to human students.

In this thesis, we present solutions to the aforementioned challenges and demonstrate the empirical effectiveness of our proposed methods. We also propose potential pathways for subsequent research. Our ultimate goal is to illuminate the intricacies and potential of action advising in reinforcement learning, thereby guiding future advances in the field.

Additionally, this thesis broadens its scope by examining further applications of transfer learning, showcasing its utility and adaptability in varied contexts beyond action advising. These explorations contribute to a deeper understanding of transfer learning's potential within the broader field of reinforcement learning.

167 pages

Thesis Committee:
Katia Sycara (Chair)
Fei Fang
Zico Kolter
Matthew E. Taylor (University of Alberta)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

