CMU-ML-18-109
Machine Learning Department
School of Computer Science, Carnegie Mellon University



Teaching Machines to Classify from
Natural Language Interactions

Shashank Srivastava

September 2018

Ph.D. Thesis



Keywords: Machine Learning, Natural Language Understanding, Semantic Parsing, Conversational Learning, Grounded Language, Computational Linguistics, Pragmatics, Concept Learning


Humans routinely learn new concepts through natural language communication, even with limited or no labeled examples. For example, a human can learn the concept of a phishing email from natural language explanations such as 'phishing emails often request your bank account number'. Purely inductive learning systems, on the other hand, typically require a large collection of labeled examples to learn such a concept. We believe that advances in computational linguistics and the growing ubiquity of computing devices can together enable people to teach computers classification tasks through natural language interactions.

Learning from language presents some key challenges. A preliminary challenge is learning to interpret language: an agent's ability to map natural language explanations in pedagogical contexts to formal semantic representations that computers can process and reason over. A second challenge is learning from interpretations: the mechanisms through which interpretations of language statements can be used by computers to solve learning tasks in their environment. We address aspects of both of these problems, and provide an interface for guiding concept learning methods using language.

For learning from interpretations, we focus on concept learning (binary classification) tasks. We demonstrate that language can define rich and expressive features for learning tasks, and show that machine learning can benefit substantially from this ability. We also investigate the assimilation of cues in everyday language that implicitly constrain classification models (e.g., 'Most emails are not phishing emails'). In particular, we focus on conditional statements and linguistic quantifiers (such as usually and never), and show that such advice can be used to train classifiers with few or no labeled examples of a concept.
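To make the quantifier idea concrete, the following sketch trains a classifier from quantified statements alone, without any labeled examples. It is a minimal illustration, not the thesis's implementation: the quantifier-to-probability map, the squared-penalty objective, and all names are assumptions made for this sketch. A statement such as 'phishing emails usually request your account number' becomes a soft constraint that the classifier's average prediction over emails containing that feature should be near 0.7.

import numpy as np

# Assumed mapping from linguistic quantifiers to target probabilities.
QUANTIFIER_PROB = {"always": 0.95, "usually": 0.70, "sometimes": 0.50,
                   "rarely": 0.20, "never": 0.05}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def advice_loss(w, X, advice):
    # Each advice item (feature_index, quantifier) penalizes deviation of
    # the mean prediction over matching examples from the target probability.
    p = sigmoid(X @ w)
    loss = 0.0
    for feat, quant in advice:
        mask = X[:, feat] > 0
        if mask.any():
            loss += (p[mask].mean() - QUANTIFIER_PROB[quant]) ** 2
    return loss

def fit(X, advice, lr=0.5, steps=500):
    # Gradient descent with a numerical gradient, to keep the sketch short
    # and dependency-free; note that no labels are used anywhere.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        g = np.zeros_like(w)
        for j in range(len(w)):
            e = np.zeros_like(w)
            e[j] = 1e-5
            g[j] = (advice_loss(w + e, X, advice)
                    - advice_loss(w - e, X, advice)) / 2e-5
        w -= lr * g
    return w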

For learning to interpret, we develop new algorithms for semantic parsing that incorporate pragmatic cues, including conversational history and sensory observation, to improve automatic language interpretation. We show that environmental context can enrich semantic parsing methods, not only by providing discriminative features, but also by reducing the need for the expensive labeled data used to train them.
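One simple way to picture how such cues enter a parser (a hedged sketch; the log-linear form and all feature names here are illustrative assumptions, not the thesis's exact model) is to score each candidate logical form with features drawn both from the parse itself and from the pragmatic context:

def score(weights, parse_feats, context_feats):
    # Log-linear score over the union of parse and context features.
    feats = {**parse_feats, **context_feats}
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

def history_overlap(candidate, history_predicates):
    # Example context feature: the fraction of the candidate parse's
    # predicates that already appeared earlier in the conversation.
    preds = candidate["predicates"]
    if not preds:
        return {"history_overlap": 0.0}
    shared = sum(1 for p in preds if p in history_predicates)
    return {"history_overlap": shared / len(preds)}

def best_parse(weights, candidates, history_predicates):
    return max(candidates,
               key=lambda c: score(weights, c["features"],
                                   history_overlap(c, history_predicates)))

A parse whose denotation also agrees with what the agent currently observes could contribute an analogous feature, which is one way environmental context can stand in for labeled parses during training.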

A separate but immensely valuable attribute of human language is that it is inherently conversational and interactive. We also briefly explore the possibility of agents that learn to interact with a human teacher in a mixed-initiative setting, where the learner can proactively engage the teacher by asking questions, rather than only passively listening. We develop a reinforcement learning framework for learning effective question-asking strategies in the context of conversational concept learning.
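A minimal sketch of such a learner, under a bandit-style simplification that is an assumption of this illustration: on each turn the agent picks a question type as an action, and the reward measures how much the teacher's answer improved the classifier (for instance, the gain in held-out accuracy). The action names and the incremental value update below are illustrative, not the thesis's exact formulation.

import random
from collections import defaultdict

# Hypothetical question types the learner can choose between each turn.
ACTIONS = ["ask_for_label", "ask_about_feature", "ask_yes_no", "just_listen"]

class QuestionPolicy:
    def __init__(self, epsilon=0.1):
        self.q = defaultdict(float)  # value estimate per (state, action)
        self.n = defaultdict(int)    # visit counts
        self.epsilon = epsilon

    def choose(self, state):
        # Epsilon-greedy: mostly exploit the best-looking question type,
        # occasionally explore another one.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward):
        # Incremental mean update of the action-value estimate.
        self.n[(state, action)] += 1
        self.q[(state, action)] += (
            reward - self.q[(state, action)]) / self.n[(state, action)]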

160 pages

Thesis Committee:
Tom Mitchell (Chair)
Taylor Berg-Kirkpatrick
William Cohen
Dan Roth (University of Pennsylvania)

Roni Rosenfeld, Head, Machine Learning Department
Andrew W. Moore, Dean, School of Computer Science

