Machine Learning Department
School of Computer Science, Carnegie Mellon University


The Time and Location of Natural Reading
Processes in the Brain

Leila Wehbe

August 2015

Ph.D. Thesis


Keywords: Functional NeuroImaging, Predicting Brain Activity, Language Processing, FMRI, MEG, Naturalistic Experiments, Story Reading, Hypothesis Testing, Conditional Independence Testing

How is information organized in the brain during natural reading? Where and when do the required processes occur, such as the perception of individual words and the construction of sentence meanings? How are semantics, syntax and higher-level narrative structure represented? Answering these questions is core to understanding how the brain processes language and organizes complex information. However, due to the complexity of language processing, most brain imaging studies focus on only one of these questions, using highly controlled stimuli that may not generalize beyond the experimental setting.

This thesis proposes an alternative framework to study language processing. We acquire data using a naturalistic reading paradigm, annotate the presented text using natural language processing tools and predict brain activity with machine learning techniques. Finally, statistical testing is used to form rigorous conclusions. We also suggest the use of direct non-parametric hypothesis tests that do not rely on any model assumptions, and therefore do not suffer from model misspecification.
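As a rough illustration of the predict-then-test idea in this framework, the sketch below fits a linear model from text-derived features to recorded activity and then checks held-out prediction accuracy against a permutation null. It is a minimal sketch on synthetic data, not the pipeline used in the thesis: the choice of ridge regression, the function names (`fit_ridge`, `mean_correlation`, `permutation_p_value`, `simulate`) and all parameter values are assumptions made for the example. The simple row shuffle also ignores the temporal autocorrelation present in real fMRI and MEG recordings.

```python
import numpy as np

def fit_ridge(X, Y, lam):
    """Closed-form ridge regression: W = (X'X + lam*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def mean_correlation(Y_true, Y_pred):
    """Average Pearson correlation across output channels (e.g. voxels or sensors)."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    num = (yt * yp).sum(axis=0)
    den = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return float(np.mean(num / den))

def permutation_p_value(X_train, Y_train, X_test, Y_test,
                        lam=1.0, n_perm=200, seed=0):
    """Score held-out predictions, then compare against a shuffled-label null."""
    rng = np.random.default_rng(seed)
    W = fit_ridge(X_train, Y_train, lam)
    Y_pred = X_test @ W
    observed = mean_correlation(Y_test, Y_pred)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Break the pairing between stimulus features and recorded activity.
        perm = rng.permutation(len(Y_test))
        null[i] = mean_correlation(Y_test[perm], Y_pred)
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)

def simulate(n_samples, W_true, noise_sd, rng):
    """Hypothetical stand-in for annotated text features and brain recordings."""
    X = rng.normal(size=(n_samples, W_true.shape[0]))
    Y = X @ W_true + noise_sd * rng.normal(size=(n_samples, W_true.shape[1]))
    return X, Y

rng = np.random.default_rng(0)
W_true = rng.normal(size=(5, 10))          # 5 features, 10 channels
X_train, Y_train = simulate(200, W_true, 1.0, rng)
X_test, Y_test = simulate(100, W_true, 1.0, rng)
score, p = permutation_p_value(X_train, Y_train, X_test, Y_test)
print(f"held-out correlation = {score:.2f}, permutation p = {p:.3f}")
```

The permutation null is used here because it makes no distributional assumptions about the prediction scores, which is in the spirit of the non-parametric testing mentioned above; a real analysis would need a shuffling scheme that respects the time-series structure of the recordings.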

Using our framework, we construct a brain reading map from functional magnetic resonance imaging data of subjects reading a chapter of a popular book. This map shows the regions that our model reveals to represent syntactic, semantic, visual and narrative information. Using this single experiment, our approach replicates many results from a wide range of classical studies that each focus on one aspect of language processing.

We extend our brain reading map to include temporal dynamics as well as spatial information by using magnetoencephalography. We obtain a spatio-temporal picture of how successive words are processed by the brain. We show the progressive perception of each word in a posterior-to-anterior fashion. For each region along this pathway, we show a differentiation of the word properties that best explain its activity.

179 pages

Thesis Committee:
Tom Mitchell (Chair)
Eduard Hovy
Cosma Shalizi
Jack Gallant (University of California, Berkeley)

Brian Murphy (Queen's University Belfast)

Tom M. Mitchell, Head, Machine Learning Department
Andrew W. Moore, Dean, School of Computer Science
