MACHINE LEARNING TECHNICAL REPORT ABSTRACTS

CMU-ML-15-104
Machine Learning Department
School of Computer Science, Carnegie Mellon University

The Time and Location of Natural Reading Processes in the Brain

Leila Wehbe

August 2015

Ph.D. Thesis

CMU-ML-15-104.pdf

Keywords: Functional NeuroImaging, Predicting Brain Activity, Language Processing, fMRI, MEG, Naturalistic Experiments, Story Reading, Hypothesis Testing, Conditional Independence Testing

How is information organized in the brain during natural reading? Where and when do the required processes, such as the perception of individual words and the construction of sentence meanings, occur? How are semantics, syntax and higher-level narrative structure represented? Answering these questions is core to understanding how the brain processes language and organizes complex information. However, due to the complexity of language processing, most brain imaging studies focus on only one of these questions, using highly controlled stimuli that may not generalize beyond the experimental setting.

This thesis proposes an alternative framework to study language processing. We acquire data using a naturalistic reading paradigm, annotate the presented text using natural language processing tools and predict brain activity with machine learning techniques. Finally, statistical testing is used to form rigorous conclusions. We also suggest the use of direct non-parametric hypothesis tests that do not rely on any model assumptions, and therefore do not suffer from model misspecification.

Using our framework, we construct a brain reading map from functional magnetic resonance imaging data of subjects reading a chapter of a popular book. This map shows the regions that our model reveals to represent syntactic, semantic, visual and narrative information. Using this single experiment, our approach replicates many results from a wide range of classical studies that each focus on one aspect of language processing.
We extend our brain reading map to include temporal dynamics as well as spatial information by using magnetoencephalography. We obtain a spatio-temporal picture of how successive words are processed by the brain. We show the progressive perception of each word in a posterior to anterior fashion. For each region along this pathway we show a differentiation of the word properties that best explain its activity.

179 pages

Thesis Committee:
Tom Mitchell (Chair)
Eduard Hovy
Cosma Shalizi
Jack Gallant (University of California, Berkeley)
Brian Murphy (Queen's University Belfast)

Tom M. Mitchell, Head, Machine Learning Department
Andrew W. Moore, Dean, School of Computer Science
