Machine Learning Department
School of Computer Science, Carnegie Mellon University
Researchers have discovered many successful algorithms and methodologies for
solving problems at the intersection of machine learning and education
research. This umbrella category, "educational data mining," has enjoyed a
series of successes that span the research process, from post-hoc data
analysis that generates models to the use of those models in successful
However, most of these successes have arisen from the use of pre-existing
psychological and educational constructs (e.g., guessing) and thus from
the use of semi-supervised or fully supervised machine learning algorithms.
Algorithms for novel discovery, also known as unsupervised clustering, have
enjoyed significantly fewer successes in this domain, partially because
education data exhibit unique, complex structure.
This thesis combines algorithm development, simulation, and experimentation on real-world data, all designed to define and test a novel paradigm for clustering in education (and a range of other domains). This paradigm, target clustering, revolves around the inclusion of high-level targets, such as student learning from pre-test to post-test. It differs from existing machine learning approaches in that it is designed entirely, from initial concept to final execution, for solving educational research problems, and it takes advantage of the structural complexities that are problematic for other algorithms. The thesis includes a range of data sets drawn from a variety of research domains but no new data from experiments in the psychological sense. It does, however, include analysis of methodology, results, and implications from an educational research perspective, and it relies entirely on education data and research problems.