CMU-CS-03-147
Computer Science Department
School of Computer Science, Carnegie Mellon University
A Markov Model for the
Acquisition of Morphological Structure
Leonid Kontorovich, Dana Ron*, Yoram Singer**
June 2003
Keywords: Morphology, Markov, probabilistic suffix tree
We describe a new formalism for word morphology. Our model
views word generation as a random walk on a trellis of units
where each unit is a set of (short) strings. The model
naturally incorporates segmentation of words into morphemes.
We capture the statistics of unit generation using a
probabilistic suffix tree (PST), a variant of variable-length
Markov models. We present an efficient algorithm that learns
a PST over the units; its output is a compact stochastic
representation of morphological structure. We demonstrate the
applicability of our approach by using the model in an allomorphy
decision problem.
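
To illustrate the variable-length Markov idea behind a PST, here is a minimal sketch. It is not the learning algorithm of the report: it works on single characters rather than units (sets of strings), does no pruning, and simply backs off to the longest context seen in training. The function names, the `max_depth` parameter, and the `$` end-of-word marker are assumptions made for this example only.

```python
from collections import defaultdict


def train_counts(words, max_depth=2):
    """Count next-symbol occurrences for every context of length 0..max_depth."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        padded = word + "$"  # "$" is a hypothetical end-of-word marker
        for i, symbol in enumerate(padded):
            # Record the symbol under each preceding context, shortest first.
            for d in range(0, min(i, max_depth) + 1):
                context = padded[i - d:i]
                counts[context][symbol] += 1
    return counts


def next_symbol_prob(counts, history, symbol, max_depth=2):
    """Estimate P(symbol | history) from the longest suffix of history seen in training."""
    for d in range(min(len(history), max_depth), -1, -1):
        context = history[len(history) - d:]
        if context in counts:
            total = sum(counts[context].values())
            return counts[context][symbol] / total
    return 0.0  # reached only when nothing was trained


# Toy data, invented for illustration.
counts = train_counts(["walked", "talked", "walks"])
p_d = next_symbol_prob(counts, "walke", "d")  # uses context "ke"
p_e = next_symbol_prob(counts, "walk", "e")   # uses context "lk"
```

The context length used for a prediction varies with the history: a history ending in an unseen pair of characters falls back to a one-character context, and then to the empty context. A real PST would additionally prune contexts that do not change the next-symbol distribution enough to be worth keeping.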
18 pages
* Tel-Aviv University
** Hebrew University