Computer Science Qatar
School of Computer Science, Carnegie Mellon University
The Qatar Arabic Language Bank Guidelines
Wajdi Zaghouani*, Nizar Habash**, Behrang Mohit*
The Qatar Arabic Language Bank (QALB) is a corpus of Arabic text with manual corrections. The text comes from three sources: native speakers, non-native speakers, and machine translation into Arabic. The corpus consists mainly of Modern Standard Arabic (MSA), although some dialectal Arabic usage may occur. The annotation has two goals: to provide training data for learning-based Arabic error correction tools, and to provide a gold standard for evaluating error correction algorithms. This document serves as the reference guidelines for text correction in the QALB project.