Abstract
Chunking, or shallow syntactic parsing, is a task of interest to many natural language processing applications. The task is harder for Arabic, whose specific features make it quite different from other natural languages and more ambiguous to process. In this paper, we present a method for chunking Arabic texts based on supervised learning. We use the Conditional Random Fields algorithm and the Penn Arabic Treebank to train the model. For the experiments, we use more than 10,100 sentences as training data and 2,524 sentences for testing. The evaluation consists of calculating the accuracy of the generated model, and the results are very encouraging.
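
To make the task concrete, the sketch below frames chunking as sequence labeling and computes a token-level accuracy. It is only an illustration under stated assumptions: the abstract does not specify the tag encoding or the exact accuracy definition, so the IOB scheme, the chunk labels, the transliterated example sentence, and the helper names `chunks_to_iob` and `token_accuracy` are all hypothetical and not taken from the paper or the Penn Arabic Treebank.

```python
from typing import List, Tuple

# A chunk is (label, tokens). The label "O" marks tokens outside any chunk.
Chunk = Tuple[str, List[str]]


def chunks_to_iob(chunks: List[Chunk]) -> List[Tuple[str, str]]:
    """Flatten labeled chunks into (token, IOB tag) pairs.

    Assumption: IOB encoding, where B- opens a chunk and I- continues it.
    """
    tagged = []
    for label, tokens in chunks:
        for i, token in enumerate(tokens):
            if label == "O":
                tag = "O"
            else:
                tag = ("B-" if i == 0 else "I-") + label
            tagged.append((token, tag))
    return tagged


def token_accuracy(gold: List[str], predicted: List[str]) -> float:
    """Share of tokens whose predicted tag equals the gold tag.

    Assumption: accuracy is measured per token, not per chunk.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Illustrative transliterated sentence ("the little boy went to school").
EXAMPLE: List[Chunk] = [
    ("VP", ["dhahaba"]),
    ("NP", ["al-waladu", "al-saghiru"]),
    ("PP", ["ila"]),
    ("NP", ["al-madrasati"]),
]

gold_tags = [tag for _, tag in chunks_to_iob(EXAMPLE)]
# A hypothetical model output that wrongly opens a new NP at the third token.
predicted_tags = ["B-VP", "B-NP", "B-NP", "B-PP", "B-NP"]
print(gold_tags)
print(token_accuracy(gold_tags, predicted_tags))
```

In a supervised setup of the kind the abstract describes, a sequence model such as a CRF would be trained to predict the tag column from token features, and a measure like `token_accuracy` would be computed on the held-out test sentences.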

Nabil Khoufi, Chafik Aloulou, Lamia Hadrich Belguith. (2014) Arabic Text Chunking Using a CRF Model, Conference on Language and Technology 2014.
Publisher: Center for Language Engineering
Country: Pakistan
City: Karachi
From: 13-11-2014
To: 15-11-2014