Center for Language Engineering
Publisher
Pakistan
Country
Lahore
City
09-11-2012
Start date
10-11-2012
End date


Abstract
Part-of-speech (POS) tagging is the process of assigning a unique grammatical tag to every word in a sentence. A POS tagset is a primary requirement of the tagging process. This paper discusses the grammatical classes of Sindhi with reference to POS tagset design and tagging. Issues such as tagset design considerations, tagset size and granularity, and part-of-speech types, subtypes, and their attributes for tagging are discussed in detail. General guidelines are given for designing a Sindhi POS tagset of any size and granularity. Obligatory and proposed tagsets for Sindhi are presented, which provide a basis for further research in part-of-speech tagging, tagged corpora, chunking, syntax analysis, information retrieval, part-of-speech usage analysis, and other natural language processing applications.

Mutee U Rahman (2012). Developing a Part of Speech Tagset for Sindhi. Conference on Language and Technology 2012.