Abstract
Optical Character Recognition (OCR) is the
process of converting printed, handwritten, and
typewritten text into its equivalent machine-readable form.
Scanning and comparison techniques are used
to recognize printed text or numerical data. Once the
scanned document is converted into machine-readable
form, the text can be used in different
applications, just like any other machine-readable text. This
saves time by avoiding the retyping of already printed
material for data entry. OCR software attempts to identify
characters by comparing their shapes to those stored in the
software's library. The discipline of OCR is an offspring
of Pattern Recognition, Artificial Intelligence, and
Computer Vision. The cursively connected characters of
Arabic script make the recognition of
Urdu text more difficult than that of a language
such as English, whose characters remain isolated when
forming a word. This paper analyzes
eight years of research papers (2002 to 2009) on Urdu OCR
to show the efforts made toward the
development of offline Urdu OCR, covering both
its history and future work.
Naila Fareen, Attash Durrani, Mohammad Abid Khan. (2012) Survey of Urdu OCR: An Offline Approach, Conference on Language and Technology 2012.