A novel approach for recognition of handwritten compound Bangla characters, along with the Basic characters of Bangla alphabet, is presented here. Compared to English like Roman script, one of the major stumbling blocks in Optical Character Recognition (OCR) of handwritten Bangla script is the large number of complex shaped character classes of Bangla alphabet. In addition to 50 basic character classes, there are nearly 160 complex shaped compound character classes in Bangla alphabet. Dealing with such a large varieties of handwritten characters with a suitably designed feature set is a challenging problem. Uncertainty and imprecision are inherent in handwritten script. Moreover, such a large varieties of complex shaped characters, some of which have close resemblance, makes the problem of OCR of handwritten Bangla characters more difficult. Considering the complexity of the problem, the present approach makes an attempt to identify compound character classes from most frequently to less frequently occurred ones, i.e., in order of importance. This is to develop a frame work for incrementally increasing the number of learned classes of compound characters from more frequently occurred ones to less frequently occurred ones along with Basic characters. On experimentation, the technique is observed produce an average recognition rate of 79.25 after three fold cross validation of data with future scope of improvement and extension.
updated: Tue Feb 23 2010 06:44:32 GMT+0000 (UTC)
published: Mon Feb 22 2010 02:58:49 GMT+0000 (UTC)