Spell Detection and Correction in Independent System (Punjabi and Hindi)
Various systems are available for spell checking in the Punjabi language, but few offer spell correction for Hindi. This paper presents a system that detects and corrects spelling errors in both Punjabi and Hindi text. The input is a paragraph that may contain misspelled words, and the output is the same text with those errors corrected. The system uses a hybrid approach to misspelling detection and correction that combines a “database approach”, a “modified rule based approach”, a “Statistical Machine Approach up to n grams” and an “Edit Distance approach”, together with linguistic features of both Punjabi and Hindi. It detects and corrects both Typographic and Cognitive errors. Corpora are essential to this approach. A corpus of Punjabi word entities was developed, covering the names of males, females, countries, locations, states, rivers and places, as well as grammatical words from a Punjabi dictionary; a corresponding corpus was created for Hindi word entities, covering the same categories and grammatical words from a Hindi dictionary. The corpora are built algorithmically. When a paragraph is supplied, the system offers two options: insert the text into the corpora, or proceed directly to spell checking. The Punjabi and Hindi corpus tables are linked. Punjabi text is inserted into the table tbpbidict, and Hindi text is inserted into the table tbhindict. The proposed system therefore also acts as a language detector.
Keywords - Typographic Error, Cognitive Error, Statistical Machine Approach
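
As an illustration only, the following minimal sketch shows how two of the components named above, language detection and the Edit Distance approach, could fit together. It is not the implementation of the proposed system. The table names tbpbidict and tbhindict come from the abstract; everything else is an assumption made for the example: detecting the language by counting characters in the Gurmukhi and Devanagari Unicode blocks, the function names detect_language, edit_distance and suggest, the distance threshold of 2, and the small in-memory word set standing in for a corpus table. Distances are computed over Unicode code points, so a dependent vowel sign counts as one character.

```python
from typing import Iterable, List, Optional

# Unicode block ranges (inclusive) for the two scripts.
GURMUKHI = (0x0A00, 0x0A7F)    # script used for Punjabi
DEVANAGARI = (0x0900, 0x097F)  # script used for Hindi

# Corpus table names taken from the abstract.
TABLE_FOR = {"punjabi": "tbpbidict", "hindi": "tbhindict"}


def detect_language(text: str) -> Optional[str]:
    """Guess the language from the dominant script (an assumed heuristic)."""
    counts = {"punjabi": 0, "hindi": 0}
    for ch in text:
        cp = ord(ch)
        if GURMUKHI[0] <= cp <= GURMUKHI[1]:
            counts["punjabi"] += 1
        elif DEVANAGARI[0] <= cp <= DEVANAGARI[1]:
            counts["hindi"] += 1
    if counts["punjabi"] == 0 and counts["hindi"] == 0:
        return None  # neither script is present
    return "punjabi" if counts["punjabi"] >= counts["hindi"] else "hindi"


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over code points, using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest(word: str, lexicon: Iterable[str], max_distance: int = 2) -> List[str]:
    """Return lexicon words within max_distance of word, closest first."""
    words = set(lexicon)
    if word in words:
        return [word]  # already a known word, nothing to correct
    scored = sorted((edit_distance(word, w), w) for w in words)
    return [w for d, w in scored if d <= max_distance]


if __name__ == "__main__":
    hindi_words = {"भारत", "नदी"}  # hypothetical stand-in for tbhindict
    token = "भरत"                  # misspelling: the vowel sign is missing
    language = detect_language(token)
    print(language, TABLE_FOR[language], suggest(token, hindi_words))
```

In this sketch a token is first routed to a table by script and then compared against that table's words; a full system would additionally apply the rule based and n-gram components to rank the candidates in context.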