Many organizations lose expensive man-hours on document classification. Institutes small and large waste both time and manpower sorting the documents produced by students. To address this problem, we propose a system that intelligently classifies raw documents and stores them so that they can be accessed easily, regardless of time and place. The system takes any number of reports in document form and uses text extraction to capture the relevant parts of the text. These parts are then used for text classification, where the system can apply categorization or classification together with enhancements such as collaborative filtering and latent semantic analysis to categorize a document when a domain conflict occurs. The system will help organizations store their data electronically, saving much of the manpower currently spent on physically storing, classifying, accessing, and maintaining records.
It will also make reviewing and tracking documents easier, which in turn makes the organization more efficient.
Keywords - Artificial Intelligence, Categorization, Classification, Domain, Query Processing, Storage, Text Extraction, Latent Semantic Analysis