Paper Title
Scalable Data Extraction from Business Documents Using Cloud-Based OCR Technology
Abstract
Optical Character Recognition (OCR) technology has improved substantially and now delivers the quality and efficiency needed for many business applications. AWS Textract is one OCR solution that stands out for its robust, cloud-based AI capabilities for extracting text from images and handwritten notes. This paper reports the development of a mobile and web-based application that integrates OCR functions using AWS Textract. The application uses PHP for backend operations and MySQL as the database that stores the extracted information. JavaScript, HTML, and CSS implement the front-end user interface. This combination of technologies provides a scalable, efficient, and reliable OCR solution intended for enterprise use. The project is aimed at developers who want to add OCR capabilities to their applications while optimizing performance and scalability.
Keywords - OCR, Tesseract, AWS Textract, Intelligent Text Recognition