Handwritten Text Into A Word Document Using Computer Vision And Deep Learning


Authors: Assistant Professor P. Kamakshi Thai, A. Pruthvi, R. Akshith, V. Stephen Moses

 

Abstract: This project presents a deep learning-based system that converts handwritten notes into fully editable Microsoft Word documents. Rather than the traditional combination of Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory (BiLSTM) networks, the approach uses Microsoft’s TrOCR, a transformer-based OCR model designed for handwritten text recognition. Because TrOCR is pre-trained and accessible via Hugging Face, it enhances transcription accuracy and removes much of the complexity of training a custom model. The process begins with image preprocessing to improve input quality before text extraction. TrOCR then transcribes the handwritten text, using attention mechanisms to model contextual dependencies within sequences. After this initial transcription, a post-processing module corrects spelling and grammar to improve readability. The final structured output is formatted and automatically saved as a .docx file, ready for use with document generation tools. By employing a state-of-the-art transformer-based OCR model, the system produces readable and reliable transcriptions, making it suitable for applications in education, legal documentation, healthcare records, and general document imaging. The move to TrOCR removes the need for complex recurrent architectures and yields a streamlined, efficient recognition pipeline.
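The pipeline described in the abstract (preprocess, recognize with TrOCR, post-process, save as .docx) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the checkpoint name `microsoft/trocr-base-handwritten`, the function names, and the whitespace-only `clean_text` step (a stand-in for the spell and grammar correction module) are all hypothetical choices. The sketch also assumes each input image already contains a single line of handwriting, and it writes a bare-bones .docx with the Python standard library; a library such as python-docx would be a more typical choice in practice.

```python
import zipfile
from typing import Callable, Iterable, List
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
CONTENT_TYPES = (
    XML_DECL
    + '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)
RELS = (
    XML_DECL
    + '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)


def load_trocr_recognizer(
    checkpoint: str = "microsoft/trocr-base-handwritten",  # assumed checkpoint
) -> Callable[[str], str]:
    """Return a function that transcribes one text-line image with TrOCR.

    Third-party imports are kept local so the rest of the sketch runs
    without Pillow or transformers installed.
    """
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    def recognize(image_path: str) -> str:
        # Minimal preprocessing: TrOCR expects an RGB image.
        image = Image.open(image_path).convert("RGB")
        pixel_values = processor(images=image, return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values)
        return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return recognize


def clean_text(text: str) -> str:
    """Placeholder post-processing: only collapses whitespace.

    A real system would plug spell and grammar correction in here.
    """
    return " ".join(text.split())


def write_docx(paragraphs: Iterable[str], out_path: str) -> None:
    """Write each string as one paragraph of a minimal .docx package."""
    body = "".join(
        '<w:p><w:r><w:t xml:space="preserve">' + escape(p) + "</w:t></w:r></w:p>"
        for p in paragraphs
    )
    document = (
        XML_DECL
        + f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", RELS)
        zf.writestr("word/document.xml", document)


def transcribe_to_docx(
    image_paths: Iterable[str], out_path: str, recognize: Callable[[str], str]
) -> List[str]:
    """Recognize each line image, clean the text, and save it as a .docx."""
    lines = [clean_text(recognize(path)) for path in image_paths]
    write_docx(lines, out_path)
    return lines
```

With Pillow, PyTorch, and transformers installed, a call such as `transcribe_to_docx(["line1.png", "line2.png"], "notes.docx", load_trocr_recognizer())` would run the full path (the file names here are illustrative). Passing the recognizer as a parameter keeps the OCR model separate from the document-writing step, so either part can be swapped or tested on its own.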

DOI: 10.61137/ijsret.vol.11.issue3.151

 
