Lightweight Retrieval-Augmented Generation System For CPU-Only Document Question Answering

Authors: Pratik Halnor, Om Kale, Vishal Gore, Abhishek Kahar, Devyani Jadhav

Abstract: Retrieval-Augmented Generation (RAG) improves the factual accuracy of Large Language Models by grounding responses in external documents. However, most existing systems rely on dense embeddings, vector databases, and GPU-based computation, making them unsuitable for low-resource environments. This paper presents a lightweight RAG system designed specifically for CPU-only environments. The system extracts text from PDFs using PyMuPDF, applies Optical Character Recognition (OCR) using Tesseract, and then selects relevant passages with a keyword-based retrieval mechanism. The retrieved context is passed to a language model API for response generation. Experimental evaluation shows that the system achieves an accuracy of 83.3% with an average response time of approximately 2.2 seconds. The results indicate that efficient document intelligence systems can be developed without heavy computational requirements.
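
The abstract does not specify how the keyword-based retrieval scores passages, so the following is only a minimal sketch under stated assumptions: text is split into fixed-size word chunks, each chunk is scored by how often the query's non-stopword terms occur in it, and the top chunks are joined into a prompt. The function names (`tokenize`, `chunk_text`, `retrieve`, `build_prompt`), the stopword list, and the chunk size are all hypothetical and are not taken from the paper. PDF extraction, OCR, and the language model API call are deliberately left out so the sketch runs on the standard library alone.

```python
import re
from collections import Counter

# Hypothetical stopword list; the paper does not state which words are ignored.
STOPWORDS = {"the", "a", "an", "of", "is", "in", "to", "and", "for",
             "what", "how", "does", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens minus stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def chunk_text(text, size=50):
    """Split text into consecutive chunks of at most `size` words (assumed scheme)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks ranked by query-term frequency (assumed scoring)."""
    query_terms = set(tokenize(query))
    scored = []
    for index, chunk in enumerate(chunks):
        counts = Counter(tokenize(chunk))
        score = sum(counts[term] for term in query_terms)
        if score > 0:  # drop chunks that share no keyword with the query
            scored.append((score, index, chunk))
    # Highest score first; ties keep original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:top_k]]


def build_prompt(query, context_chunks):
    """Assemble the retrieved context and the question into a single prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

In a full pipeline, the input to `chunk_text` would come from the extraction and OCR stage, and the string returned by `build_prompt` would be sent to the language model API. Because scoring uses only token counts, no embeddings, vector database, or GPU are involved, which is consistent with the CPU-only goal described above.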
