Autopapermine: Research Paper Information Extractor

Authors: Jinta Johnson, Assistant Professor Athira B, Professor Dr. Shine Raj G

Abstract: This paper presents a lightweight, intelligent system for automatically extracting structured information from academic research papers in PDF format. The system combines Natural Language Processing (NLP) techniques, TF-IDF-based summarization, and Sentence-BERT semantic similarity to extract and analyze metadata such as the title, authors, organizations, keywords, and references. Built with Python and Streamlit, the tool lets users upload PDF documents, parse their academic content, and interactively review the summarized metadata, references, and semantic relevance, all in real time. The paper details the system architecture, implementation pipeline, challenges, and experimental results, demonstrating the system's effectiveness and outlining the scope for future enhancements.
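
The abstract names TF-IDF-based summarization without showing how sentences are scored. As an illustration only, the following is a minimal sketch of extractive summarization by TF-IDF, written with the Python standard library. The function names (`split_sentences`, `tokenize`, `tfidf_summarize`), the naive sentence splitter, and the smoothed IDF formula are assumptions made for this example and are not taken from the paper's implementation; the Sentence-BERT similarity step is not covered here.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stop-word removal in this sketch.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summarize(text, top_k=2):
    """Return the top_k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n == 0:
        return []

    tokenized = [tokenize(s) for s in sentences]

    # Document frequency: each sentence is treated as one "document".
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Smoothed IDF, log((1 + n) / (1 + df)) + 1, so every weight is positive.
        total = sum(
            (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        )
        # Average over distinct terms so long sentences are not favoured by length alone.
        scores.append(total / len(tf))

    # Pick the best-scoring sentence indices, then restore document order.
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

Calling `tfidf_summarize(extracted_text, top_k=3)` on text pulled from a PDF would return up to three sentences in reading order. A production pipeline would more likely use a library vectorizer and a proper sentence tokenizer; this sketch only shows the scoring idea.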
