Data Forge: Shape Your Data into Clarity


Authors: Lohitha Lakshmi K, Hema Sri S, Shaik Reshma, Hima Sai Nandhan P, Manoj Kumar Reddy S D V

Abstract: Data plays a key role in analysis and machine learning, but real-world datasets often contain missing values, duplicate entries, inconsistencies, and noise that can reduce the accuracy of results. Data cleaning is therefore an essential step, yet it is time-consuming and often requires programming knowledge, which makes it inconvenient for many users. In this work, we present DataForge, a data preprocessing system designed to make the cleaning process simpler and more accessible. The platform lets users upload datasets and perform cleaning operations without writing code, combining statistical methods with simple intelligent techniques to handle missing data, outliers, and duplicate records. Overall, DataForge focuses on reducing the effort required for data preparation while helping users work with more reliable datasets and understand their data without needing deep technical detail.
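To make the three cleaning operations named in the abstract concrete, the following is a minimal Python sketch using only the standard library. It is not DataForge's implementation: the function name `clean_rows`, the choice of median imputation, and the 1.5 × IQR outlier rule are all assumptions chosen as common statistical defaults.

```python
from statistics import median, quantiles


def clean_rows(rows, column):
    """Hypothetical helper (not from DataForge): deduplicate rows, fill
    missing values in `column` with the median, and flag IQR outliers."""
    # 1. Drop exact duplicate rows, keeping the first occurrence.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    # 2. Median imputation (assumed strategy): None marks a missing value.
    observed = [r[column] for r in unique if r[column] is not None]
    fill = median(observed)
    for r in unique:
        if r[column] is None:
            r[column] = fill

    # 3. Flag outliers with the 1.5 * IQR rule (assumed strategy), using
    #    quartiles computed from the originally observed values.
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    for r in unique:
        r["is_outlier"] = not (low <= r[column] <= high)

    return unique
```

Calling `clean_rows(rows, "age")` on a list of dictionaries returns a new list with duplicates removed, gaps in `age` filled, and an `is_outlier` flag on every row. Outliers are flagged rather than dropped so that a user can review them before deciding what to do.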
