A Survey on Machine Learning Handling Imbalanced Dataset in Credit Card Fraud

Authors: Pawan Panchole, Rajesh Dhakad

Abstract- In the era of digital transactions, people prefer online payments and purchases because they save time, travel, and so on. With the growth of e-commerce, credit card fraud has also increased significantly, as fraudsters exploit card and online payment information for fraudulent purposes. Imbalanced datasets and high-dimensional data are the key issues observed in credit card fraud detection. This survey reviews machine learning algorithms used to identify anomalies in credit card transactions, focusing on the problems of dataset imbalance and dimensionality reduction. It investigates the impact of imbalanced datasets on PCA-based fraud detection and details techniques for handling them, such as Random Oversampling, SMOTE, and Random Undersampling, along with various classification and anomaly detection methods. Additionally, given the labelled nature of the dataset, supervised methods such as Logistic Regression, Random Forests, and Decision Trees are reviewed. The study analyses and compares the performance of these methods before and after applying PCA and addressing data imbalance to assess their effectiveness in detecting credit card fraud.
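
The three resampling techniques named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names (`random_oversample`, `random_undersample`, `smote_like`), the toy data, and the parameter `k` are all assumptions made for illustration, and `smote_like` only captures the core interpolation idea of SMOTE rather than the full algorithm. In practice, a library such as imbalanced-learn would typically be used instead.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority_label, k=2, seed=0):
    """Add synthetic minority rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core idea of SMOTE).
    Assumes the minority class has at least two rows."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority = [x for x, lab in zip(X, y) if lab == minority_label]
    n_new = max(counts.values()) - counts[minority_label]
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority_label)
    return X_out, y_out


# Toy data (hypothetical): 8 legitimate transactions (0) and 2 fraudulent ones (1).
X = [[0.1 * i, 1.0] for i in range(8)] + [[5.0, 5.0], [6.0, 6.0]]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
X_smote, y_smote = smote_like(X, y, minority_label=1)

print(Counter(y_over), Counter(y_under), Counter(y_smote))
```

Oversampling and the SMOTE-style variant both grow the minority class to the size of the majority class, while undersampling shrinks the majority class instead; which is preferable depends on how much legitimate-transaction data one can afford to discard. Resampling is normally applied to the training split only, so that evaluation reflects the original class distribution.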

DOI: 10.61137/ijsret.vol.10.issue6.391