Authors: Sagar Kumar, Harish Dutt Sharma, Ram Bhawan Singh
Abstract: Phishing attacks have emerged as one of the most significant cybersecurity threats: attackers create fraudulent websites that mimic legitimate platforms in order to steal users' sensitive information. Traditional rule-based and blacklist-based detection techniques are often ineffective against newly created phishing websites. This paper proposes a machine learning-based phishing website detection system that uses multiple classification algorithms to identify malicious URLs. The system extracts URL-based and domain-based features such as URL length, presence of special characters, domain age, and HTTPS usage. Three machine learning models are evaluated: Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR). Experimental results demonstrate that the proposed approach achieves high accuracy and outperforms traditional detection methods.
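To make the feature extraction step concrete, the following is a minimal sketch of how the URL-based features named in the abstract might be computed. It is an illustration under stated assumptions, not the authors' implementation: the function name `extract_url_features`, the `SPECIAL_CHARS` set, the extra `dot_count_in_host` feature, and the convention of encoding an unknown domain age as -1 are all hypothetical. Domain age normally requires a WHOIS lookup, so here it is passed in by the caller rather than fetched.

```python
from urllib.parse import urlparse

# Hypothetical set of characters treated as "special"; the paper does not list them.
SPECIAL_CHARS = "@-_?=&%"


def extract_url_features(url, domain_age_days=None):
    """Return a dict of simple URL-based features for one URL.

    domain_age_days is supplied by the caller (for example from a WHOIS
    lookup); None is encoded as -1, which is an assumption of this sketch.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Total number of characters in the URL string
        "url_length": len(url),
        # Occurrences of any character from SPECIAL_CHARS anywhere in the URL
        "special_char_count": sum(url.count(c) for c in SPECIAL_CHARS),
        # Number of dots in the host name, a rough proxy for subdomain depth
        "dot_count_in_host": host.count("."),
        # 1 if the scheme is HTTPS, otherwise 0
        "uses_https": int(parsed.scheme == "https"),
        # Domain age in days, or -1 when unknown
        "domain_age_days": -1 if domain_age_days is None else domain_age_days,
    }


if __name__ == "__main__":
    example = "http://secure-login.example.com/verify?user=a@b.com"
    print(extract_url_features(example))
```

Each resulting dictionary can be converted to a numeric row and passed to a classifier such as SVM, Random Forest, or Logistic Regression. Returning plain integers keeps the features directly usable by all three model families, although SVM and Logistic Regression would typically also need feature scaling.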