Authors: Mrs. Kocherla Jayanthi
Abstract: The rapid evolution of cyber threats demands effective and adaptive intrusion detection systems to protect critical network infrastructure. This study evaluates the efficacy of supervised machine learning models in detecting network intrusions using the NSL-KDD dataset, a well-established benchmark for intrusion detection. The dataset is pre-processed by handling missing values, normalizing features, and encoding categorical attributes to ensure high-quality input data. We implement six supervised machine learning algorithms, namely Decision Tree, Random Forest, Naïve Bayes, K-Nearest Neighbours (KNN), Gradient Boosted Trees, and Support Vector Machine (SVM), to classify network traffic as either benign or malicious. The dataset is split into training and testing subsets, and hyperparameters are tuned through grid search to enhance model performance. We evaluate the models using accuracy, the confusion matrix, the Receiver Operating Characteristic (ROC) curve, and the Area Under the Curve (AUC). Our findings reveal that Random Forest and Gradient Boosted Trees achieve higher accuracy and lower false positive rates than the other classifiers. The comparative analysis provides practical insights into each algorithm’s strengths and limitations for cybersecurity applications.
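To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how accuracy, the confusion matrix counts, the false positive rate, and AUC can be computed for a binary benign/malicious classifier. It is not the study's implementation: the function names are hypothetical, and the labels and scores are invented toy values rather than NSL-KDD results. AUC is computed here by pairwise comparison (the probability that a malicious sample outscores a benign one), which is one common equivalent of the area under the ROC curve.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = malicious, 0 = benign)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0
            counts["fn"] += 1
    return counts


def accuracy(c: Dict[str, int]) -> float:
    """Fraction of all samples that were classified correctly."""
    return (c["tp"] + c["tn"]) / sum(c.values())


def false_positive_rate(c: Dict[str, int]) -> float:
    """Fraction of benign samples that were flagged as malicious."""
    negatives = c["fp"] + c["tn"]
    return c["fp"] / negatives if negatives else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (malicious, benign) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): true labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

counts = confusion_counts(y_true, y_pred)
print(counts)
print("accuracy:", accuracy(counts))
print("false positive rate:", false_positive_rate(counts))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy and the false positive rate depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are typically reported together when comparing classifiers.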