Authors: Praveen Kumar Yadav
Abstract: Monitoring and maintaining water quality in urban rivers is crucial for both environmental sustainability and public health. The Gomti River, which flows through the densely populated city of Lucknow, is under severe stress from rapid urbanization, untreated wastewater discharge, and growing anthropogenic pressure. This study predicts three key water quality parameters: pH, nitrate (NO₃), and biochemical oxygen demand (BOD). All three are widely recognized indicators of river water health and are frequently used in Water Quality Index (WQI) assessment. An Extra Trees Regressor (ETR) model was applied to water quality data collected at the Central Water Commission monitoring station between 2016 and 2023. The dataset was pre-processed, normalized, and divided into training and testing subsets, and model performance was evaluated using R², MAE, MSE, and RMSE. ETR achieved R² values above 0.95 for all three parameters, with low MAE and RMSE. The predicted WQI values aligned closely with the observed values, indicating that the model is robust and reliable. These findings highlight the potential of machine learning (ML) approaches for forecasting river water quality and for supporting timely, data-driven decisions on pollution control and river management. The research contributes to environmental monitoring and geoscientific applications by demonstrating how ML methods can enhance water quality assessment, strengthen pollution mitigation strategies, and promote sustainable river basin management. Future work will integrate satellite-based land use and meteorological data to improve spatial analysis, and will extend the modeling framework to other river systems to improve its generalizability and applicability.
DOI: https://doi.org/10.5281/zenodo.17067318
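
The abstract reports model performance in terms of R², MAE, MSE, and RMSE. As a minimal sketch of how these four metrics are conventionally computed from paired observed and predicted values, the pure-Python function below may help. The function name `regression_metrics` and the sample numbers are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MSE, and RMSE for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error

    # R^2 = 1 - SS_res / SS_tot, undefined when the observations are constant
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(r * r for r in residuals)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse}


# Hypothetical values for illustration only (not measurements from the study)
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

In practice these metrics would be computed on the held-out testing subset for each of pH, NO₃, and BOD separately, so that each parameter's R² can be compared against the 0.95 threshold mentioned in the abstract.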