Authors: Dr. C.K. Gomathy, VD Sasank, R Srishreya
Abstract: Big Data analytics in healthcare applies advanced analytical methods to massive, heterogeneous datasets (electronic health records, medical images, genomics, and wearable sensor streams, among others) to improve patient outcomes and operational efficiency. Traditional healthcare IT systems cannot cope with the volume, velocity, and variety of these data. Modern distributed platforms (Hadoop, Spark, cloud services) and AI methods (machine learning, deep learning) are therefore crucial for enabling real-time predictive modeling and trend analysis in medicine. This paper reviews developments in healthcare-focused Big Data analytics from 2015 to 2025, covering architectures, algorithms, and applications. We propose an end-to-end methodology comprising data ingestion, preprocessing, distributed model training, and deployment via containerized services. We then describe the implementation of a prototype healthcare analytics system and present experimental results demonstrating its scalability and accuracy for real-time patient risk prediction. The findings underscore that Big Data analytics has become a foundational tool in healthcare, enabling evidence-based clinical decision support, disease surveillance, and personalized medicine.