Authors: Meenakshi Sundaram, R. Jayanthi, Pradeep Das, Soundarya G.
Abstract: – In modern enterprise environments, ensuring high availability and service continuity has become a mission-critical requirement. Red Hat Enterprise Linux (RHEL), a cornerstone of many IT infrastructures, provides a robust platform for implementing self-healing mechanisms through lightweight and flexible automation. This review explores the concept of self-healing infrastructure with a focus on shell-based automation techniques tailored to RHEL environments. By leveraging native tools such as Bash scripting, systemd, cron jobs, inotify, and Ansible hooks, administrators can design reactive and proactive remediation systems capable of detecting, isolating, and correcting faults without human intervention. The increasing complexity of enterprise deployments, often compounded by limited operational windows and lean support teams, necessitates automation that is both scalable and transparent. Shell scripting remains a powerful ally due to its direct access to system resources, speed, and platform compatibility. Use cases examined in this review include automatic service restarts upon failure, filesystem monitoring and cleanup, dynamic network reconfiguration, and log anomaly detection all driven by lightweight shell scripts. Additionally, the paper examines how these automation techniques can integrate with broader observability frameworks such as ELK and Prometheus for telemetry-driven decision making. Scalability considerations, security constraints, execution reliability, and the evolution of event-driven remediation are discussed to position shell automation as a foundational element of resilient, self-healing systems. The study concludes by reflecting on emerging directions, such as AI-enhanced automation and Red Hat’s event-driven Ansible, and evaluates the continued relevance of shell scripting in modern DevOps and hybrid cloud architectures.
DOI: https://doi.org/10.5281/zenodo.16157141