Enhancing Collaborative Deep Learning with Swarm Intelligence and Federated Optimization
Authors: Assistant Professor Dr. G. Babu, Sunil Kumar Nagar
Abstract: Collaborative deep learning has emerged as a powerful way to leverage distributed data and computational resources. A persistent challenge, however, is ensuring that models developed in collaborative environments generalize. This project addresses that challenge by proposing a framework that integrates advanced techniques for model training and validation. Deep learning models typically require data to be collected at a centralized location to learn effective representations, which incurs communication costs and puts data privacy at risk. These issues are particularly critical for clinical data, where patient privacy is paramount. In such contexts, distributed machine learning offers a viable solution: the data-holding sites locally train a mutually agreed-upon model and share what they learn. Federated learning (FL) facilitates this process using a client-server framework. Clients in the FL environment are independent small edge devices that retain their data locally, while the server acts as a central site that aggregates the knowledge learned by each client and distributes it to the others. In each round, the server receives locally trained weights from all participating clients, aggregates them, and sends the aggregated weights back to all clients before the next training round begins. This iterative process continues until the global model reaches the desired accuracy. FL thus enables multiple clients to collaboratively train a shared global model without sharing their local data, which preserves data privacy and mitigates limited data availability at individual sites. However, FL faces three challenges: the high communication cost of transferring weights, statistical data heterogeneity among clients, and the server as a single point of failure.
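The aggregation round described above can be illustrated with a minimal sketch. It assumes the standard FedAvg rule, in which each client's weights are averaged in proportion to its number of local samples; the abstract does not state the exact weighting, so that choice, along with the names `fedavg`, `Weights`, and `num_samples`, is hypothetical and used only for illustration.

```python
from typing import Dict, List

# Hypothetical representation: layer name -> flattened parameter list.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], num_samples: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count.

    Assumption: sample-count weighting (the usual FedAvg choice).
    """
    if not client_weights or len(client_weights) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        size = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, num_samples)) / total
            for i in range(size)
        ]
    return aggregated


# Two toy clients holding a single two-parameter layer.
clients = [
    {"fc": [1.0, 2.0]},
    {"fc": [3.0, 4.0]},
]
# The second client holds three times as much data, so it dominates the average.
global_weights = fedavg(clients, num_samples=[1, 3])
```

In a full round, the server would send `global_weights` back to every client, each client would continue training locally from it, and the cycle would repeat.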
Client heterogeneity arises mainly from differences in data distribution among clients and in their computational power. This project targets statistical data heterogeneity in the FL environment and proposes a simple yet effective attention-based approach to address it. Specifically, in the proposed setting, each client sends a mean representation to the centralized server along with the trained model’s weights. A similarity matrix is computed from the similarity score between each client’s mean representation and that of every other participating client. This similarity matrix determines the weight given to each client’s model in the aggregated model. The centralized server uses the similarity matrix to compute an attention vector for each client and then broadcasts these attention vectors to all clients. The attention mechanism is implemented both on the centralized server and on the participating clients. We consider FedAvg, FedProx, and FedMomentum as baselines for comparison, and our proposed approach outperforms all of them. For statistical heterogeneity, we perform extensive experiments on FOOD-101 and CIFAR-10, demonstrating that our approach performs well even with highly skewed data. To address the single point of failure in FL, we propose an efficient version of swarm learning. We demonstrate the effectiveness of context-aware swarm learning through experiments on the HAM10000 and ISIC Skin Lesion 2019 datasets. Additionally, to mitigate the high communication costs in FL, we propose BAFL (Federated Learning for Base Ablation), which introduces a fine-tuning approach that leverages the feature extraction ability of layers at different depths of deep neural networks. We evaluate the proposed approach using VGG-16 and ResNet-50 models on datasets including WBC, FOOD-101, and CIFAR-10, achieving up to two orders of magnitude reduction in total communication cost compared to conventional federated learning.
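The similarity-driven aggregation described above can be sketched as follows. The abstract does not specify the similarity measure or how similarity scores become attention weights, so this sketch assumes cosine similarity and a row-wise softmax; those choices and all function names here are hypothetical illustrations, not the authors' implementation.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(mean_reps: List[Vector]) -> List[List[float]]:
    """Pairwise similarity between every pair of client mean representations."""
    return [[cosine(a, b) for b in mean_reps] for a in mean_reps]


def softmax(row: Vector) -> Vector:
    """Numerically stable softmax, so each attention vector sums to 1."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_vectors(sim: List[List[float]]) -> List[Vector]:
    """One attention vector per client (assumed: softmax over its similarity row)."""
    return [softmax(row) for row in sim]


def aggregate_for_client(attn: Vector, client_weights: List[Vector]) -> Vector:
    """Attention-weighted combination of all clients' flattened weights."""
    size = len(client_weights[0])
    return [sum(a * w[i] for a, w in zip(attn, client_weights)) for i in range(size)]


# Toy example: clients 0 and 1 have identical mean representations,
# while client 2 is orthogonal to both.
mean_reps = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]]

sim = similarity_matrix(mean_reps)
attn = attention_vectors(sim)
model_for_client_0 = aggregate_for_client(attn[0], weights)
```

Under these assumptions, client 0 places more attention on itself and on client 1 than on the dissimilar client 2, so client 2's weights contribute less to the model aggregated for client 0.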
