A Review of the Most Well-Known Aggregation Algorithms for Federated Learning Applied to Large Language Models (LLMs)

  • Ygor Vieira
  • Oberdan Rocha
  • Davidson Martins
Keywords: Federated Learning, Aggregation Algorithms, Large Language Models

Abstract

Research on federated learning has grown due to its ability to perform local training on distributed devices, especially in the context of artificial intelligence. However, there are still few studies focused on the aggregation algorithms used in this type of learning, and even fewer addressing their application to large language models (LLMs). This article reviews the literature on federated learning with an emphasis on aggregation techniques applied to LLM training. A scarcity of specific studies was observed, along with the predominance of three algorithms: FedAvg, FedProx, and SCAFFOLD. Each was analyzed in terms of its strengths and weaknesses, including accuracy under data heterogeneity, convergence speed, and aspects of security and privacy. It is concluded that the future of aggregation algorithms in LLM training lies in developing solutions that balance these aspects in an integrated manner.
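To make the aggregation step concrete, the following is a minimal sketch of FedAvg, the baseline algorithm the review identifies: the server averages client model parameters, weighted by each client's local dataset size. The function name, parameter names, and toy data below are illustrative, not taken from the article.

```python
# Minimal FedAvg sketch (illustrative): the server combines client model
# parameters as a weighted average, with weights proportional to each
# client's number of local training samples.

def fedavg(client_params, client_sizes):
    """Aggregate client parameter vectors into a global model.

    client_params: list of parameter vectors (lists of floats), one per client
    client_sizes:  number of local training samples per client
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        weight = n / total  # client with more data contributes more
        for i, p in enumerate(params):
            global_params[i] += weight * p
    return global_params

# Toy example: two clients, one with 100 local samples, one with 300.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
print(fedavg(clients, sizes))  # [2.5, 3.5]
```

FedProx and SCAFFOLD modify the local training step (a proximal term and control variates, respectively) rather than this server-side average, which is why they behave better under data heterogeneity.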

Published
2025-11-11