The rapid scaling of large language models (LLMs) has driven significant advances across a wide range of natural language processing tasks. However, the immense size of LLMs and the growing demand for large-scale datasets make fine-tuning these models difficult in resource-constrained environments. Federated learning (FL) has emerged as a promising solution, enabling collaborative model fine-tuning on distributed private data without requiring data sharing. Despite its potential, the heavy computational and communication burdens imposed by LLMs hinder the widespread adoption of FL-based fine-tuning. To mitigate these challenges, we propose FedSIT (Federated Split Importance-Based Tuning), a novel federated fine-tuning framework designed to optimize LLM training in environments with limited computational resources. FedSIT splits the pretrained model into Bottom, Trunk, and Top layers, offloading the computationally intensive Trunk to the server while distributing the Bottom and Top layers to client devices. In addition, FedSIT leverages layer importance scores to selectively fine-tune only the most critical layers, reducing the number of trainable parameters. Our extensive experiments demonstrate that FedSIT achieves performance comparable to existing methods while significantly reducing resource requirements, offering an efficient and scalable solution for federated fine-tuning of LLMs in real-world settings.
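The abstract gives no implementation details, so the following is a minimal PyTorch sketch of the two ideas it names: splitting a layer stack into Bottom, Trunk, and Top partitions, and unfreezing only the highest-scoring layers for fine-tuning. The function names, the split sizes, and the importance scores themselves are illustrative assumptions, not the paper's actual code; in particular, the abstract does not say how FedSIT computes layer importance.

```python
import torch.nn as nn

def split_model(layers, n_bottom, n_top):
    """Partition a stack of transformer blocks into Bottom / Trunk / Top.

    In the split described in the abstract, Bottom and Top stay on the
    client and the compute-heavy Trunk is offloaded to the server.
    """
    bottom = nn.ModuleList(layers[:n_bottom])                    # client side
    trunk = nn.ModuleList(layers[n_bottom:len(layers) - n_top])  # server side
    top = nn.ModuleList(layers[len(layers) - n_top:])            # client side
    return bottom, trunk, top

def select_layers_by_importance(layers, scores, k):
    """Freeze every layer, then unfreeze the k layers with the highest
    importance scores so that only those layers are fine-tuned."""
    for layer in layers:
        for p in layer.parameters():
            p.requires_grad = False
    top_k = sorted(range(len(layers)), key=lambda i: scores[i], reverse=True)[:k]
    for i in top_k:
        for p in layers[i].parameters():
            p.requires_grad = True
    return top_k

# Example: a 12-block stack with 2 blocks on each client end (split sizes
# are illustrative); the Trunk would run on the server.
blocks = [nn.TransformerEncoderLayer(d_model=256, nhead=4) for _ in range(12)]
bottom, trunk, top = split_model(blocks, n_bottom=2, n_top=2)

# Placeholder importance scores for the 4 client-side blocks.
client_blocks = list(bottom) + list(top)
scores = [0.9, 0.1, 0.4, 0.8]
tuned = select_layers_by_importance(client_blocks, scores, k=2)
```

Keeping the selection on the client-side layers reflects the stated goal: the client only trains (and communicates updates for) a small, important subset of the parameters it holds.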
FedSIT: Efficient federated fine-tuning with model splitting and importance-based tuning
IJCNN 2025, INNS International Joint Conference on Neural Networks, 30 June-5 July 2025, Rome, Italy
Type:
Conference
City:
Rome
Date:
2025-06-30
Department:
Systèmes de Communication
Eurecom Ref:
8287
Copyright:
© 2025 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
See also:
PERMALINK: https://www.eurecom.fr/publication/8287