Smaller, smarter, closer: The edge of collaborative generative AI

Morabito, Roberto; Jang, SiYoung
IEEE Internet Computing, 3 June 2025

The rapid adoption of generative AI (GenAI), particularly Large Language Models (LLMs), has exposed critical limitations of cloud-centric deployments, including latency, cost, and privacy concerns. Meanwhile, Small Language Models (SLMs) are emerging as viable alternatives for resource-constrained edge environments, though they often lack the capabilities of their larger counterparts. This article explores the potential of collaborative inference systems that leverage both edge and cloud resources to address these challenges. By presenting distinct cooperation strategies alongside practical design principles and experimental insights, we offer actionable guidance for deploying GenAI across the computing continuum. Ultimately, this work underscores the great potential of edge-first approaches in realizing the promise of GenAI in diverse, real-world applications.


Type: Journal
Date: 2025-06-03
Department: Communication systems
Eurecom Ref: 8253
Copyright: © 2025 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Permalink: https://www.eurecom.fr/publication/8253