As generative AI models revolutionize computing, a critical challenge emerges: how can their capabilities be brought closer to the edge, or even to the extreme edge, where devices are resource-constrained yet increasingly intelligent? This talk explores the evolving landscape of Edge AI, tracing the trajectory from cloud-dominated generative models to embedded intelligence on microcontrollers and heterogeneous edge platforms. Building on real-world experimentation and system-level insights, the presentation examines the trade-offs of deploying Language Models (LMs) at the edge, including performance, energy efficiency, and deployment cost. The role of LMs in automating key stages of the Edge AI lifecycle is also discussed, with a focus on hardware-aware code generation and model configuration to reduce manual effort and support scalable deployment. These developments are positioned within the broader edge–cloud continuum, advocating edge-first GenAI strategies that involve cloud resources only when necessary to balance efficiency, autonomy, and reach. Drawing on recent work, the talk outlines both the technical barriers and the emerging opportunities for making generative AI deployable in decentralized, embedded environments, where latency, cost, and energy constraints are real and intelligence must be adaptive by design.
From edge to Tiny: Reimagining AI in the era of generative and embedded intelligence
Summer School on Edge Artificial Intelligence, 1-4 September 2025, Stockholm, Sweden
Type:
Talk
City:
Stockholm
Date:
2025-09-03
Department:
Communication systems
Eurecom Ref:
8364
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Summer School on Edge Artificial Intelligence, 1-4 September 2025, Stockholm, Sweden and is available at:
See also:
PERMALINK: https://www.eurecom.fr/publication/8364