DRL-enabled SLO-aware task scheduling for large language models in 6G networks

Mekrache, Abdelkader; Ksentini, Adlen; Verikoukis, Christos

ICC 2025, IEEE International Conference on Communications, 8-12 June 2025, Montreal, Canada

With the rapid advancement of telecommunications, 6G networks are expected to become more intelligent and capable of making autonomous decisions. Artificial Intelligence (AI) will play a crucial role in achieving this, particularly through the use of Large Language Models (LLMs). These models are increasingly being adopted for networking tasks due to their advanced capabilities in coding, reasoning, and language processing. LLMs have significant potential to support the development of autonomous networks by reducing or even eliminating the need for human intervention. However, LLMs are computationally expensive, which necessitates their shared use across different 6G applications, i.e., a single LLM might be required to perform multiple tasks within a 6G network. In such a shared setting, routing tasks to the appropriate LLMs presents several challenges: (i) task arrival times are unpredictable, (ii) tasks must meet specific deadlines, which are part of the Service-Level Objectives (SLOs), and (iii) each LLM may perform better on different types of tasks, leading to varying task scores. In this paper, we propose a Deep Reinforcement Learning (DRL) approach for routing tasks to a set of LLMs (task scheduling). Our goal is to maximize task scores while ensuring that task deadlines are met. Evaluations conducted under real-world conditions show that our DRL-based approach outperforms traditional methods such as Round-Robin (RR) and random scheduling.
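The scheduling problem described in the abstract can be illustrated with a minimal sketch of the two baseline policies it names, Round-Robin and random scheduling. Everything below is an assumption made for illustration and is not taken from the paper: the `Task` and `LLM` classes, the fixed per-task service time, the per-type score table, and the reward rule (a task earns its score only if it finishes by its deadline, otherwise zero). The paper's actual system model, reward definition, and DRL agent may differ.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    kind: str        # task type, e.g. "coding" (hypothetical label)
    arrival: float   # arrival time in seconds
    deadline: float  # absolute deadline in seconds (the SLO)


@dataclass
class LLM:
    name: str
    service_time: float   # assumed fixed processing time per task
    scores: dict          # assumed score per task kind, in [0, 1]
    free_at: float = 0.0  # time at which the model becomes idle


def dispatch(task, llm):
    """Queue the task on llm; return its score, or 0.0 if the deadline is missed."""
    start = max(task.arrival, llm.free_at)
    llm.free_at = start + llm.service_time
    return llm.scores[task.kind] if llm.free_at <= task.deadline else 0.0


def run(tasks, llms, choose):
    """Total reward of a policy choose(task, llms) -> index into llms."""
    for llm in llms:
        llm.free_at = 0.0  # reset queues so policies can be compared fairly
    return sum(dispatch(t, llms[choose(t, llms)]) for t in tasks)


def round_robin():
    """Baseline: cycle through the LLMs regardless of task type or load."""
    state = {"i": -1}

    def choose(task, llms):
        state["i"] = (state["i"] + 1) % len(llms)
        return state["i"]

    return choose


def random_policy(seed=0):
    """Baseline: pick an LLM uniformly at random."""
    rng = random.Random(seed)
    return lambda task, llms: rng.randrange(len(llms))


# Toy instance (all values invented for illustration).
llms = [
    LLM("A", 1.0, {"coding": 0.9, "reasoning": 0.4}),
    LLM("B", 1.0, {"coding": 0.3, "reasoning": 0.8}),
]
tasks = [
    Task("coding", 0.0, 2.0),
    Task("reasoning", 0.5, 2.0),
    Task("coding", 1.0, 1.5),
]
rr_total = run(tasks, llms, round_robin())
rand_total = run(tasks, llms, random_policy())
print(rr_total, rand_total)
```

Neither baseline looks at the task type, the deadline, or the current queue of each model. A learned policy would take those quantities as its state and use a reward of this kind as its training signal, which is the gap the DRL approach is meant to close.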

Type: Conference
City: Montreal
Date: 2025-06-08
Department: Communication systems
Eurecom Ref: 8071

© 2025 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.