Simplified dynamic programming for Decentralized POMDPs with delayed sharing patterns via change of measure

Charalambous, Charalambos D.; Stavrou, Photios A.
ECC 2026, European Control Conference, 7-10 July 2026, Reykjavik, Iceland

In this paper, we consider decentralized discrete-time stochastic dynamical optimal control problems with multiple control strategies operating under delayed-sharing information patterns, formulated within the framework of person-by-person (PbP) optimality. We invoke Girsanov's theorem to characterize PbP optimality under a reference probability measure through value functions satisfying simplified dynamic programming (DP) equations, together with corresponding information states that serve as sufficient statistics for the strategies. The value functions and information states retain the fundamental properties of classical partially observable Markov decision problems (POMDPs), namely, both depend on the actions of the minimizing controls, rather than their strategies.
The main distinguishing feature of our DP approach is that each control strategy estimates the unobservable state process and the private information components of all other strategies solely from its own private information and the delayed-sharing information components, using information states.
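For orientation only, the following is a minimal sketch of an unnormalized information-state recursion for a classical single-controller, finite-state POMDP under a reference measure. It is not the decentralized construction of the paper. The function names, the toy matrices `P` and `O`, and the uniform reference observation distribution `ref` are all assumptions made for illustration. The sketch shows the property the abstract refers to: the update depends on the applied action `u` and the observation `y`, not on the strategy that produced `u`.

```python
# Hypothetical toy example: unnormalized information state for a finite POMDP.
# Under the reference measure, observations are treated as i.i.d. with pmf `ref`,
# independent of the state; the likelihood ratio O[x'][y] / ref[y] reweights them.

def information_state_step(q, u, y, P, O, ref):
    """One step of the unnormalized recursion.

    q   : list, unnormalized information state over states x
    u   : applied action (index), not a strategy
    y   : received observation (index)
    P   : P[u][x][x_next], controlled transition probabilities
    O   : O[x_next][y], observation probabilities
    ref : ref[y], reference observation pmf (assumed strictly positive)
    """
    n = len(q)
    # Prediction: sum over x of q(x) * P(x_next | x, u)
    predicted = [sum(q[x] * P[u][x][xn] for x in range(n)) for xn in range(n)]
    # Correction: multiply by the Radon-Nikodym-type likelihood ratio
    return [predicted[xn] * O[xn][y] / ref[y] for xn in range(n)]


def normalize(q):
    """Recover the usual conditional distribution (belief) from q."""
    total = sum(q)
    return [v / total for v in q]


# Toy model (assumed values): 2 states, 2 actions, 2 observations.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.5, 0.5]]]
O = [[0.7, 0.3],
     [0.1, 0.9]]
ref = [0.5, 0.5]
q0 = [0.6, 0.4]

q1 = information_state_step(q0, 0, 1, P, O, ref)
belief1 = normalize(q1)
```

The recursion is linear in `q`, and normalizing it recovers the usual Bayes belief. This is the sense in which an information state acts as a sufficient statistic in the classical setting.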

Type: Conference
City: Reykjavik
Date: 2026-07-07
Department: Communication systems
Eurecom Ref: 8847
Copyright:
© 2026 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PERMALINK : https://www.eurecom.fr/publication/8847