Training modern deep neural networks (DNNs) requires hybrid parallelism. Automatic planners search data, tensor/model, and pipeline shardings with cost models, but decisions can drift from runtime optima due to framework/planner decoupling and overlap mis-modeling. We present MANUMATIC, a light-touch planner that lets users pin a few critical operator shardings while automatically deriving globally consistent strategies for the rest. Inside a binary recursive partitioner, MANUMATIC prioritizes pins via an infinite compromise price and decomposes multi-dimensional hints into two-way refinements; when hard constraints are infeasible, a soft-penalty variant applies. The design is profiling-free, preserves D-Rec’s short compilation time, and degenerates to D-Rec when no pins are given. Built atop D-Rec, MANUMATIC delivers consistent speedups without cost-model reengineering: on Mixtral-8 […]
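The pinning idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `decompose_hint` and `plan_step`, the chain-shaped operator graph, the flat `switch_cost` for changing split dimension between neighbouring operators, and all cost values are hypothetical. The sketch only shows the general mechanism of pricing a deviation from a pin at infinity (hard) or at a finite penalty (soft) inside one two-way partitioning step.

```python
import math


def decompose_hint(hint):
    """Turn a multi-dimensional hint such as {"batch": 4, "hidden": 2} into a
    list of two-way splits, one per binary recursion step.
    Toy assumption: every factor is a power of two."""
    steps = []
    for dim, factor in hint.items():
        if factor < 1 or factor & (factor - 1):
            raise ValueError(f"factor for {dim!r} must be a power of two")
        steps.extend([dim] * (factor.bit_length() - 1))
    return steps


def plan_step(ops, op_cost, switch_cost, pins=None, penalty=math.inf):
    """Choose one split dimension per operator in a chain for a single
    two-way partitioning step, minimizing the total toy cost.

    Deviating from a pinned dimension adds `penalty`: math.inf models a hard
    pin, a finite value models the soft-penalty variant. With no pins the
    function reduces to the plain unpinned search."""
    pins = pins or {}
    best = {}  # last chosen dimension -> (total cost, choices so far)
    for i, op in enumerate(ops):
        nxt = {}
        for dim, cost in op_cost[op].items():
            if op in pins and dim != pins[op]:
                cost += penalty
            if i == 0:
                nxt[dim] = (cost, [dim])
                continue
            # Cheapest predecessor, counting the cost of switching dimension.
            prev_dim, (prev_cost, prev_path) = min(
                best.items(),
                key=lambda kv: kv[1][0] + (switch_cost if kv[0] != dim else 0),
            )
            extra = switch_cost if prev_dim != dim else 0
            nxt[dim] = (prev_cost + extra + cost, prev_path + [dim])
        best = nxt
    return min(best.values(), key=lambda v: v[0])


# Hypothetical three-operator chain with made-up costs.
ops = ["embed", "attn", "mlp"]
op_cost = {
    "embed": {"batch": 1, "hidden": 3},
    "attn": {"batch": 2, "hidden": 1},
    "mlp": {"batch": 2, "hidden": 2},
}

unpinned = plan_step(ops, op_cost, switch_cost=2)
hard = plan_step(ops, op_cost, switch_cost=2, pins={"attn": "hidden"})
soft = plan_step(ops, op_cost, switch_cost=2, pins={"attn": "hidden"}, penalty=0.5)
infeasible = plan_step(ops, op_cost, switch_cost=2, pins={"attn": "expert"})

print(decompose_hint({"batch": 4, "hidden": 2}))
print(unpinned, hard, soft, infeasible[0])
```

In this toy setting the hard pin forces `attn` onto `hidden` and the neighbouring `mlp` follows it to avoid a switch, while the small soft penalty is outweighed by the switching cost and the plan stays on `batch`. A pin on a dimension the operator does not offer yields an infinite total cost, which is the situation where a caller would retry with a finite penalty.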
ManuMatic: Strategy injection for robust automatic hybrid parallelism in distributed DNN training
NPC 2025, 22nd IFIP International Conference on Network and Parallel Computing, 14-16 November 2025, Nha Trang, Vietnam / Also on Lecture Notes in Computer Science, Vol.16306
Type: Conference
City: Nha Trang
Date: 2025-11-14
Department: Data Science
Eurecom Ref: 8487
Copyright: © Springer. Personal use of this material is permitted. The definitive version of this paper was published in NPC 2025, 22nd IFIP International Conference on Network and Parallel Computing, 14-16 November 2025, Nha Trang, Vietnam / Also on Lecture Notes in Computer Science, Vol.16306 and is available at: https://doi.org/10.1007/978-3-032-10466-3_14
See also: PERMALINK: https://www.eurecom.fr/publication/8487