Load Balancing under Data Locality: Extending Mean-Field Framework to Constrained Large Scale Systems

Debankur Mukherjee
Georgia Tech

Wednesday, Nov 8, 2023
5:00 - 6:00 PM
Y2E2 101



Abstract:

Large-scale parallel processing infrastructures such as data centers and cloud networks form the cornerstone of the modern digital environment. Central to their efficiency are resource management policies, especially load balancing algorithms (LBAs), which are crucial for meeting the stringent delay requirements of tasks. A contemporary challenge in designing LBAs for today's data centers is navigating data locality constraints that dictate which tasks are assigned to which servers. These constraints can be naturally modeled as a bipartite graph between servers and task types. Most LBA heuristics rely on the accuracy of the mean-field approximation. However, the non-exchangeability among servers induced by data locality invalidates the mean-field framework, causing real-world system behavior to diverge significantly from theoretical predictions. From a foundational standpoint, advancing our understanding in this domain demands the study of stochastic processes on large graphs, which in turn requires fundamental advances in classical analytical tools.

In this presentation, we will dive into recent advances in extending the accuracy of the mean-field approximation to a broad class of graphs. In particular, we will discuss how to design resource-efficient, asymptotically optimal locality constraints, and how the system behavior changes fundamentally depending on whether the above bipartite graph is an expander, a spatial graph, or inhomogeneous in nature.



Operations Research Colloquia: http://or.stanford.edu/oras_seminars.html