There are other optimal techniques as well, like Parallel Depth First scheduling from Blelloch.
Since then there has been a lot of work on distributed work-stealing.
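For anyone who hasn't seen it, the core of work-stealing is surprisingly small. A toy sketch (my own illustration, not Blelloch's PDF scheduler or any particular runtime): each worker owns a deque, pops work locally from one end, and steals from the other end of a peer's deque when it runs dry.

```go
package main

import (
	"fmt"
	"sync"
)

// deque is a mutex-protected task list; real schedulers use lock-free
// Chase-Lev deques, but a mutex keeps the sketch short.
type deque struct {
	mu    sync.Mutex
	tasks []func()
}

func (d *deque) push(t func()) {
	d.mu.Lock()
	d.tasks = append(d.tasks, t)
	d.mu.Unlock()
}

// popBottom: the owner takes the newest task (good cache locality).
func (d *deque) popBottom() (func(), bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if n := len(d.tasks); n > 0 {
		t := d.tasks[n-1]
		d.tasks = d.tasks[:n-1]
		return t, true
	}
	return nil, false
}

// stealTop: a thief takes the oldest task, which in divide-and-conquer
// workloads tends to be the largest remaining chunk of work.
func (d *deque) stealTop() (func(), bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if len(d.tasks) > 0 {
		t := d.tasks[0]
		d.tasks = d.tasks[1:]
		return t, true
	}
	return nil, false
}

func main() {
	const workers = 4
	deques := make([]*deque, workers)
	for i := range deques {
		deques[i] = &deque{}
	}
	// Pile all the work onto worker 0 so the others are forced to steal.
	for i := 0; i < 20; i++ {
		i := i
		deques[0].push(func() { fmt.Println("ran task", i) })
	}

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for {
				if t, ok := deques[id].popBottom(); ok {
					t()
					continue
				}
				// Local deque is empty: try each peer once.
				stole := false
				for v := 0; v < workers && !stole; v++ {
					if v == id {
						continue
					}
					if t, ok := deques[v].stealTop(); ok {
						t()
						stole = true
					}
				}
				if !stole {
					return // toy termination: nothing local, nothing to steal
				}
			}
		}(w)
	}
	wg.Wait()
}
```

Real schedulers add randomized victim selection and proper termination detection; the point is just how little machinery the core idea needs.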
Boggles my mind that at "cloud scale" people still use round robin, assuming uniform workloads and uniform load across servers.
I wish load balancers would default to one of these optimal algorithms. When I call a "sort" function from the standard library, 99% of the time I don't have to go read papers to pick which sorting algorithm to use. The default algorithm is good for most cases. EDIT: probably because many of them are adaptive algorithms.
Seems many load balancer providers missed that memo.
The issue is that the spread in the actual cost of requests is usually extreme and opaque, and load/capacity often differs wildly between nodes.
And there is no standard way I'm aware of to usefully report back or communicate this data to the LB.
Probably because this would require both the load balancer and the webserver to support a custom protocol. My first guess is that some service meshes have an API that you could use for this.
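Absent a standard, you can wire up the feedback channel yourself. A rough sketch of the idea (nothing here is a standard: the "X-Backend-Load" header name and the upstream addresses are made up): each backend reports its current load in a response header, and the balancer does "power of two choices" against the last reported values.

```go
package main

import (
	"log"
	"math/rand"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strconv"
	"sync/atomic"
)

type backend struct {
	proxy *httputil.ReverseProxy
	load  atomic.Int64 // last load value this backend reported
}

func newBackend(raw string) *backend {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	b := &backend{proxy: httputil.NewSingleHostReverseProxy(u)}
	// Read the backend's self-reported load off every response.
	// "X-Backend-Load" is a made-up header, not any real convention.
	b.proxy.ModifyResponse = func(resp *http.Response) error {
		if v, err := strconv.ParseInt(resp.Header.Get("X-Backend-Load"), 10, 64); err == nil {
			b.load.Store(v)
		}
		return nil
	}
	return b
}

// pick uses "power of two choices": sample two backends at random and take
// the one whose last reported load is lower.
func pick(backends []*backend) *backend {
	a := backends[rand.Intn(len(backends))]
	b := backends[rand.Intn(len(backends))]
	if b.load.Load() < a.load.Load() {
		return b
	}
	return a
}

func main() {
	backends := []*backend{
		newBackend("http://127.0.0.1:9001"), // hypothetical upstreams
		newBackend("http://127.0.0.1:9002"),
	}
	log.Fatal(http.ListenAndServe(":8080", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		pick(backends).proxy.ServeHTTP(w, r)
	})))
}
```

The nice property is that two random choices plus even a coarse, slightly stale load signal already does much better than round robin under skewed request costs.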