Hacker News
Do We Need a New Orchestration System for GPUs? (nathanleclaire.com)
2 points by zenlikethat on April 6, 2023 | 5 comments


There is something similar to what you described operating right now: Stable Horde: https://stablehorde.net/

I'm not sure if it has the priority queueing system you mentioned, or how much it has generalized to non-SD loads, but it is functional.

Also, you should consider using a Huggingface diffusers UI/backend instead of the automatic1111 UI, which is based on (and now inextricably linked to) the old Stability AI implementation. Maybe that's fine now, but it's already a problem for optimizations people are cooking up (like Facebook's AITemplate, torch.compile, SHARK/MLIR for AMD GPUs, Intel OpenVINO...), which work with diffusers already.

I am bouncing between InvokeAI and VoltaML at the moment (as I can't get any of the optimizations to work in the auto1111 implementation anymore), but I can hardly keep up with everything on GitHub.


Oh yeah, Horde is super cool, I've been meaning to give it a spin. That's good info about the web UI! I have noticed that, relative to my hardware, the results seem slow. Lots could be causing that, though.


Another factor is your system libs. An old PyTorch/CUDA combo, for instance, can hold 4000-series GPUs back.

I have been quite happy with Arch's build of PyTorch (which targets CUDA 12.1 at the moment).
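To make the version concern above concrete, here's a minimal sketch of comparing a CUDA toolkit version against the floor a newer card needs. The `(11, 8)` minimum for Ada (RTX 4000-series) cards reflects when CUDA added sm_89 support, but treat that cutoff, and the helper names, as assumptions for illustration rather than anything authoritative:

```python
# Sketch: compare a PyTorch build's CUDA version against the minimum
# an Ada (RTX 4000-series) card needs. The (11, 8) floor is an
# assumption based on when CUDA gained sm_89 support.

def parse_version(v: str) -> tuple[int, int]:
    """Turn '12.1' (or '12.1.105') into a comparable (major, minor) tuple."""
    parts = (v.split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])

MIN_CUDA_FOR_ADA = (11, 8)

def cuda_supports_ada(cuda_version: str) -> bool:
    return parse_version(cuda_version) >= MIN_CUDA_FOR_ADA

# To check the running torch build (uncomment if torch is installed):
# import torch
# print(torch.version.cuda, cuda_supports_ada(torch.version.cuda))

print(cuda_supports_ada("11.6"))  # older toolkit: False
print(cuda_supports_ada("12.1"))  # Arch's current build: True
```

The same shape of check works for any library/hardware pairing; the point is just that a stale toolkit silently caps which GPUs your stack can actually use.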


I'm only part of the way through the article, and yeah, interesting read.


Thanks!



