Hacker News
Do We Need a New Orchestration System for GPUs? (nathanleclaire.com)
2 points by zenlikethat on April 6, 2023 | 5 comments


There is something similar to what you described operating right now: Stable Horde: https://stablehorde.net/

I'm not sure if it has the priority queueing system you mentioned, or how much it has generalized to non-SD loads, but it is functional.

Also, you should consider using a Huggingface diffusers UI/backend instead of the automatic1111 UI, which is based on (and now inextricably linked to) the old Stability AI implementation. Maybe that's fine now, but it's already a problem for optimizations people are cooking up (like Facebook's AITemplate, torch.compile, SHARK/MLIR for AMD GPUs, Intel OpenVINO...), which work with diffusers already.

I am bouncing between InvokeAI and VoltaML at the moment (as I can't get any of the optimizations to work in the auto1111 implementation anymore), but I can hardly keep up with everything on GitHub.


Oh yeah, Horde is super cool, I've been meaning to give it a spin. That's good info about the web UI! I have noticed that, relative to my hardware, the results seem slow. Lots could be causing that, though.


Another factor is your system libs. An old PyTorch/CUDA combo, for instance, can hold 4000-series GPUs back.

I have been quite happy with Arch's build of PyTorch (which targets CUDA 12.1 at the moment).
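To make the version concern above concrete, here's a minimal sketch of comparing a CUDA toolkit version against the floor a newer card needs. The `(11, 8)` minimum for Ada (RTX 4000-series) cards reflects when CUDA added sm_89 support, but treat that cutoff, and the helper names, as assumptions for illustration rather than anything authoritative:

```python
# Sketch: compare a PyTorch build's CUDA version against the minimum
# an Ada (RTX 4000-series) card needs. The (11, 8) floor is an
# assumption based on when CUDA gained sm_89 support.

def parse_version(v: str) -> tuple[int, int]:
    """Turn '12.1' (or '12.1.105') into a comparable (major, minor) tuple."""
    parts = (v.split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])

MIN_CUDA_FOR_ADA = (11, 8)

def cuda_supports_ada(cuda_version: str) -> bool:
    return parse_version(cuda_version) >= MIN_CUDA_FOR_ADA

# To check the running torch build (uncomment if torch is installed):
# import torch
# print(torch.version.cuda, cuda_supports_ada(torch.version.cuda))

print(cuda_supports_ada("11.6"))  # older toolkit: False
print(cuda_supports_ada("12.1"))  # Arch's current build: True
```

The same shape of check works for any library/hardware pairing; the point is just that a stale toolkit silently caps which GPUs your stack can actually use.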


I'm only part of the way through the article, and yeah, interesting read.


Thanks!



