> (Unlike say LLMs where GPT-4 is clearly dominating for now.)
A lot of this comes from people comparing GPT-4 to e.g. LLaMA-7B, because that's the thing that fits in memory on their laptop. Meanwhile LLaMA-65B is dramatically better, but it needs about 128GB of RAM, and the hardware to run it fast is expensive.
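The ~128GB figure lines up with a back-of-envelope calculation: 65B parameters at fp16 take roughly 2 bytes each. A rough sketch (ignoring activations, KV cache, and runtime overhead; quantized sizes shown for comparison):

```python
# Back-of-envelope weight memory for a 65B-parameter model
# at common precisions. Real usage is somewhat higher.
PARAMS = 65e9

for label, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = PARAMS * bytes_per_param / 1024**3
    print(f"{label}: ~{gb:.0f} GB")
```

At fp16 that works out to roughly 121 GiB, which is why people quote "about 128GB"; 4-bit quantization brings it down to around 30 GiB, which is what makes local inference on consumer hardware plausible at all.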
And GPT-4 has even more parameters than that, but that's not a matter of the tooling, it's that someone needs to release a public model with more parameters.
That's part of the point though. I get better results from stable diffusion on my PC than out of DALL-E 2. (I still have some credits there, but little reason to use them.)
I can't do that with LLaMA-65B. (Although to be fair 128 GB RAM is not that much.) But I suspect it's still far less capable than GPT-4, is it not?
It depends what you're trying to get it to do. There are some prompts where the expected output is a piece of code or a paragraph containing some information, and once you reach the threshold where the code works or the information is correct, there isn't a lot of "better" left to get.
Then there are prompts where more parameters matter.
Conversely, LLaMA doesn't say "I'm sorry Dave, I'm afraid I can't do that."
It's a popular GUI for stable diffusion models with many extensions. Like the sibling comment points out, everyone calls it that because it's the handle of the original maintainer. (As in: Which web ui? Auto11's.)
It would be cool to see that become a GIMP plugin next; I think that would be a more direct alternative to the generative-fill workflow in Photoshop.
https://github.com/Mikubill/sd-webui-controlnet/discussions/...
So far it seems that the OSS diffusion models + tooling that we can run locally keep being state of the art. It makes me so happy.
(Unlike say LLMs where GPT-4 is clearly dominating for now.)