Hello everyone!
My name’s Benjamin, I’m the developer of ENFUGUE, a self-hosted Stable Diffusion Web UI that’s built around an intuitive canvas interface, while still trying to deliver the power and deep customization of the popular tab-and-slider web UIs.
I’m taking it out of Alpha and into Beta with the v0.2 release, which brings SDXL support while still maintaining most of the feature set of 1.5 by allowing you to configure multiple checkpoints for various diffusion plans. It also has a ton of changes since 0.1 as suggested by other users, like the ability to point ENFUGUE to the directories of other Web UI installations to share models and other files.
This is not monetized software in any way; I simply built the tool I wanted to use, and wanted to share. Thank you for taking a look!
I’m back! 0.2.1 is now released, which defaults macOS to half-precision. It also includes SDXL LoRA and ControlNet support, which I did get working on my Mac. :) It’s available at https://github.com/painebenjamin/app.enfugue.ai/releases/tag/0.2.1.
As for API - there always was one! It was just never documented until now. There’s still a few endpoints left to document, but the big ones are covered. Documentation is at https://github.com/painebenjamin/app.enfugue.ai/wiki/JSON-API.
So, feedback. To begin with, it works! That’s a massive improvement and allowed me to actually try it. Civitai.com downloading works quite nicely and… the generation is kinda slow. Slower than my iPhone 13 pro with Draw Things, a minute give or take 10 seconds. Poor phone crunches the same model in 30 something seconds.
Don’t get me wrong, I appreciate that it works to begin with, and it’s also easy to set up, but there’s a fair amount of performance left on the table. Depending on how much work is involved, it might make sense to chase further performance, but that’s something only you can decide :D
You’re the best, thanks so much for trying it and getting it working!
I don’t think it’s ever not worth chasing improved performance, so I’m definitely going to continue looking for optimizations. While cannibalizing the code for Comfy and A1111, I saw a lot (and I mean a lot) of shortcuts taken relative to the official Stability code release that improve performance in specific situations. I’m going to try and see how I can turn some of those shortcuts into options for the user to tune to their hardware.
This latest release has attracted some more developer attention (and also some inquiries from hosting providers about offering Enfugue in the cloud!) I’m hoping that some of the authors of those improvements find their way to the Enfugue repository and perhaps are inspired to contribute.
With that being said, TensorRT will definitely knock your socks off in terms of speed if you haven’t used it before, if you’ve got the hardware for it. I’d be happy to troubleshoot whatever went wrong with your Windows install - there should be up to three enfugue-engine.log files in your ~/.cache/ directory that will have more information about what went wrong, if you’d like to share them here (or we can start a GitHub thread if you have that.)
Thank you again for all your help!
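For anyone else hunting for those logs, here is a minimal sketch that lists them and prints the last lines of each. It assumes the "up to three" files share the enfugue-engine.log prefix (for example, rotated copies with a suffix), which is a guess on my part; the function names are made up for illustration and are not part of ENFUGUE.

```python
from pathlib import Path


def find_engine_logs(cache_dir=None, base_name="enfugue-engine.log"):
    """Return log files whose names start with base_name, newest first.

    Assumption: extra log files, if any, share the same prefix
    (e.g. a rotated copy with a suffix). Adjust the pattern if not.
    """
    cache_dir = Path(cache_dir) if cache_dir is not None else Path.home() / ".cache"
    if not cache_dir.is_dir():
        return []
    logs = [p for p in cache_dir.glob(base_name + "*") if p.is_file()]
    # Sort by modification time so the most recent log comes first.
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


def tail(path, max_lines=40):
    """Return the last max_lines lines of a text file."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.readlines()[-max_lines:]


if __name__ == "__main__":
    for log in find_engine_logs():
        print(f"==> {log}")
        print("".join(tail(log)), end="")
```

Running it with no arguments looks in ~/.cache/ under your home directory; pass a different directory to find_engine_logs if your install writes logs elsewhere.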
Now knowing where to look, I did some fixing by myself! Main issue is that I had CUDA 10 and 12, no 11. Then after going insane about that tiny difference… I landed on something I lack the knowledge to decipher: “PyInstallerImportError: Failed to load dynlib/dll ‘C:\Program Files\NVIDIA GPU Computing Toolkit\TensorRT-8.6.1.6\lib\nvinfer_plugin.dll’. Most likely this dynlib/dll was not found when the application was frozen.”
All I can say is that the file is there.
Hey! I am able to reproduce that error by using the CUDA 12 version of TensorRT.
PyInstallerImportError: Failed to load dynlib/dll 'C:\\TensorRT-8.6.1.6\\lib\\nvinfer_plugin.dll'. Most likely this dynlib/dll was not found when the application was frozen.
Please make sure you downloaded the top file here, not the bottom.
I was able to modify my PATH and point to the right TensorRT, then restart the server, and it worked for me (no machine restart needed.)
Please let me know if that works for you :)
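If you want to check which TensorRT your PATH points at before restarting the server, something like the sketch below may help. It only walks the PATH entries in order and reports the directories that contain a given file; PATH is just one part of how Windows locates DLLs, so treat this as a first check and not a definitive answer. The helper name is made up for illustration.

```python
import os
from pathlib import Path


def dirs_providing(filename, path_value=None):
    """Return the PATH directories that contain filename, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    matches = []
    seen = set()
    for entry in path_value.split(os.pathsep):
        # PATH entries are sometimes quoted or empty; normalize them.
        entry = entry.strip().strip('"')
        if not entry or entry in seen:
            continue
        seen.add(entry)
        if (Path(entry) / filename).is_file():
            matches.append(entry)
    return matches


if __name__ == "__main__":
    hits = dirs_providing("nvinfer_plugin.dll")
    if not hits:
        print("nvinfer_plugin.dll was not found in any PATH directory")
    for position, directory in enumerate(hits, start=1):
        print(f"{position}. {directory}")
```

If more than one directory is listed, the first one is probably the copy that gets picked up, so the CUDA 11 build of TensorRT would need to come before any CUDA 12 build.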
My request is dumb, the UI is glitching a little but hot damn 12 iterations per second! Impressive.
YOU GOT IT WORKING?
You are the first person to stick through to the end and do it. Seriously. Thank you so much for confirming that it works on some machine besides mine and monster servers in the cloud.
The configuration is obviously a pain point, but we’re running along the cutting edge with TensorRT on Windows at all. I’m hoping Nvidia makes it easier soon, or at least relaxes the license so I’m not running afoul of it if I redistribute the required DLLs (for comparison, Nvidia publishes TensorRT binary libraries for Linux directly on pip, no license required.)
It’s also a pain that 11.7 is the best CUDA version for Stable Diffusion with TensorRT. I couldn’t even get 11.8, 12.0 or 12.1 to work at all on Windows with TensorRT (they work fine on their own.) On Linux, they would work, but would at best give me the same speed as regular GPU inference, and at worst would be slower, completely defeating the point.
Not going to lie, I almost gave up a few times. But I can also be stubborn… anyway, since this is apparently the first confirmation it works, it’d probably be helpful if I mention that it’s a 12GB 3060. :)