Looks like it works.
Edit: I still see some performance issues. Needs more troubleshooting.
Update: Registrations re-opened. We encountered a bug where people could not log in, see https://github.com/LemmyNet/lemmy/issues/3422#issuecomment-1616112264 . As a workaround we opened registrations.
Thanks
First of all, I would like to thank the Lemmy.world team and the 2 admins of other servers @[email protected] and @[email protected] for their help! We did some thorough troubleshooting to get this working!
The upgrade
The upgrade itself isn’t too hard. Create a backup, then change the image names in the docker-compose.yml and restart.
But, like the first 2 tries, after a few minutes the site started getting slow until it stopped responding. Then the troubleshooting started.
The solutions
What I had noticed previously is that the lemmy container could reach around 1500% CPU usage; above that, the site got slow. That is weird, because the server has 64 threads, so 6400% should be the max. So we tried what @[email protected] had suggested before: we created extra lemmy containers (and extra lemmy-ui containers) to spread the load, and used nginx to load balance between them.
Et voilà. That seems to work.
Also, as he suggested, we start the lemmy containers with the scheduler disabled, and run 1 extra lemmy container with the scheduler enabled that isn’t used for anything else.
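For anyone wanting to try the same approach, here is a rough sketch of what the nginx side of this could look like. It is not the actual lemmy.world config. The container names (lemmy-1, lemmy-ui-1, ...), the number of containers, the server_name and the path list are all made up for illustration, and the ports are assumed to be the usual defaults (8536 for lemmy, 1234 for lemmy-ui). A real Lemmy setup also routes some requests based on the Accept header and request method, which is left out here.

```nginx
# Hypothetical sketch only: names, counts and ports are assumptions.
# Goes inside the http { } context of nginx.conf.

# Pool of lemmy backend containers (the ones started with the scheduler disabled).
# The extra container that runs the scheduler is deliberately NOT listed here,
# so it receives no web traffic.
upstream lemmy-backend {
    server lemmy-1:8536;
    server lemmy-2:8536;
    server lemmy-3:8536;
}

# Pool of lemmy-ui containers.
upstream lemmy-frontend {
    server lemmy-ui-1:1234;
    server lemmy-ui-2:1234;
}

server {
    listen 80;
    server_name example.org;  # placeholder domain

    # API-style paths go to the backend pool.
    location ~ ^/(api|pictrs|feeds|nodeinfo|\.well-known) {
        proxy_pass http://lemmy-backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Everything else goes to the UI pool.
    location / {
        proxy_pass http://lemmy-frontend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

With no balancing method specified, nginx distributes requests across the servers in an upstream block round-robin, which is enough to spread the load over several containers.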
There will be room for improvement, and probably new bugs, but we’re very happy lemmy.world is now at 0.18.1-rc. This fixes a lot of bugs.
Thanks. Made a one-time 50€ contribution and started monthly payments as well. Hopefully this can help fund a server upgrade.
Not gonna lie, I have concerns that without hardware upgrades this instance may be too big to upgrade going forward, if it isn’t already. Just what I’m seeing. I created an account on another instance (that I’m posting from) and I might just stay on this other instance. That said, I do appreciate the work put into this one.
There is a lot of work being put into optimizing the Lemmy backend. There is a LOT of low hanging fruit in regards to performance gains from database bottlenecks.
Lemmy was an obscure platform with only ~1k total active users last month, it’s going to take time for the maintainers to get their bearings, the developer community to organize, and for everyone to figure out how to maintain and scale these operations at the number of users we’re seeing today and going forward.
The upgrade to 0.18.1 alone brings with it major performance improvements and there is more to come. We can get a lot more mileage out of the hardware we’re using today; it’s just that the platform blew up in scale before all of the bottlenecks could be identified and worked out.
Would’ve been nice if Reddit didn’t shit itself as quickly and as explosively as it did
That’s what federation is for. I’m not going to jump from instance to instance, though. I’m here and I think I’ll stay here for the time being. Servers need to be paid anyway, no matter which instance it’s running on.
Please also consider making a donation to the maintainers! They’re the ones keeping these ships afloat by implementing new features and optimizations, while also coordinating and steering the overall direction of the open source project.
They were previously fully funded under a grant from NLNet, but that hinged on them being able to meet certain milestones by specific deadlines. Now that the project has blown up overnight they’re spending a LOT more time doing project management and addressing the needs of scaling, and are unable to meet the grant criteria.