Federation troubleshooting

Ruud · 2 years ago


phiresky · edit-2 2 years ago

Note that with 0.18 the meaning of federation_workers has changed massively due to the improved queue. Whatever value works well on 0.17 is not necessarily good on 0.18.

On 0.18, it probably makes sense to set it somewhere between 100 and 10’000. Setting it to 0 is also an option (unlimited, that’s the default). Anything much higher than 10’000 is probably a bad idea.

On 0.18, retry tasks are also split into a separate queue which should improve things in general.

Setting it to 0 might have perf issues, since every federation task is its own task with the same scheduling priority as any other async task (like UI / user API requests). So if 10k federation tasks and 100 API requests are running, tokio will schedule the API requests with probability 100 / (10k + 100), which is roughly 1% (if everything is CPU-limited). (I think; I'm not 100% sure how tokio scheduling works.)
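
To put numbers on that, here is a rough back-of-the-envelope sketch in Python. It only encodes the simplified model described above (every runnable task is equally likely to be picked next, and everything is CPU-limited); it is not a description of how tokio actually schedules work. The function name `api_share` is made up for illustration.

```python
def api_share(api_tasks: int, federation_tasks: int) -> float:
    """Fraction of scheduling slots that go to API requests.

    Assumption: all runnable tasks have equal priority and are picked
    uniformly, which is a simplification of a real async scheduler.
    """
    total = api_tasks + federation_tasks
    if total == 0:
        return 0.0
    return api_tasks / total


# Unlimited workers: 10k federation tasks compete with 100 API requests.
unlimited = api_share(api_tasks=100, federation_tasks=10_000)

# Worker limit of 100: at most 100 federation tasks run at the same time.
capped = api_share(api_tasks=100, federation_tasks=100)

print(f"unlimited: {unlimited:.2%}")  # unlimited: 0.99%
print(f"capped:    {capped:.2%}")     # capped:    50.00%
```

Under this model, a limit near the low end of the suggested range leaves API requests a much larger share of CPU time than the unlimited setting does when the federation queue is backed up.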