Hello everyone,

Recently I have returned to managing a kubernetes cluster in my homelab with Ansible on RHEL distros. Since I haven’t touched to the installation stages since quite a long time I started to look for tutorials from the base installation to the cni configuration, MetalLB setup and metrics server installation.

In every single tutorial, I have seen major issues that made me pull my hair:

  • First and the worst, most tutorials obviously have the firewall disabled or tells you to deactivate it. Just. No. I know deactivating it makes everything much easier and many issues disappear as soon as you run a systemctl stop firewalld. But if you want to teach correcty, you wouldn’t recommend something that would make you fired on the spot.

  • CNI installations are straight forward but miss important information for troubleshooting. Stuff like putting flannel interfaces in the internal zone or adding some direct forwarding rules to firewalld can be necessary but again, everyone and their mothers have their firewall off so they never talk about it.

  • In MetalLB, the configMap used by the speakers is not created automatically by the official manifest. Missing it is impossible as the speaker straight up do not start and the logs are straightforward. Yet I have never seen one tutorial mention it.

  • Again in metalLB, if the controller is on a worker node, webhooks are not accessible and you cannot configure the load balancer. It’s rare-ish and easy to fix but again, never seen any mention of that

  • While Flannel, MetalLB, Weave, … clearly state which ports you need to open for their solutions, tutorials never do (firewall? Someone?)

  • The metrics server has some … Particularities (like the need to modify the startup arguments or the dnsPolicy). Those are easily found in the github issues due to how frequent they’re but I can never seem to find a tutorial mentionning those extra configuration to do.

  • Various basic stuff like a worker node + a cni being needed for coreDNS and the master node to become ready. Or how to verify your deployment of ingress/cni/metalLB is working correctly. If you are familiar with Kubernetes, it’s not too hard to find the solution to those but when most of your audience, it should be explicit to at least share a random nginx manifest to test if everything is good.

This is mainly a rant because it is crazy to see that a tutorial that is supposed to explain the documentation but faster is utterly useless because of course, you won’t get any forwarding issues between interfaces if your device is an open bar.

And that most of them are like this.

So to everyone who also tried to follow tutorials for the set up of their clusterw what was your experience with them? Were they also useless or did you find a gem that didn’t simply copy pasted the documentation and took screenshots of an working cluster setup without trying their guide?

  • @bigredgiraffe
    link
    English
    5
    edit-2
    1 year ago

    I have run into this issue a lot, I have always found that most of the tutorials set things up in isolation and never talk about integration points or how to build a whole solution.

    On the MetalLB configmap point, that’s another issue I have run into. In the earlier days of metallb it was configured differently and the configmap was automatically created but that has since changed, took me a bit to figure out when that changed as their docs aren’t explicit if I remember correctly. Annoying either way.

    I think the reason most tutorials turn off the firewall is in a well configured cloud environment like AWS the host firewall is redundant due to security groups and that is what everyone targets the tutorials for unfortunately and they never explain that even with “disable this if you have other mitigating controls in place” or something.

    I have also wondered if we have finally reached the era where the majority of content creators and consumers have never touched an on-prem network and don’t even think about that lens anymore, another good example of this is trying to configure MetalLB in a host with multiple interface that don’t have the same networks available (you know, like using dedicated interfaces for storage like you should), for a long time it just wasn’t possible and metallb would announce all networks on all interfaces which made it basically not functional heh. Whatever the reason is, you are not alone in being annoyed :D

    Anyway, these are great points, I have been pondering writing up a larger set of tutorial about my setup since it’s more similar to a small enterprise anymore, I should get on that hah.