Apparently one of the lemmy.ml admins was overzealous in banning all User-Agent strings that contained the word “bot”. Bans were entered for all of the individual strings containing that word which were observed in their webserver logs, which impacted kbin’s reported agent of “kbinBot”.

The issue has been fixed, and I observed that one of my kbin posts to a lemmy.ml community was successfully pushed to the original instance.


Edit:

Here are all the links that I’ve found with the lemmy.ml admins discussing the issue:

  • blightbowOP
    link
    fedilink
    8
    edit-2
    1 year ago

    That assumes they were using an expression based filter in the webserver config itself. If they were extracting user agent strings containing the word “bot” from their webserver logs and adding them to a static list of user agents to deny (particularly if it’s an external file referenced by the config that strings can be easily dumped into), it’s a plausible explanation. I can especially see this happening if they did a blind sort by log volume and only inserted the 20 biggest results or somesuch.

    Even if this was the case, was someone in a position to observe that one of those strings contained “kbin”? Yes. Was it possible they still didn’t notice? Yes, especially if shell pipelines are involved. Was it possible for someone to notice but assume that this wasn’t the kbin software itself, but a third-party tool that someone else wrote? Also yes. Still possible that all of this is bullshit? Still yes!

    Full disclosure: I’ve worked in the webserver and webapp adjacent spaces for a long time, and I have a lot of appreciation for how much damage one person’s stupid change without peer review can do in massive production environments. :) I am admittedly biased toward applying Hanlon’s razor in these situations.

    • Deceptichum
      link
      fedilink
      41 year ago

      If they were doing that, others with bot in the name would have been caught, no?

      Yet the people who tested it said that wasn’t the case.

      • blightbowOP
        link
        fedilink
        91 year ago

        Like I said, a blind sort by volume of the top n user agents in their logs containing the word bot would be enough to do it. Drop the output of that sort into a text file or a hash table, then create a user agent filter in the nginx config that blocks the specific strings seen in that file.

        It is very much the sort of thing that a single admin can do by accident, and the exact sort of problem I would expect to see with rapidly growing instances operated by a very small number of tech enthusiasts.

      • Teppic
        link
        fedilink
        31 year ago

        From the response it is likely that many other specifically identified phrases which do contain the word ‘bot’ have indeed been blocked (presumably still are).

        The slight variations in kbinBot which were subsequently tried wouldn’t previously have shown up in the logs and so wouldn’t have been added to the blacklist.