Threativore: An anti-spam automoderator for Lemmy

db0 · 9 months ago

Threativore: An anti-spam automoderator for Lemmy

FaceDeer · 9 months ago

Another more general property that might be worth looking for would be substantially similar posts that get cross-posted to a wide variety of communities in a short period of time. That’s a pattern that can have legitimate reasons but it’s probably worth raising a flag to draw extra scrutiny.
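
A rough sketch of that cross-posting check, assuming each post has already been reduced to some content key (for example a hash of its normalized text). The class name, the thresholds, and the key itself are hypothetical and not taken from Threativore; timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class CrossPostTracker:
    """Flags a content key seen in many distinct communities within a time window."""

    def __init__(self, window_seconds=600, min_communities=5):
        # Both thresholds are illustrative guesses, not tuned values.
        self.window_seconds = window_seconds
        self.min_communities = min_communities
        self._seen = defaultdict(deque)  # key -> deque of (timestamp, community)

    def observe(self, key, community, timestamp):
        """Record one post; return True if the key now looks like a cross-post burst."""
        events = self._seen[key]
        events.append((timestamp, community))
        # Drop sightings that have fallen out of the window.
        while events and timestamp - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_communities = {c for _, c in events}
        return len(distinct_communities) >= self.min_communities
```

A True result here would only raise a flag for a human or a later check to look at, since the pattern can be legitimate.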

One idea for making it computationally lightweight but also robust against bots “tweaking” the wording of each post might be to fingerprint each post based on rare word usage. Spam is likely to mention the brand name of whatever product it’s hawking, which is probably not going to be a commonly used word. So if a bunch of posts come along that all use the same rare words all at once, that’s suspicious. I could also easily see situations where this gives false positives, of course - if some product suddenly does something newsworthy you could see a spew of legitimate posts about it in a variety of communities. But no automated spam checker is perfect.
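
The rare-word fingerprint could look something like the sketch below. It assumes a background word count built from ordinary posts is available; the function names, the cutoffs, and the brand name "Zorbex" are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rare_word_fingerprint(text, background_counts, max_background_count=2, size=5):
    """Return up to `size` words from `text` that are rare in the background corpus."""
    words = set(tokenize(text))
    rare = [w for w in words if background_counts.get(w, 0) <= max_background_count]
    # Rarest first, ties broken alphabetically so the result is deterministic.
    rare.sort(key=lambda w: (background_counts.get(w, 0), w))
    return frozenset(rare[:size])


# Hypothetical background corpus of "normal" wording.
background = Counter(tokenize("buy the best deal on this product today " * 10))

fp_a = rare_word_fingerprint("Buy Zorbex today, the best deal!", background)
fp_b = rare_word_fingerprint("Get the best product deal: Zorbex", background)
shared = fp_a & fp_b  # the two reworded posts still share the brand name
```

Two posts whose fingerprints overlap are candidates for a closer look, not confirmed spam, for the false-positive reasons mentioned above.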

db0 · edit-2 9 months ago

Feel free to submit a PR for these ideas. For post similarity, machine learning techniques can be used to calculate the “distance” between two posts, but I don’t know whether that would stay workable computation-wise as the amount of spam increases, especially if spammers start using their own generative AI engines.

FaceDeer · 9 months ago

That’s why I was suggesting such a simple approach: it doesn’t require AI or machine learning except in the most basic sense. If you want to try applying fancier stuff, you could use those basic word-based filters as a first pass to reduce the cost.
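
As a sketch of that first-pass idea: only pairs that share a rare word reach the costly comparison. Here `difflib.SequenceMatcher` from the Python standard library stands in for whatever fancier model might be used; the function names, the fingerprint sets, and the threshold are assumptions.

```python
from difflib import SequenceMatcher


def cheap_overlap(fp_a, fp_b, min_shared=1):
    """First pass: do the two rare-word fingerprints share enough words?"""
    return len(fp_a & fp_b) >= min_shared


def expensive_similarity(text_a, text_b):
    """Second pass: stand-in for a costlier similarity or ML distance model."""
    return SequenceMatcher(None, text_a, text_b).ratio()


def find_similar(new_post, candidates, threshold=0.6):
    """Posts are (text, fingerprint) tuples. Returns (matching texts, expensive calls made)."""
    new_text, new_fp = new_post
    matches = []
    expensive_calls = 0
    for text, fp in candidates:
        if not cheap_overlap(new_fp, fp):
            continue  # filtered out cheaply, never reaches the expensive step
        expensive_calls += 1
        if expensive_similarity(new_text, text) >= threshold:
            matches.append(text)
    return matches, expensive_calls
```

The returned call count is only there to show how much work the cheap filter saves.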

db0 · 9 months ago

There are likely a lot of anti-spam tactics we can employ. I hope people will help improve it.

@GlitterInfection · 9 months ago

Honestly, my dream Lemmy client would combine posts in my Home and All feeds based solely on the links in the posts, regardless of community or instance, and it would then provide UX to present the rest of the information if I choose to click into it.

Lemmy is designed around a concept that almost requires but definitely invites spamming links. Assuming you have good intentions and want to reach a wider federated audience, you would post your link to a few instances at once.
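
Combining posts by link, as described above, could be sketched like this. The post dictionaries and their `url` and `community` fields are hypothetical and not a real Lemmy API shape; the normalization is deliberately minimal.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def group_by_link(posts):
    """Group posts from any community or instance under their normalized link."""
    groups = defaultdict(list)
    for post in posts:
        groups[normalize_url(post["url"])].append(post)
    return dict(groups)


posts = [
    {"url": "https://Example.com/story/", "community": "news@instance-a"},
    {"url": "https://example.com/story#comments", "community": "tech@instance-b"},
    {"url": "https://example.com/other", "community": "news@instance-a"},
]
grouped = group_by_link(posts)
```

A client could show one entry per group and list the communities underneath; the same grouping would also show a moderation tool how widely a single link has been posted.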

GitHub - db0/threativore: A Thrediverse bot fight against spam