YSK: Your Lemmy activities (e.g. downvotes) are far from private

Muddybulldog · edit-2 2 years ago

@[email protected] · 2 years ago

Obviously, this isn’t ideal. But this isn’t as damning as some of the other commenters believe.

The way Reddit operates is that they are “trusted” with all our data. They can (and do) sell any data they like, to whomever they like. They store much more information than simply who upvoted what. They can’t simply allow upvotes with no claimant; they’d have no way of stopping or identifying bots or illegitimate upvotes.

This system is not ideal, but it’s also not necessarily worse. We’re still operating under that system; the only real difference is that we get to choose who that trusted party is. We get to move instances if the host’s interests become misaligned with our own.

Ultimately, there needs to be a smart solution to this problem to ensure it’s not abused. We can’t completely remove collection of the data, otherwise upvotes will be meaningless and hijacked by agendas. We can’t simply encrypt the data either: if there’s a genuine use for it (which we’ve discussed), who SHOULD be allowed to decrypt it?

I completely understand the concern, and I share it. But this isn’t an issue so much with Lemmy, it’s an issue with upvotes on distributed social media.

Edit: Okay, ANY instance admin is where the issue lies. That much I agree with.

@ScaNtuRd · 2 years ago

There’s a huge difference between Reddit keeping our data “locked away” on their private server vs. a system that puts it all out in public view. You can bet your behind that Big Tech and governments are harvesting ALL of it as we speak. This is MUCH worse than Reddit just selling some data to a few third party actors.

@[email protected] · edit-2 2 years ago

I completely agree that sharing it with other instances is a problem.

You can bet your behind that Big Tech and governments are harvesting ALL of it as we speak.

This is super nitpicky, but assuming it exposed even a minute amount of the data that Reddit freely ships to whoever buys it (including governments), I actually think it’s far less likely to be seen. Social media companies are well known to freely give access to anything law enforcement, governments or advertisers would like. Most, if not all, have exposed APIs which allow law enforcement, at least, to collect almost any data at their leisure. This data is packaged up by the orgs who have the data.

Scraping Lemmy for this information would require their own solutions, and backends to handle all the data. Here in the UK, our technically inept government famously broke their multi-billion COVID test-and-trace system because the Excel spreadsheet they used as a database ran out of lines…

Even assuming it’s true that all of these groups have bothered to make their own solutions and bought server space to store the data themselves for a relatively tiny platform (certainly until very recently), the only data they get is who liked what post/comment.

That is a small snowflake compared to the iceberg that other social media organizations collect, package and sell. Facebook, for example, collects enough data that they earn more per user than Netflix.

Certainly, as Lemmy and ActivityPub gain more traction, this is a privacy hole which deserves some consideration, and should be immediately plugged. But I just don’t think it’s in the same solar system as exposing data to any social media site.

@[email protected] · 2 years ago

I think that the best solution is probably “best practices” and defederation used to enforce some sort of minimal Code of Conduct wrt the actual mechanics of running an instance.

Otherwise, the only other way I could see to address this is to lump some data at the instance level. I.e. each instance simply reports a total of upvotes and downvotes from its own users, and you just have to trust the instances to behave. There might be some checks to make sure the vote totals are plausible.
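
A rough sketch of what that instance-level reporting could look like, purely for illustration. Nothing here is an actual Lemmy or ActivityPub API: `InstanceVoteReport`, `is_plausible`, `tally` and the `active_users` field are all made-up names, and the plausibility rule (an instance can't report more votes on a post than it has active users) is just one assumed check among many possible ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceVoteReport:
    """Hypothetical per-instance summary for one post; no voter is named."""
    instance: str
    upvotes: int
    downvotes: int
    active_users: int  # assumed to be known or estimated by the receiver


def is_plausible(report: InstanceVoteReport) -> bool:
    """Reject counts that are negative or exceed the instance's user base."""
    if report.upvotes < 0 or report.downvotes < 0:
        return False
    # Each user can cast at most one vote on a given post.
    return report.upvotes + report.downvotes <= report.active_users


def tally(reports: list[InstanceVoteReport]) -> tuple[int, int]:
    """Sum only the plausible reports into (upvotes, downvotes)."""
    accepted = [r for r in reports if is_plausible(r)]
    return (sum(r.upvotes for r in accepted),
            sum(r.downvotes for r in accepted))


reports = [
    InstanceVoteReport("a.example", upvotes=12, downvotes=3, active_users=200),
    # 5000 upvotes from a 40-user instance fails the assumed check.
    InstanceVoteReport("b.example", upvotes=5000, downvotes=0, active_users=40),
]
print(tally(reports))
```

The receiving side only ever sees totals, so individual votes stay on the home instance. The trade-off is the one already mentioned: you have to trust each instance's numbers, and a check like this only catches the crudest inflation.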

FreeFacts · 2 years ago

I think that the best solution is probably “best practices” and defederation used to enforce some sort of minimal Code of Conduct wrt the actual mechanics of running an instance.

In reality, this will be the end of small instances. The only feasible way to enforce this is federation whitelists, and it will be very hard to get whitelisted. Not necessarily a bad thing in the big scheme of things when we weigh the positives and negatives, but it still sucks for anyone “self hosting” an instance.

@[email protected] · 2 years ago

True. Any random unverified instance could be set up just to harvest data from the Fediverse.