This is an idea I’ve had for a while now. Please let me know your thoughts on this feature in the comments.
This is my own feature request. The more I use Lemmy, the more I find myself ignoring posts I’ve already read on a separate instance. In theory, it doesn’t sound like a bad idea to check out the same post from different instances since they have different comments; in practice, however, this is rarely useful. I think a feature that remembers the last ~100 posts you’ve read/hidden and then automatically filters out any newly loaded posts with the same content would be cool.
Specifics:
- Create a “cache” that remembers the last 100-1000 posts that are either read or hidden. This cache is per post feed. It can remember posts based on the title + maybe the first 100 characters of the post body + url (but not image urls, since these can differ based on which instance the image is uploaded to). We can exclude any post with a title shorter than 25 characters and no body.
- For each new post loaded, check against the cache. If the post hits the cache, then that post is automatically marked as read. If “hide read posts” is active, the post is then automatically hidden as well.
- When a post is read/hidden, all loaded but unread posts are checked to see if any match the read/hidden post. If they do, they are marked as read.
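To make the three bullets above concrete, here is a rough sketch in Python. It is not the app's actual code. The names `Post`, `content_key`, `ReadCache`, `on_post_loaded` and `on_post_read` are all made up for illustration, and I'm assuming the post exposes a `link_url` field that is empty for image posts, and that whitespace is trimmed before comparing.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Optional

MIN_TITLE_LEN = 25     # from the proposal: skip short titles with no body
BODY_PREFIX_LEN = 100  # from the proposal: first 100 characters of the body


@dataclass
class Post:  # hypothetical shape, not Lemmy's real post model
    title: str
    body: str = ""
    link_url: Optional[str] = None  # assumed to be None for image posts
    read: bool = False


def content_key(post):
    """Return a key for matching duplicates, or None if the post is excluded."""
    title = post.title.strip()
    body = post.body.strip()
    if len(title) < MIN_TITLE_LEN and not body:
        return None
    return (title, body[:BODY_PREFIX_LEN], post.link_url or "")


class ReadCache:
    """Remembers the keys of the most recent read/hidden posts."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self._keys = OrderedDict()  # oldest entry first

    def remember(self, post):
        key = content_key(post)
        if key is None:
            return
        self._keys[key] = True
        self._keys.move_to_end(key)
        while len(self._keys) > self.capacity:
            self._keys.popitem(last=False)  # drop the oldest entry

    def matches(self, post):
        key = content_key(post)
        return key is not None and key in self._keys


def on_post_loaded(cache, post):
    """Second bullet: a newly loaded post that hits the cache is marked read."""
    if cache.matches(post):
        post.read = True


def on_post_read(cache, post, loaded_posts):
    """Third bullet: mark matching loaded, unread posts as read too."""
    post.read = True
    cache.remember(post)
    key = content_key(post)
    if key is None:
        return
    for other in loaded_posts:
        if not other.read and content_key(other) == key:
            other.read = True
```

Hiding is left out on purpose: if “hide read posts” is on, the existing read-post filtering would take care of anything marked read here.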
Possible things to test/include:
- Have the “cache” be based on time. E.g. only things read/hidden within the last 24 hours are checked again. This could be a setting.
- Make the cache global. E.g. check across post feeds. This could be a setting.
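Both options could look roughly like the sketch below. Again this is only an illustration in Python: `TimedReadCache` and `CacheRegistry` are hypothetical names, the key can be any hashable value that identifies a post's content, and I'm assuming the clock never goes backwards.

```python
import time
from collections import OrderedDict

DAY_SECONDS = 24 * 60 * 60


class TimedReadCache:
    """Forgets entries older than max_age_seconds instead of counting them."""

    def __init__(self, max_age_seconds=DAY_SECONDS, clock=time.time):
        self.max_age_seconds = max_age_seconds
        self._clock = clock          # injectable so it can be tested
        self._seen = OrderedDict()   # key -> timestamp, oldest first

    def _evict_expired(self):
        cutoff = self._clock() - self.max_age_seconds
        while self._seen:
            key, stamp = next(iter(self._seen.items()))
            if stamp >= cutoff:
                break
            self._seen.popitem(last=False)

    def remember(self, key):
        self._seen[key] = self._clock()
        self._seen.move_to_end(key)  # keep entries ordered by last-seen time

    def matches(self, key):
        self._evict_expired()
        return key in self._seen


class CacheRegistry:
    """Hands out one cache per feed, or a single shared one if use_global is set."""

    _GLOBAL = "__global__"

    def __init__(self, use_global=False, max_age_seconds=DAY_SECONDS):
        self.use_global = use_global
        self.max_age_seconds = max_age_seconds
        self._caches = {}

    def for_feed(self, feed_id):
        name = self._GLOBAL if self.use_global else feed_id
        if name not in self._caches:
            self._caches[name] = TimedReadCache(self.max_age_seconds)
        return self._caches[name]
```

With `use_global=False` every feed gets its own cache, which matches the per-feed behaviour described under Specifics; flipping the flag makes all feeds share one.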
The weird results will be there if you limit the “cache”, meaning that previously hidden posts will pop into existence.
I like the overall idea, just not how you think it should/could be implemented.
I think I understand the issue you think will occur; however, the app works differently from what you think. I drew a diagram to illustrate how the app currently determines which posts to show you:
Essentially each “raw post” from the server is processed once by a pipeline before it’s added to the list of posts. This feature will be implemented within the post processing pipeline. So just because a post is no longer in the “cache” does not mean it will magically appear in the final list of posts as it was already processed. It will not be processed again.
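In code terms, the point is roughly the following. This is a simplified Python sketch and not the app's real pipeline: `Feed`, `ingest`, `recent_keys` and the use of post ids to track what was already processed are all assumptions made for illustration, and matching is done on the title alone to keep it short.

```python
class Feed:
    """Toy model: each raw post goes through the pipeline exactly once."""

    def __init__(self, hide_read=True):
        self.hide_read = hide_read
        self.recent_keys = set()      # stand-in for the read/hidden "cache"
        self.visible = []             # the final list of posts shown
        self._processed_ids = set()   # assumed way of tracking processed posts

    def ingest(self, raw_posts):
        for raw in raw_posts:
            if raw["id"] in self._processed_ids:
                continue              # already processed, never processed again
            self._processed_ids.add(raw["id"])
            is_duplicate = raw["title"] in self.recent_keys
            if is_duplicate and self.hide_read:
                continue              # decided once, at processing time
            self.visible.append(raw)


feed = Feed(hide_read=True)
feed.recent_keys.add("Same story")
feed.ingest([{"id": 1, "title": "Same story"}, {"id": 2, "title": "Other story"}])

# Evicting the entry later does not touch the list that was already built.
feed.recent_keys.clear()
print([p["id"] for p in feed.visible])
```

The filtered post does not come back after the cache entry is gone, because the decision was made when the post passed through the pipeline and nothing re-runs it.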
I don’t know how posts are “gotten”, but if they are only served once it will work, I guess!
I would have a tag, “has_been_read” or something. OTOH, memory is cheap, slap a 10.000 limit if needed ;-)
Are you going to implement this, or is it more of an idea you’re floating?
I’m pretty sure I’ll implement it. It’s going to be optional of course (likely default off).
If you do, I’ll try it out! Good luck!