A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data

@assassin_aragorn · 2 years ago

A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data

@orclev · 2 years ago

While it glosses over a lot of details it’s not fundamentally wrong in any fashion. A LLM does not in any meaningful fashion “know” anything. Training an LLM is training it on what words are used in relation to each other in different contexts. It’s like training someone to sing a song in a foreign language they don’t know. They can repeat the sounds and may even recognize when certain words often occur in proximity to each other, but that’s a far cry from actually understanding those words.

A LLM is in no way shape or form anything even remotely like a AGI. I wouldn’t even classify a LLM as AI. LLM are machine learning.

The entire point I was trying to make though is that a LLM does not store specific training data, rather what it stores is more like the hashed results of its training data. It’s a one way transform, there is absolutely no way to start at the finished model and drive it backwards to derive its training input. You could probably show from its output that it’s highly likely some specific piece of data was used to train it, but even that isn’t absolutely certain. Nor can you point at any given piece of the model and say what specific part of the training data it corresponds to or vice versa. Because of that it’s impossible to pluck out some specfic piece of data from the model. The only way to remove data from the model is to throw the model away and train a new model from the original training data with the specific data removed from it.