Are they? Making a non-deterministic program is actually not that easy unless one just feeds urandom into it.
The guts of an LLM are 100% deterministic. At the very last step the model outputs a probability distribution over the next token, and the exact same input will always give the exact same distribution (how sharp or flat it is can be tuned with the temperature). One token is then sampled from that distribution and fed back in.
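To make that last step concrete, here is a minimal sketch in plain Python. The scores in `logits` and the function name `next_token_distribution` are made up for illustration and are not taken from any real model or library. The point is only that turning scores into probabilities is a pure calculation, while picking a token from them is where the randomness enters.

```python
import math
import random

def next_token_distribution(logits, temperature=1.0):
    """Turn raw scores into probabilities; temperature must be > 0 here."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a 3-token vocabulary (an assumption, not real model output).
logits = [2.0, 1.0, 0.1]

# Deterministic part: same scores and same temperature give the same distribution.
first = next_token_distribution(logits, temperature=0.7)
second = next_token_distribution(logits, temperature=0.7)
same_distribution = first == second

# Random part: one token index is drawn according to those probabilities.
rng = random.Random()  # unseeded, so the pick can differ between runs
token = rng.choices(range(len(first)), weights=first, k=1)[0]
```

A lower temperature concentrates the probability on the top-scoring token and a higher one flattens the distribution, but for a fixed temperature the distribution itself never varies.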
Most people on lemmy literally have no idea what LLMs are but if you say something sounding negative about them then you get a billion upvotes.
Do I understand it correctly that the LLM’s state is changed after execution? That does sorta mean that it’s effectively non-deterministic, though probably not as severely as with an RNG plugged in (depending on the algorithm).
The only thing that changes is the data that is passed to the LLM, which for each iteration includes the last token that the LLM itself generated. So yes, sort of. The LLM itself doesn’t change state; just the data that is fed into it.
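A rough sketch of that loop, assuming a stand-in `toy_model` function that I made up (a real network is vastly bigger, but it is called the same way): the model is a pure function of the context, and the only thing that grows between iterations is the list of tokens passed in.

```python
def toy_model(context):
    """Hypothetical stand-in for the network: scores depend only on the context."""
    vocab_size = 4
    last = context[-1]
    return [float((last * 3 + i * 7 + len(context)) % 5) for i in range(vocab_size)]

def generate(prompt, steps):
    context = list(prompt)  # copy, so the caller's prompt is left untouched
    for _ in range(steps):
        logits = toy_model(context)  # the model holds no memory between calls
        # Pick the highest-scoring token here, to keep the focus on the loop itself.
        next_token = max(range(len(logits)), key=logits.__getitem__)
        context.append(next_token)  # the generated token becomes part of the next input
    return context

prompt = [1, 2]
run_one = generate(prompt, steps=5)
run_two = generate(prompt, steps=5)
```

Nothing inside `toy_model` is updated between calls, so `run_one` and `run_two` come out identical.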
It’s also unpredictable insofar as similar inputs will not necessarily give similar outputs. The only way to actually predict its output is to use the exact same input, and even then you only get identical token probability lists on the other end. Every LLM chatbot, by default, will then make a random selection based on those probabilities. It can be set to always pick the most probable token, but this can cause problems of its own, such as repetitive output.
There must be an RNG to choose the next token based on the probability distribution; that is where the non-determinism comes in [edit: unless the temperature is 0, which would make the entire process deterministic]. The neural networks themselves, though, are 100% deterministic.
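As a sketch of the temperature-0 case (the function name `sample_token` and the scores are assumptions for illustration): dividing by a temperature of 0 is not possible, so samplers commonly treat it as "just take the top-scoring token", and at that point the RNG is never consulted.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index; temperature 0 is treated as greedy (argmax) decoding."""
    if temperature == 0:
        # No randomness on this path: always the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # unnormalized softmax weights
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores
rng = random.Random()

greedy_picks = {sample_token(logits, 0, rng) for _ in range(100)}
sampled_picks = {sample_token(logits, 1.0, rng) for _ in range(200)}
```

`greedy_picks` only ever contains index 0, while `sampled_picks` will almost certainly contain several different indices.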
I understand that could be seen as an “akschually” nitpick, but I think it’s an important point, as it is at least theoretically possible to understand that underlying determinism.
Well, technically users’ input could serve as the source of randomness, if it’s used to modify the internal state. Basically, a redditor is trying to interrogate the LLM as to whether Israel is bad, while someone on line 2 is teaching the LLM “I am Cornholio”. We already know how it goes when a chatbot is learning from its users, and generally the effect could vary arbitrarily from a nothingburger to a chaos-theory mess.
I don’t think it’s typical to consider user input a source of randomness. Are you talking about in-context learning and thinking about what would happen if those contexts get crossed? If so, contexts are unique to a session and do not cross over between sessions for something like ChatGPT/Claude.
I was speaking about the user-visible behavior, which was the context of the comment I was replying to.
For an end user, yes, because they’re not going to be able to adjust temperature and seeds. So you can get different results given the same input “prompt”.
Under the hood it’s deterministic, but end users don’t have any way of tweaking that unless they set up something like ComfyUI and run this shit themselves.
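A small sketch of what controlling the seed buys you, assuming a made-up fixed distribution `PROBS` and a hypothetical `reply` function standing in for a whole generation run: with the seed pinned, the "random" picks repeat exactly, and without it each call draws fresh entropy.

```python
import random

# Hypothetical next-token probabilities, held fixed to keep the example short.
PROBS = [0.6, 0.3, 0.1]

def reply(seed=None, length=20):
    """Draw `length` token indices; seed=None means fresh entropy on every call."""
    rng = random.Random(seed)
    return rng.choices(range(len(PROBS)), weights=PROBS, k=length)

pinned_a = reply(seed=42)
pinned_b = reply(seed=42)  # same seed, same sequence of picks
unpinned = reply()         # what an end user without seed control effectively gets
```

`pinned_a` and `pinned_b` are identical, while repeated calls to `reply()` will generally differ from one another.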