• Lvxferre
    1 year ago

    I mentioned this in another thread, but language has additional semantic and pragmatic layers, beyond the grammatical one: language uses words to refer to concepts and concepts to convey purpose. We don’t simply string the words together and call it a day, and yet that’s exactly what those LLMs do.

    Based on that, here’s what I think is happening:

    The initial prompt made ChatGPT weight words based on two criteria: 1) being used in the same contexts as “deadly”, and 2) having five letters. The weight assigned to “lethal” for the first criterion was so big that it eclipsed any weight assigned for having five letters.
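
    To make that concrete, here is a toy sketch of the idea. The numbers, the candidate words other than “lethal”, and the `score` function are all made up for illustration; this is not how ChatGPT actually computes anything, just the "one weight eclipses the other" argument in miniature.

```python
# Hypothetical weights, chosen only to illustrate the argument.
# Criterion 1: how strongly each word is tied to the context of "deadly".
context_weight = {"lethal": 9.0, "fatal": 4.0, "toxic": 3.0}

# Criterion 2: a small bonus for having exactly five letters.
FIVE_LETTER_BONUS = 2.0


def score(word: str) -> float:
    """Combine both criteria into a single (made-up) weight."""
    bonus = FIVE_LETTER_BONUS if len(word) == 5 else 0.0
    return context_weight[word] + bonus


# Pick the word with the highest combined weight.
best = max(context_weight, key=score)
print(best, {w: score(w) for w in context_weight})
```

    With these assumed numbers the six-letter “lethal” still wins, because its context weight alone is larger than any five-letter word's weight plus the bonus.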

    The big off-topic wall of text about Christianity and Transhumanism is likely the result of the sequence of prompts exhausting ChatGPT’s options for highly weighted next tokens, so it picked a low-weight one at random (that “inj”). As a word fragment, “inj” requires more tokens to complete it. ChatGPT then did the same with <|endoftext|>, and that worsened the situation, because now you have one token demanding new tokens (“inj”) and another signalling the end of the text. The randomness only stopped when it picked the token “Transhumanism”, which prompted it to output new tokens on a rather obscure topic, in a passage that is likely being transcribed verbatim.
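
    A rough sketch of the "exhausted options" guess, under heavy assumptions: the token list, the weights, and the idea that already-used tokens are simply excluded are all hypothetical, picked only to show how a low-weight fragment can end up being chosen once the strong candidates are gone.

```python
import random

# Hypothetical next-token weights; not taken from any real model.
weights = {"fatal": 5.0, "toxic": 4.0, "inj": 0.2, "qu": 0.1}

# Assumption: earlier prompts already consumed the highly weighted options.
already_used = {"fatal", "toxic"}


def sample_next(weights, excluded, rng):
    """Sample one token, proportionally to weight, from what is left."""
    remaining = {t: w for t, w in weights.items() if t not in excluded}
    tokens = list(remaining)
    return rng.choices(tokens, weights=[remaining[t] for t in tokens], k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
picked = sample_next(weights, already_used, rng)
print(picked)  # one of the low-weight fragments
```

    Whatever gets picked here is a fragment with a tiny weight, which is the situation described above: the output then has to keep going from a token it would normally never have chosen.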

    The opposite likely happened with “vexed”: the further prompts exhausted the options for words meaning “harmful”, so it picked a word that can be loosely interpreted as related to psychological harm.

    I gave the nicknames a websearch. “Poh92” pops up sometimes, but “Nightmimists” never does, either in DDG or in the botnet (Google). I wonder where ChatGPT copied this from. Gotta love businesses harvesting data that people produced to sell it again to people, uh.


    I know that I’ll sound foul and I apologise for that. The discussion in “Hacker” News, as usual, made me facepalm. It reminds me why I’m discussing this here - the links there are usually good stuff, but the commenters show the same nasty type of idiocy as on Facebook and Reddit.