• @kn33
    215 hours ago

    Am I misunderstanding your comment, or does it completely ignore context windows? Not that context windows are long-term memory, but they're not zero either.

    • @brucethemoose
      15 hours ago

      The context window is indeed the LLM’s memory.

      …But it’s also muddy.

      Many LLMs get ‘dumber’ and less attentive as their context windows fill up, and OpenAI’s models just happen to be among them. They’re awful close to the full 128K, even the full GPT-4. Mistral models are also really bad at long-context understanding while, conversely, I find that Google Gemini and Qwen 2.5 are really good close to their limits.

      There are attempts to measure this performance objectively, like: https://github.com/NVIDIA/RULER
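
      If you want to poke at this yourself, here's a rough needle-in-a-haystack sketch. To be clear, this is not RULER's code, just a toy version of the general idea: bury a fact at different depths in filler text of different lengths and check whether the model can still repeat it. `ask_model` is a hypothetical callable (prompt in, reply out) that you'd wire up to whatever API you use, and I'm counting characters rather than tokens to keep it simple. `forgetful_model` is a made-up stand-in that only "sees" the tail of the prompt.

```python
import random

FILLER = "The grass is green. The sky is blue. The sun is yellow. "
QUESTION = "\nWhat is the secret number?"


def build_prompt(needle: str, total_chars: int, depth: float) -> str:
    """Bury `needle` at relative `depth` (0.0 = start, 1.0 = end) of filler text."""
    repeats = total_chars // len(FILLER) + 1
    haystack = (FILLER * repeats)[:total_chars]
    cut = int(len(haystack) * depth)
    return haystack[:cut] + " " + needle + " " + haystack[cut:] + QUESTION


def score(ask_model, lengths, depths, seed=0):
    """Return {context_length: fraction of needles recovered} for `ask_model`."""
    rng = random.Random(seed)
    results = {}
    for n in lengths:
        hits = 0
        for d in depths:
            secret = str(rng.randint(100000, 999999))
            prompt = build_prompt(f"The secret number is {secret}.", n, d)
            # Count a hit if the reply contains the secret anywhere.
            hits += secret in ask_model(prompt)
        results[n] = hits / len(depths)
    return results


def forgetful_model(prompt: str) -> str:
    """Hypothetical stand-in: echoes only the last 2000 characters it was given."""
    return prompt[-2000:]


if __name__ == "__main__":
    print(score(forgetful_model, lengths=[1000, 8000], depths=[0.0, 0.5, 1.0]))
```

      With the fake model, the short context recovers every needle and the long one only recovers the needle sitting near the end. A real model won't fail that cleanly, but plotting the score against length and depth is how you'd see the 'muddy' part.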