• @j4k3 (OP)
    10 months ago

    There is a limit to how many vectors can be effectively passed from the embedding stage (or whatever the initial attention layer is called) before they go through the actual neural path for the layer. It is why models are constantly summarizing when too much complexity is introduced. It has been too long since I read the details to talk about it accurately. I know enough to make use of it in practice: spotting when complexity is being dropped through summarization. I also know how difficult it is to say conclusively whether such patterns are reply styles or actual limitations. I have pushed these boundaries very hard from many angles using unrelated prompts with several 70B and 8×7B models I run on my own hardware. That is still just speculative opinion of no external value, but I don’t care. I can make useful outputs and adjust using these principles.
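    The commenter doesn’t spell out the mechanism, but the fixed-size grid they seem to be gesturing at is the attention matrix itself: it only spans the context window, so anything beyond that window has to be compressed (summarized) or dropped. A minimal sketch in numpy, with toy sizes chosen purely for illustration:

```python
# Minimal scaled dot-product attention sketch (toy sizes, illustrative only).
# The attention weight matrix is a fixed (seq_len x seq_len) grid: tokens
# outside that window simply have no row or column, which is one way to see
# why extra complexity must be compressed into the window or lost.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 8, 16                                   # toy "context window"
Q = rng.standard_normal((seq_len, d))
K = rng.standard_normal((seq_len, d))
V = rng.standard_normal((seq_len, d))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.shape)                            # (8, 16) (8, 8)
```

    The point of the sketch is only the shapes: `w` is seq_len × seq_len no matter how much information the prompt tries to cram in.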

    Talking about bots for businesses is not remotely relevant. Those are RAG systems, heavily trained for a very specifically limited task. If you try to break out of one of these and it is not caught by a model loader filter, it will not handle complex thought well either. I have yet to find a model that can handle a topic, introspection, and meta analysis within a few sentences on a random subject. A two dimensional reply such as this one, sure. However, my bad grammar mix is harder to replicate. Small models in the 7B–13B range are very style and subject dependent. Those can be run on cheap hardware. If you want to train an 8×7B, you need a very expensive setup, and one dedicated just to your troll bot. I’m sure there are people with more than enough funds, but the life opportunities of such wealth negate most reasons someone would buy the hardware and burn the power bill money required to do this in practice. It isn’t just next word prediction under the surface. It is next word under what context, and there is a hard limit to the number of those contexts in a given space of relationships between vectors. It has to do with the major innovation of transformers and rotational spaces, IIRC.
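    The “rotational spaces” remark most plausibly refers to rotary position embeddings (RoPE), used in models like Llama: each pair of dimensions in a query or key vector is rotated by an angle proportional to the token’s position, so attention scores end up depending on relative position. A minimal sketch, assuming a toy even-dimensional input (the function name and sizes are illustrative, not from any particular implementation):

```python
# Minimal rotary position embedding (RoPE) sketch, illustrative only.
import numpy as np

def rope(x, base=10000.0):
    """Rotate each (x1, x2) dimension pair of x (shape: seq_len x d, d even)
    by an angle proportional to the token position."""
    seq_len, d = x.shape
    half = d // 2
    freqs = base ** (-np.arange(half) / half)        # per-pair frequency
    angles = np.outer(np.arange(seq_len), freqs)     # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

x = np.random.default_rng(1).standard_normal((6, 8))
y = rope(x)
print(y.shape)                                       # (6, 8)
```

    Because each pair is just rotated, vector norms are preserved, and position 0 (angle 0) is left unchanged; the useful property downstream is that dot products between rotated queries and keys depend on their relative offset.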

    • @[email protected]
      10 months ago

      You keep saying “troll bot” as if there aren’t commercial incentives to run these bots on forums.