• @pixxelkick
    10 months ago

    Most LLMs have tons of NSFW data in their training sets.

    Typically, if this content needs to be blocked, a secondary layer (e.g. a RAG pipeline or a LoRA-tuned classifier) is run on top as a filtering mechanism to catch, block, and regenerate explicit responses.
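    The catch/block/regenerate loop described above can be sketched roughly like this. Everything here is illustrative: the `generate` stub, the blocklist terms, and the retry count are all hypothetical stand-ins, not any specific product's implementation.

    ```python
    import re

    # Hypothetical blocklist; a real filter would be a trained classifier,
    # not a regex, but the control flow is the same.
    BLOCKLIST = re.compile(r"\b(banned_term_a|banned_term_b)\b", re.IGNORECASE)

    def generate(prompt: str) -> str:
        # Stand-in for the base LLM call.
        return "some model output"

    def filtered_generate(prompt: str, max_retries: int = 3) -> str:
        """Catch, block, and regenerate responses that trip the filter."""
        for _ in range(max_retries):
            text = generate(prompt)
            if not BLOCKLIST.search(text):
                return text  # passed the filter
        # All retries tripped the filter: refuse instead of emitting it.
        return "[response withheld by content filter]"
    ```

    The key point is that the base model still produces the explicit text internally; the filter only decides whether it ever reaches the user.
    
    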

    On top of that, restricting the model's allowed output lexicon is a whole separate layer.

    Unfiltered LLMs, without these layers added on, are actually quite explicit and very much capable of generating extremely NSFW output by default.