• @Serinus
    10 months ago

    the training data is just a statistical record of human bias.

    It’s not. It’s a record of online conversations, which tend to be more polarized and extreme than the way people talk in real life.

    • @[email protected]
      10 months ago

      That’s why I said

      So as long as the training data is well selected for your problem…

      It’s clear that 4chan, Reddit, etc. are over-represented in the training data for LLMs, which explains why ChatGPT might be more awful than an average person. Having an LLM decide on, e.g., college admissions would be like having a Twitter poll decide who its next CEO should be. Like that’s obviously stupid, nobody would ever do that, right?

      The problem is that in the college admission example, the models were trained on previous admission decisions made by college employees, and these models are still biased.