• @CriticalMiss
    152 months ago

    Earlier reports suggested they trained it on books from Bibliotik.

    What changed?

    • @halcyoncmdr
      252 months ago

      Probably both, honestly.

    • @BetaDoggo_
      32 months ago

      The Llama 1 paper acknowledged the use of the books dataset, but LibGen isn’t mentioned in any of the papers, so this is new info.