Searchable Database of the 183,000 Pirated Books Meta, et al., Used to Train AI
www.theatlantic.com
@[email protected] to Hacker [email protected] • English • 1 year ago
cross-posted to: [email protected]
@akrot • English • 1 year ago
For anyone interested, Books3 was part of The Pile, a dataset used to train LLMs. It used to be hosted by The Eye, but was recently removed due to a DMCA takedown. Its torrent link is still up, though.