A Hugging Face employee made a huge dataset of Bluesky posts, and it’s already very popular.

  • @vzq
    English
    37 hours ago

    Already taken down.

    • @[email protected]
      English
      17 hours ago

      For those interested:

      I’ve removed the data from this dataset since there was a lot of community pushback about its creation/uploading. I will leave the dataset repository up to allow room for discussion of how datasets can be used to help improve Bluesky and allow people to build the tools they need to build their own open models and approaches to creating feeds that work for their needs. Please feel free to continue to leave feedback in the discussions here.

      https://huggingface.co/datasets/bluesky-community/one-million-bluesky-posts