cross-posted to: https://lemmy.world/post/2499861

As I said, I made a lossy reformat of the database and a lossless one for 6.0 GiB (6,477,905,920 bytes), compared to ~26 GiB from Reddit, where fields seem almost intentionally anti-compressed to take up more room.

If there is somewhere I can host it, let me know.

Also, I couldn’t figure this out: do SQLite databases store any information on the creator or editor of a document?
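
For anyone who wants to check their own copy: as far as I know, the SQLite file header has no author or editor field. The closest things it records are the user version, the application ID, and the version number of the SQLite library that last wrote the file. A minimal Python sketch that reads those three fields straight from the 100-byte header (the function name `describe_sqlite_file` is made up for illustration, not part of any library):

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"

def describe_sqlite_file(path):
    """Read the identifying fields from a SQLite 3 file header.

    Offsets follow the documented SQLite file format:
    60 = user version, 68 = application ID, 96 = SQLite version number
    of the library that most recently modified the file.
    All three are 4-byte big-endian integers.
    """
    with open(path, "rb") as f:
        header = f.read(100)
    if len(header) < 100 or header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database file")
    (user_version,) = struct.unpack(">I", header[60:64])
    (application_id,) = struct.unpack(">I", header[68:72])
    (sqlite_version_number,) = struct.unpack(">I", header[96:100])
    return {
        "user_version": user_version,
        "application_id": application_id,
        "sqlite_version_number": sqlite_version_number,
    }
```

None of these fields name a person or a machine. Filesystem metadata (owner, timestamps) is outside the file and would not travel with an upload.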

why it's lossy

It’s missing a large table of base64 urandom data, which is technically required to recreate the document fully.

    • Hello Hotel (OP) · 8 upvotes · 2 years ago (edited)

      Thanks, how do I crosspost/move this one?

      • Skyhighatrist@lemmy.ca · 15 upvotes · 2 years ago

        Using the web-ui, on this post there is an icon made up of two squares. It’s right next to the star for saving the post. That’s the cross post button.

  • inspxtr · 22 upvotes, 1 downvote · 2 years ago (edited)

    Here are a few options that I’ve seen but never actually used.

    Your data don’t seem to be massive compared to the types of data people store on there. So I don’t think it’s gonna be an issue. Plus, if you deposit your data in 1 archivist place + 1 research place, the data may be used by more people. Don’t forget about licenses btw.

    EDIT: added https://socialmediaarchive.org/ to the list, just found out about that.

  • fiat_lux@kbin.social · 3 upvotes · 2 years ago

    Is this derived directly from the data reddit stored/created or is it a reconstruction of some kind from observing the r/place output? I’m tempted to look at the table structures but not tempted enough to download 4 gigs of it just yet.

    • Hello Hotel (OP) · 5 upvotes, 1 downvote · 2 years ago (edited)

      Rebuilt from Reddit’s official sources. Still messing with optimizations; is adding a color definitions table worth it?

      Edit: YES, only 32 unique colors ever.
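
A rough sketch of what a color definitions table could look like, in Python with the standard `sqlite3` module. The table and column names (`color`, `placement`, `ts`, `x`, `y`, `color_id`) and the function `load_placements` are made up for illustration and are not the actual schema of this database. It also assumes colors arrive as hex strings. The idea is that each placement row stores a small integer ID and the hex string is stored once per distinct color.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS color (
    id  INTEGER PRIMARY KEY,
    hex TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS placement (
    ts       TEXT    NOT NULL,
    x        INTEGER NOT NULL,
    y        INTEGER NOT NULL,
    color_id INTEGER NOT NULL REFERENCES color(id)
);
"""

def load_placements(conn, placements):
    """Insert (ts, x, y, hex_color) tuples, storing each color string once."""
    conn.executescript(SCHEMA)
    # Start from colors already in the table so reruns do not violate UNIQUE.
    color_ids = dict(conn.execute("SELECT hex, id FROM color"))
    for ts, x, y, hex_color in placements:
        if hex_color not in color_ids:
            cur = conn.execute("INSERT INTO color (hex) VALUES (?)", (hex_color,))
            color_ids[hex_color] = cur.lastrowid
        conn.execute(
            "INSERT INTO placement (ts, x, y, color_id) VALUES (?, ?, ?, ?)",
            (ts, x, y, color_ids[hex_color]),
        )
    conn.commit()
```

The original strings come back with a join, for example `SELECT p.ts, p.x, p.y, c.hex FROM placement p JOIN color c ON c.id = p.color_id`, so the change is lossless as long as the `color` table ships with the data.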