• @humanbroadcast
    link
    English
    129
    10 months ago

    Companies are in the fuck-around phase, and we’ll all have to live in the find-out era.

    • @thesystemisdown
      link
      English
      52
      10 months ago

      Meanwhile, the masses are still using all the ‘services’ because they all have momentum. I’m not confident any of them can do anything bad enough to chase off their users.

      • @[email protected]
        link
        fedilink
        English
        38
        10 months ago

        I thought Facebook would die with all the scandals, I’m the only person in my life who cared. I deleted Twitter before it became X, I’m the only one I know who did that.

        I don’t think anyone gives a shit and it’s made me hate people a lot more than I used to.

        • gregorum
          link
          fedilink
          English
          10
          10 months ago

          Most people I know left Facebook and Twitter years ago. Maybe it’s just a difference in people we associate with? 

      • @[email protected]
        link
        fedilink
        English
        31
        10 months ago

        I already left reddit because they did bad things. Assume you mean chase off a critical mass though? The fact that “X” is still a thing may prove you correct.

        • gregorum
          link
          fedilink
          English
          1
          10 months ago

          Huge amounts of people have already left X. Those that remain are mostly bots and neo Nazis.

            • gregorum
              link
              fedilink
              English
              2
              10 months ago

              there’s a pretty good breakdown here about how it actually is, but, obviously, X is goosing the numbers to make the loss look not-so-bad.

      • @jmanes
        link
        English
        13
        10 months ago

        Yeah, they’re not leaving. The only way they would leave is if the service were to be physically shut down. Pretty sure you could make everyone watch 1 minute long ads on app open and they would still stay.

      • @givesomefucks
        link
        English
        11
        10 months ago

        For now.

        I’m old as shit, I’ve seen an uncountable amount of “social media” come and go. At its heart reddit is just a forum. They’ve tacked on a lot of modern shit, but so do most of them when they’re running out of steam.

        It’s a war of attrition now. People will leave in batches over time until it just kinda ends, or not. Myspace is still shuffling around here somewhere.

      • John Wilker
        link
        English
        5
        10 months ago

        100%. In a writing sub I threw out, “I wish they’d just charge us users a fee.”

        I got “Pay for this?!”

        Folks live and die on Reddit but the idea of paying is… gross. Yet they scream about moves like this.

        People, man… I dunno.

  • @[email protected]
    link
    fedilink
    English
    41
    10 months ago

    Which will probably last for about one year, long enough to boost IPO valuations, then OpenAI (we all know who’s buying it) will cancel their contract because it’s too expensive and Reddit does not actually generate enough unique content yearly to be worth continuously training on. Then the death spiral happens again.

    • kamenLady.
      link
      English
      11
      10 months ago

      Fuckin’ Spez will squeeze Reddit until the head pops. The only thing meticulously planned is the flow of money to his accounts. The fastest flow.

  • @foggy
    link
    English
    30
    10 months ago

    Really trying to meddle in elections like it’s 2016

    • FaceDeer
      link
      fedilink
      3
      10 months ago

      I don’t see anything in the article related to elections.

      • @ArbiterXero
        link
        English
        18
        10 months ago

        No, but there’s BIG money in AI astroturfing for elections.

    • @[email protected]
      link
      fedilink
      English
      -23
      10 months ago

      Good. Get a good taste of your own medicine.

      It’s time people in the US felt what it’s like to have foreign entities meddling with their elections for a change, huh?

      grabs popcorn

      • SeedyOne
        link
        fedilink
        English
        29
        10 months ago

        You’re naive to think it’s just starting now.

      • @[email protected]
        link
        fedilink
        English
        18
        10 months ago

        It’s been happening for the better part of a decade though. And started probably much earlier than that, just not as blatant.

  • @PizzaFacia
    link
    English
    29
    10 months ago

    $60mm a year seems really cheap, no? I know it’s shit data from the bot posters but still would think it would be like $100-150mm

    • @[email protected]
      link
      fedilink
      English
      16
      edit-2
      10 months ago

      Honestly it’s probably the best search dataset in existence right now. You can make Google suck far less by appending “reddit” to most searches because you’ll get results from a group consisting of a higher ratio of actual humans instead of bots.

      Yeah reddit is shit, but the rest of the internet is 10x worse at this point. Pretty much any writing that isn’t a labor of love on someone’s personal page or users interacting with each other in a semi organic way is rapidly becoming 100% GPT vomit as every company in existence lays off their writing staff

      Whoever bought this got a fucking bargain.

    • @ilinamorato
      link
      English
      9
      edit-2
      10 months ago

      It’s ludicrously cheap for the size and quality of the dataset. A set of 829 academic papers at University of Michigan is priced at $25,000—about 1/2400 of this sale. If you were to scale that dollar value to the size of the Reddit dataset, you’d expect it to contain about 2 million academic papers’ worth of data.

      But Reddit has almost two decades of text written by 200 million chronically-online people. And sure, probably most Reddit users don’t write an academic paper amount of content every year; but the average is probably closer to that than not, especially when you consider that some of those subreddits like AskHistorians and AskScience really are generating the equivalent of dozens of academic papers per day. Just based on the amount of text alone, Reddit should’ve sold us out for 50-100x what they got for just a single year of data, and 1000-2000x for the full twenty years (though, granted, they didn’t have that much data for that entire time, so let’s say half that).

      Furthermore, those 829 papers in the U of M dataset are disconnected, unlinked text representing a tiny fraction of what U of M’s 50,000 students generate in even a single year. Reddit has data with links, images, conversational responses, prompt responses, Q&As, flash fiction, slash fiction, historical deep-dives, investigations, memes, inside jokes, a development of style and consensus over time, and a comprehensive understanding of what it means to interact online, generated by people around the world over the course of 18 years. It’s much better data for almost any LLM purpose that isn’t just writing academic papers from the perspective of students at a medium size 4-year undergrad institution in the Midwestern US. The quality of the dataset should’ve made the value even higher. It’s hard to say exactly how much higher, but let’s just be extremely conservative and say it should have doubled the total.

      That means that, conservatively, the value of Reddit’s dataset—or, rather, our dataset, which Reddit freebooted from us—was about 1000x what they were paid, based on the proportional value of the U of M dataset.

      They should’ve sold us out for billions.

      Of course, we don’t know anything about what exclusivity deals or subset of data that they might have included with this deal. It might only be one year of data, and only 6 months of exclusivity. But assuming they sold the rights to the entire dataset, we got sold for pennies.
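
The scaling step in the comment above can be checked with a few lines of Python. This is only a sketch of that arithmetic, using the figures quoted in the thread ($60 million a year for the Reddit deal, $25,000 for a set of 829 papers); the variable names are made up for illustration and none of the prices are independently verified here.

```python
# Figures as quoted in the thread (not independently verified).
deal_per_year = 60_000_000   # reported yearly value of the Reddit deal, USD
papers_price = 25_000        # quoted price of the 829-paper dataset, USD
papers_count = 829           # number of papers in that dataset

# How many such paper sets the yearly deal would buy (the "1/2400" in the comment).
ratio = deal_per_year / papers_price

# Papers' worth of data you would expect at the same price per paper.
equivalent_papers = papers_count * ratio

# Implied price of a single paper in the smaller dataset.
price_per_paper = papers_price / papers_count

print(f"ratio: {ratio:.0f}")
print(f"equivalent papers: {equivalent_papers:,.0f}")
print(f"price per paper: ${price_per_paper:.2f}")
```

At these figures the deal prices Reddit's data as if it were roughly 2 million papers, which is the number the comment's later multipliers build on.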

    • ormr
      link
      fedilink
      English
      1
      10 months ago

      Is the data access exclusive for that one company? If not then it’s no wonder they’re opting for a subscription-based model lol

  • millifoo
    link
    English
    25
    10 months ago

    I spent a chunk of this afternoon nuking my old reddit posts. Thousands and thousands of posts… thank goodness for shreddit.

    • @NightAuthor
      link
      English
      6
      10 months ago

      Most tools miss a ton bc of the limitations of the website and api. The best, pretty much only, way to get everything is to get an export of your data, then use that csv to delete all items one by one
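
A rough sketch of the "export, then delete one by one" approach described above, in Python. It only walks the exported CSV and hands each item's id to a function you supply; the actual deletion call is left out on purpose. The column names `id` and `subreddit`, the function name `delete_from_export`, and the `skip_subreddits` option are all assumptions for illustration, not a confirmed description of the export format or of any existing tool.

```python
import csv
import io
from typing import Callable, Iterable


def delete_from_export(csv_text: str,
                       delete_item: Callable[[str], None],
                       skip_subreddits: Iterable[str] = ()) -> int:
    """Call delete_item once per row of an exported comments CSV.

    Assumes (hypothetically) that the export has "id" and "subreddit"
    columns. delete_item is supplied by the caller and does the real work.
    Returns the number of items passed to delete_item.
    """
    skip = {name.lower() for name in skip_subreddits}
    deleted = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        subreddit = (row.get("subreddit") or "").lower()
        if subreddit in skip:
            continue  # leave items in excluded subreddits untouched
        delete_item(row["id"])
        deleted += 1
    return deleted


if __name__ == "__main__":
    sample = "id,subreddit,body\nabc,writing,hello\ndef,history,hi\n"
    # Stand-in for a real delete call: just print the id.
    count = delete_from_export(sample, print, skip_subreddits=["history"])
    print(f"{count} item(s) processed")
```

Working from the export rather than from the site's listings is the point: every row in the file gets visited, regardless of what the website or API would page through.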

      • @[email protected]
        link
        fedilink
        English
        6
        10 months ago

        Yup, shreddit has the ability to use the csv from the data request. Took me about 24 hours to edit and erase the 20,000+ comments I made over the last 10 years.

      • @x4740N
        link
        English
        5
        10 months ago

        Does shreddit have the ability to exclude certain subreddits? I want to exclude my comments on one subreddit but overwrite the rest of them.

    • @[email protected]
      link
      fedilink
      English
      4
      10 months ago

      I did that before they went through with their api bullshit, I’m so happy, it was fully automated. Just typed in the replacement message and that’s it

    • @guacupado
      link
      English
      2
      edit-2
      10 months ago

      I didn’t know about this. I just went into it.

      edit: lol you have to pay for Shreddit. Nevermind, I don’t care about deleting my posts that much.

      • millifoo
        link
        English
        4
        10 months ago

        No, not the website: the git project.

        If you want a web app, try redact.dev (yes, there’s a paid version where you can download your old messages, but the free one wipes out your posts with random text).

  • Lexi Sneptaur
    link
    fedilink
    English
    16
    10 months ago

    Very glad I overwrote all of my comments with random words before deleting my account. They won’t be profiting off of me anymore.

    • FaceDeer
      link
      fedilink
      11
      10 months ago

      Instead you’re posting to the Fediverse, which is even more open for use by third parties.

      • @residentmarchant
        link
        English
        17
        10 months ago

        Yea, but it’ll be open forever, nobody can turn off an app overnight and profit from it.

        • FaceDeer
          link
          fedilink
          4
          10 months ago

          Right. But my point is that they can profit from it. The issue lots of folks seem to be having is “how dare Reddit make money using something I did!”, and that issue is even worse for the Fediverse since lots of companies can be doing it.

          • Undearius
            link
            fedilink
            English
            9
            10 months ago

            And what value is a commodity that is available to everyone?

            I don’t disagree with your point but I’m sure it couldn’t be sold for as much as when it’s a limited resource.

            • FaceDeer
              link
              fedilink
              1
              10 months ago

              The value comes from the work that can be done with it. If you can train an AI off it then it’s worth something.

              • Undearius
                link
                fedilink
                English
                5
                10 months ago

                It’s one thing if you’re processing and doing work with the data but Reddit will be getting $60,000,000 a year for simply having the data.

                • FaceDeer
                  link
                  fedilink
                  1
                  10 months ago

                  They can’t do the work without the data, though.

                  Or rather, they can’t do the work without the risk of Reddit raising a legal fuss that would cost them more than $60 million. The data itself can already be downloaded for free from various places.

      • Lexi Sneptaur
        link
        fedilink
        English
        5
        10 months ago

        It’s obscure enough that I don’t think it’s being sought out by AI companies. The nature of federated instances should make it a bit more challenging to pull a complete data set too

        • FaceDeer
          link
          fedilink
          4
          10 months ago

          Not so obscure that Meta isn’t paying attention and planning for interoperation, and Meta is one of the biggest players in the AI development field.

          A complete data set isn’t required, just a comprehensive one.

  • AutoTL;DR
    link
    fedilink
    English
    7
    10 months ago

    This is the best summary I could come up with:


    On Friday, Bloomberg reported that Reddit has signed a contract allowing an unnamed AI company to train its models on the site’s content, according to people familiar with the matter.

    The move comes as the social media platform nears the introduction of its initial public offering (IPO), which could happen as soon as next month.

    Reddit initially revealed the deal, which is reported to be worth $60 million a year, earlier in 2024 to potential investors of an anticipated IPO, Bloomberg said.

    In April 2023, Reddit founder and CEO Steve Huffman told The New York Times that it planned to charge AI companies for access to its almost two decades’ worth of human-generated content.

    If the reported $60 million/year deal goes through, it’s quite possible that if you’ve ever posted on Reddit, some of that material may be used to train the next generation of AI models that create text, still pictures, and video.

    Even without the deal, experts have discovered in the past that Reddit has been a key source of training data for large language models and AI image generators.


    The original article contains 379 words, the summary contains 182 words. Saved 52%. I’m a bot and I’m open source!