• MudMan
    15413 hours ago

    I keep having to repeat this, but the conversation does keep going on a loop: LLMs aren’t entirely useless and they’re not search engines. You shouldn’t ask it any questions you don’t already know the answer to (or have the tools to verify, at least).

    • @[email protected]
      296 hours ago

      Or if you’re fine with non-factual answers. I’ve used chatgpt various times for different kinds of writing, and it’s great for that. It can give you ideas, it can rephrase, it can generate lists, it can help you find the word you’re trying to think of (usually).

      But it’s not magic. It’s a text generator on steroids.

      • MudMan
        106 hours ago

        Sure! Used as… you know, what it is, there’s a lot of fun/useful stuff you can do. It’s just both AIbro shills and people who have decided to make hating on this tech a core part of their personality have misrepresented that.

        It’s indeed very, very good text generation/text parsing. It is not a search engine, the singularity, Skynet or a replacement for human labor in the vast majority of use cases.

    • @[email protected]
      24 hours ago

      LLMs are good for some searches, or for clarifying things the original website doesn’t explain. For example, the “BY” attribute in Creative Commons being abbreviated to “BY” (as in “by John Doe”) and not “AT” (“attributed to John Doe”).

    • @[email protected]
      36 hours ago

      I had to tell DDG to not give me an AI summary of my search, so it’s clearly intended to be used as a search engine.

      • MudMan
        136 hours ago

        “Intended” is a weird choice there. Certainly the people selling them are selling them as search engines, even though they aren’t one.

        On DDG’s implementation, though, you’re just wrong. The search engine is still the search engine. They are using an LLM as a summary of the results. Which is also a bad implementation, because it will do a bad job at something you can do by just… looking down. But, crucially, the LLM is neither doing the searching nor generating the results themselves.

        • @[email protected]
          16 hours ago

          What do you mean it’s not generating the results? If the summation isn’t generated, where’s it come from?

          • @[email protected]
            45 hours ago

            I don’t want to speak for OP, but I think they meant it’s not generating the search results using an LLM.

            • @[email protected]
              15 hours ago

              Maybe I just don’t know what “generating results” means. You query a search engine, and it generates results as a page of links. I don’t understand how generating a page of links is fundamentally different from generating a summation of the results?

              • @[email protected]
                65 hours ago

                It’s a very different process. Having worked on search engines before, I can tell you that the word “generate” means something different in this context. It means, in simple terms, to match your search query with a bunch of results, gather links to said results, and then send them to the user to be displayed.
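
To make the "match, gather links, send" steps in the comment above concrete, here is a minimal sketch of a keyword search over a toy inverted index. The corpus, URLs and function names are all hypothetical, and a real search engine does far more (crawling, ranking signals, spam filtering); the point is only that every result already exists in the index and nothing is composed on the fly.

```python
from collections import defaultdict

# Hypothetical toy corpus: URL -> page text (illustrative data only).
PAGES = {
    "https://example.com/docker-volumes": "how to mount volumes in docker",
    "https://example.com/k8s-deploy": "kubernetes deployment configuration basics",
    "https://example.com/docker-prune": "reclaim disk space with docker prune",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Match the query against the index and return links, best match first."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1  # one point per query word the page contains
    # Sort by descending score, then by URL for a stable order.
    return sorted(scores, key=lambda u: (-scores[u], u))


INDEX = build_index(PAGES)
results = search(INDEX, "docker disk space")
for url in results:
    print(url)
```

A query with no matching words simply returns an empty list; the lookup cannot return a link that is not in the index.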

                • @[email protected]
                  25 hours ago

                  “then send them to the user to be displayed”

                  This is where my understanding breaks. Why would displaying it as a summary mean the backend process is no longer a search engine?

                  • MudMan
                    35 hours ago

                    The LLM is going over the search results, taking them as a prompt and then generating a summary of the results as an output.

                    The search results are generated by the good old search engine, the “AI summary” option at the top is just doing the reading for you.

                    And of course if the answer isn’t trivial, very likely generating an inaccurate or incorrect output from the inputs.

                    But none of that changes how the underlying search engine works. It’s just doing additional work on the same results the same search engine generates.

                    EDIT: Just to clarify, DDG also has a “chat” service that, as far as I can tell, is just a UI overlay over whatever model you select. That just works the same way as all the AI chatbots you can use online or host locally and I presume it’s not what we’re talking about.
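
The two-stage flow described in the comment above can be sketched roughly as follows. This is not DDG's actual implementation: `search_engine`, `build_prompt` and `stub_model` are hypothetical stand-ins, and the "model" is a placeholder function rather than a real LLM call. What the sketch shows is that the result list is produced first and is unaffected by the summary step, which only reads it.

```python
def search_engine(query):
    """Stand-in for the conventional search backend (hypothetical data)."""
    catalog = [
        ("Docker volumes guide", "https://example.com/docker-volumes"),
        ("Pruning unused Docker data", "https://example.com/docker-prune"),
        ("Kubernetes deployments", "https://example.com/k8s-deploy"),
    ]
    words = query.lower().split()
    return [(t, u) for t, u in catalog if any(w in t.lower() for w in words)]


def build_prompt(query, results):
    """Turn the search results into the text the summarizer will read."""
    lines = [f"Summarize these search results for the query: {query}"]
    lines += [f"- {title} ({url})" for title, url in results]
    return "\n".join(lines)


def stub_model(prompt):
    """Placeholder for an LLM call; a real model would generate free text."""
    hits = [line for line in prompt.splitlines() if line.startswith("- ")]
    return f"[generated summary of {len(hits)} results]"


def search_with_summary(query, generate=stub_model):
    results = search_engine(query)                    # step 1: ordinary search
    summary = generate(build_prompt(query, results))  # step 2: model reads the results
    return results, summary


results, summary = search_with_summary("docker")
print(summary)
for title, url in results:
    print(title, url)
```

Swapping `stub_model` for a different generator changes the summary text but not the list of links underneath it.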

    • @chemical_cutthroat
      8013 hours ago

      Yeah. Everyone forgot the second half of “Trust, but Verify”. If I ask an LLM a question, I’m only doing it because I’m not 100% sure how to look up the info. Once it gives me the answer, I’m checking that answer with sources because it has given me a better ability to find what I was looking for. Trusting an LLM blindly is just as bad as going on Facebook for healthcare advice.

      • @danc4498
        26 hours ago

        I thought it was “butt verify” whoops

      • @eronth
        2011 hours ago

        I find LLMs very useful for setting up tech stuff. “How do I xyz in docker?” It does a great job of boiling together several disjointed How Tos that don’t quite get me there into one actually usable one. I use it when googling and following articles isn’t getting me anywhere, and it’s often saved so much time.

        • @[email protected]
          147 hours ago

          They are also amazing at generating configuration that’s subtly wrong.

          For example, if the bad LLM-generated configurations I caught during pull request reviews are any indication, there are plenty of people with less experienced teams running broken Kubernetes deployments.

          Now, to be fair, inexperienced people would make similar mistakes, but inexperienced people are capable of learning from their mistakes.

      • MudMan
        2813 hours ago

        Yep. Or because you can recognize the answer but can’t remember it off the top of your head. Or to check for errors on a piece of text or code or a translation, or…

        It’s not “trust but verify”, which I hate as a concept. It’s just what the tech can and cannot do. It’s not a search engine finding matches to a query inside a large set of content. It’s a stochastic text generator giving you the most likely follow up based on its training dataset. It’s very good autocorrect, not mediocre search.
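
As a rough illustration of "most likely follow up based on its training dataset", here is a toy bigram sampler. To be clear, this is a swapped-in, much simpler technique than the transformer models being discussed, and the training text and names are made up for the example. It only shows the general shape: the output is sampled word by word from learned frequencies, not looked up as a match in a set of documents.

```python
import random
from collections import Counter, defaultdict

# Hypothetical miniature "training dataset".
TRAINING_TEXT = "the cat sat on the mat and the cat ate the fish"


def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, length, rng):
    """Sample up to `length` words, each weighted by how often it followed the last."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word never had a follower in the training text
        words = list(followers)
        weights = [followers[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out


MODEL = train_bigrams(TRAINING_TEXT)
sample = generate(MODEL, "the", 6, random.Random(0))
print(" ".join(sample))
```

The sampler will happily produce a fluent-looking sequence that never appeared in the training text, which is the property that makes this kind of generator a poor substitute for search.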

    • @[email protected]
      -56 hours ago

      honestly LLMs are about a thousand times more useful than Google at this point. Every week i try googling and get nothing but spam results.

      for example just yesterday i was searching for how to reclaim some wasted space on one of my devices. so i searched on Google and tried 8 different pages that were ad-riddled hell holes.

      i gave up and spent 10 seconds with an LLM and got the answer i needed. i will admit that i had to tell it to quit bullshitting me at one point but i got what i needed. and no ads.

      • MudMan
        86 hours ago

        Well, you shouldn’t be using Google Search, but that’s a completely different conversation and the answer shouldn’t (can’t) be “let’s just use LLMs, then”.

        • @[email protected]
          26 hours ago

          bing or duck duck go, too. i just say googling because it sounds stupid as shit to say anything else. DDG is my default search engine. kagi isn’t much better, and comes with its own issues

          • MudMan
            46 hours ago

            So we’re talking about SEO and the content being generated in the first place? Yeah, it’s worse than it used to be when the main application online was websites, but I still want/need a reliable way to parse results across… you know, Wikipedia and Reddit, mostly. IMDB sometimes. It may have looped around to the old days of Altavista directory search, but it’s still a valuable tool. And crucially not replaced by an LLM, especially for the kind of non-obvious queries where you don’t just go to the site you know will have the answer directly.

            • @MutilationWave
              1
              2 hours ago

              Altavista was the shit when it came out. My classmates and friends were surprised at how quick I was getting answers or general information. Altavista, that’s it. If you’re using Ask Jeeves or Yahoo you’re going to have a hard time.

              I can’t remember how I found out about it, but it’s what I used until Google came out. Anyone know if they were the first to use web crawlers like that or did they just popularize the concept?

              • MudMan
                22 hours ago

                I’m fuzzy on the timeline, but it was definitely THE search engine for a while. And I’d say the one that’s most memory-holed. I feel like Yahoo’s unlikely survival as some vestigial online service made people remember it and I guess Americans in particular had an Ask Jeeves moment at some point? For me it was Altavista until Google, for sure, and they were trading blows for a good while. I almost remember Gmail being the thing that tipped the scales more than the search quality.

    • @[email protected]
      -812 hours ago

      An LLM is a random person on the internet, or the first link on a search.

      If you wouldn’t blindly trust them, don’t trust it.

      • MudMan
        2111 hours ago

        An LLM is an LLM. An LLM is a transformer model generating likely output from a dataset.

        I hate all this analogy stuff people keep resorting to. The thing does what it does, and trying to understand what it does by analogy is being used disingenuously to push all sorts of misinformation-filled agendas.

        It’s not about “trust”, it’s about how the output you’re being given is generated, and so what types of outputs are useful on what applications.

        The answer is fairly narrow, particularly compared to how it’s being marketed. It absolutely, 100% isn’t a search engine, though. And even when plugged into a search engine and acting as a summarization engine it’s actually pretty terrible and very likely to distort an output that anybody who has been near a computer in the past thirty years can parse faster at a glance.

    • @seven_phone
      -712 hours ago

      That is exactly the point. LLMs aim to simulate the chaotic best guess flow of the human mind, to be conscious and at least present the appearance of thinking and from that to access and process facts but not be a repository of facts in themselves. The accusation here that the model constructed a fact and then built on it is missing the point: this is exactly the way organic minds work. Human memory is constantly reworked and altered based on fresh information and simple musings and the new memory taken as factual even while it is in large part fabricated, and to an increasing extent over time. Many of our memories of past events bear only cursory fidelity to the actual details of the events themselves to the point that they could be defined as imagined. We still take these imagined memories as real and act upon them exactly as has been done here by the AI model.

      • MudMan
        1111 hours ago

        As below, stop with the analogies. No, that’s not “the chaotic best guess flow of a human mind”, that’s a whole bunch of tensor math generating likely chains of tokens. Those two things aren’t the same thing.

        They aren’t the same thing in the strict sense, but they’re also not the same thing in practical terms at the end user level. If I ask a friend if they remember some half-forgotten factoid they can tell me not just if they do remember, but also how well they remember, how sure they are and why they know it. No LLM can do that, because LLMs know as little about themselves as about anything else. Which is nothing, because they’re LLMs, not people.