• andrew_bidlaw
    11
    10 hours ago

    As it learns from our data, no wonder it fucks up at regexps. They are the arcane knowledge not accessible to us mere mortals, nor to LLMs.

    • @[email protected]
      7
      9 hours ago

      If you know even a little about how an LLM works, it’s obvious why regex is basically impossible for it. I suspect Perl has similar problems, but no one is capable of actually validating that.

      • Ignotum
        1
        1 hour ago

        What do you mean it’s impossible for it? I know how LLMs work, but I don’t know of any such limitations.

        Write me a regex that matches a letter repeated four times, followed by a 3 or 4 digit number

        Here’s your regex: ([a-zA-Z])\1{3}\d{3,4}
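
        For anyone who wants to check that answer rather than take it on faith, here is a minimal Python sketch. The `matches` helper and the test strings are made up for illustration, and using `fullmatch` (so the whole string has to fit) is an assumption about what the prompt intended:

```python
import re

# Pattern quoted in the comment above: one ASCII letter, the same letter
# three more times via the backreference \1, then three or four digits.
PATTERN = re.compile(r"([a-zA-Z])\1{3}\d{3,4}")


def matches(text: str) -> bool:
    """Return True if the whole string fits the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(text) is not None


# Made-up inputs, purely for illustration.
for candidate in ["aaaa123", "ZZZZ2024", "aaab123", "aaaa12", "aaAa123"]:
    print(candidate, matches(candidate))
```

        With `search` instead of `fullmatch`, the same pattern would also hit inside longer strings such as `xaaaa12345`, so whether to anchor is a choice to make deliberately. Note that the backreference is case-sensitive, so `aaAa123` does not count as the same letter four times.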

  • fmstrat
    7
    9 hours ago

    I love regex. I know, most don’t, but I do. GPT/Claude can write some convincing code, but their regexes can be spotted a mile away.

  • @[email protected]
    29
    12 hours ago

    You know what? If your management is telling you to use AI generated code to “go faster”, just go ahead and do it. But fork the repo first, in case you’re still around when they get fired and someone sensible says to put it back how it was before.

  • Snot Flickerman
    69
    15 hours ago

    Management: Fuck it, ship it.


    The people at the top honestly don’t give a fuck if it barely works as long as it’s an excuse to cut costs. In things like Customer Service, barely working is a bonus, because it makes customers give up before they try to get their issue solved.

  • @kitnaht
    50
    15 hours ago

    I mean, I bet it failed at making a regex that worked much faster than you could fail at writing a regex that worked. Sounds like progress! :D

    • Karyoplasma
      23
      15 hours ago

      I am always suspicious if a regex I write doesn’t throw some form of pattern compilation error. It usually means I’m not even close to the correct solution.

  • @cm0002
    17
    15 hours ago

    Just outta curiosity:

    Full o1 model

    “\\id:\[]]+\\\\[]]+\\\”

    Claude 3.5 Haiku:

    Never used elisp, no idea if any of this is right lmao

      • @cm0002
        2
        14 hours ago

        3.5 Sonnet might do a lot better, idk, I’m on the free plan with Claude lmao

    • @[email protected]
      9
      14 hours ago

      o1 without Markdown misformatting:

      \\id:\\[^]]+\\\\\[^]]+\\\
      

      No idea what the rectangles are supposed to be, I just copy-pasted it

      • @marcos
        2
        10 hours ago

        They are valid Unicode code points that your font doesn’t know about.

        … or at least they represent that, but I think there’s a character that looks like one too.

        • @[email protected]
          1
          8 hours ago

          It’s U+E001 from a Private Use Area. The UnicodePad app renders it as something between 鉮 and 鋁 (separate boxes stricken through; I wasn’t able to find it even with Google Lens)
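
          If anyone wants to confirm that for themselves, a small Python sketch using the standard `unicodedata` module can do it. The `is_private_use` helper is a made-up name for illustration:

```python
import unicodedata

ch = "\ue001"  # U+E001, the code point named in the comment above


def is_private_use(c: str) -> bool:
    """Hypothetical helper: True if the character's general category is Co (private use)."""
    return unicodedata.category(c) == "Co"


print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_private_use(ch))
# Private-use code points have no standard name, so the fallback is returned.
print(unicodedata.name(ch, "<no name assigned>"))
```

          Since private-use code points have no standardized glyph, what you see depends entirely on the font, which would explain why different apps draw different things.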

    • @Skullgrid
      0
      11 hours ago

      I swear to god, someone must have written an intermediary language between regex and actual programming, or I’m going to eventually do it before I blow my fucking brains out.

      • @BassTurd
        5
        10 hours ago

        How do you think that would look? Regex isn’t particularly complicated, just a bit to remember. I’m trying to picture how you would represent a regular expression in a higher level language. I think one of its biggest benefits is the ability to shove so much information into a random looking string. I suppose you could write functions like, startswith, endswith, alpha(4), or something like that, but in the end, is that better?

        • @[email protected]
          5
          10 hours ago

          People have unironically done that. No, it isn’t better. The fundamental mental model is the same.

          • @Skullgrid
            2
            9 hours ago

            I want to see their unironic attempts; maybe they’re useful to me at least, even if they’re not better.

            The fundamental mental model is the same.

            It’s not the fundamental model that I have a problem with for regex, it’s the fucking brainfuck-tier syntax

        • @Skullgrid
          4
          10 hours ago

          I suppose you could write functions like, startswith, endswith, alpha(4), or something like that,

          yes.

          but in the end, is that better?

          YES.

          startswith('text');
          lengthMustBe(5);
          onlyContain(CHARSETS.ALPHANUMERICS); 
          endswith('text');
          

          is much more legible than []],[.<{}>,]‘text’[[]]][][)()(a-z,0-9){}{><}<>{}‘text’{}][][
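
          A rough idea of how such an API could sit on top of ordinary regex, as a minimal Python sketch. Everything here is hypothetical: the `Rule` class, its method names, and the `ALPHANUMERICS` constant (standing in for `CHARSETS.ALPHANUMERICS`) are invented for illustration, and the literals `ab` and `yz` replace `'text'` so that the constraints can actually be satisfied together:

```python
import re

ALPHANUMERICS = "a-zA-Z0-9"  # hypothetical stand-in for CHARSETS.ALPHANUMERICS


class Rule:
    """Hypothetical fluent builder; each call adds one constraint."""

    def __init__(self):
        self._lookaheads = []   # prefix/suffix checks, applied at the start
        self._charset = "."     # default: any character
        self._length = "*"      # default: any length

    def starts_with(self, text):
        self._lookaheads.append(f"(?={re.escape(text)})")
        return self

    def ends_with(self, text):
        self._lookaheads.append(f"(?=.*{re.escape(text)}\\Z)")
        return self

    def length_must_be(self, n):
        self._length = f"{{{n}}}"
        return self

    def only_contain(self, charset):
        self._charset = f"[{charset}]"
        return self

    def compile(self) -> "re.Pattern[str]":
        source = "".join(self._lookaheads) + self._charset + self._length
        return re.compile(source, re.DOTALL)


rule = (
    Rule()
    .starts_with("ab")
    .length_must_be(5)
    .only_contain(ALPHANUMERICS)
    .ends_with("yz")
    .compile()
)

print(rule.pattern)
for candidate in ["ab1yz", "abxyz", "ab12yz", "ab-yz"]:
    print(candidate, rule.fullmatch(candidate) is not None)
```

          The builder only generates a regex string under the hood, so it changes the spelling, not what can be expressed.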

          • @BassTurd
            6
            10 hours ago

            Assuming “text” in your example is a placeholder for a 5-character alphanumeric string, it can be written like this in regex: /^[a-zA-Z0-9]{5}$/

            If “text” is literal, then your statement is impossible: a 5-character string can’t both start and end with the 4-character literal “text”.

            I think that when it gets to more complex expressions like a phone number with country code that accepts different formats, the verbosity of a higher level language will be more confusing, or at least more difficult to take in quickly.
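
            For what it’s worth, Python’s `re.VERBOSE` flag is one middle ground: still regex, but with whitespace and comments allowed. A minimal sketch, assuming a North American style number with an optional country code; the exact format, the group names, and the `parse` helper are made up for illustration and not a real validation rule:

```python
import re

# Assumed format: optional +country code, 3-digit area code (parentheses
# optional), then 3 and 4 digits, with space, dot, or hyphen as separators.
PHONE = re.compile(
    r"""
    (?:\+(?P<country>\d{1,3})[\s.-]?)?   # optional country code, e.g. +1 or +44
    \(?(?P<area>\d{3})\)?[\s.-]?         # area code, with or without parentheses
    (?P<prefix>\d{3})[\s.-]?             # first three digits
    (?P<line>\d{4})                      # last four digits
    """,
    re.VERBOSE,
)


def parse(text: str):
    """Hypothetical helper: return the named parts, or None if nothing fits."""
    m = PHONE.fullmatch(text.strip())
    return m.groupdict() if m else None


for candidate in ["+1 (555) 123-4567", "555.123.4567", "12345"]:
    print(candidate, parse(candidate))
```

            Whether that reads faster than a chain of function calls is a matter of taste, but it keeps the compact syntax while labelling each piece.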

      • @marcos
        1
        10 hours ago

        intermediary language between regex and actual programming

        It’s called Haskell.

  • madthumbs
    -9
    15 hours ago

    It will replace a lot of crappy jobs, the same way the Industrial Revolution did when machines made it possible to improve lifestyles.

    The people that typically hate LLMs (AI) are the same people that don’t mind developers flooding the market with ‘free software’ (which can thwart real competition, not just reduce paying jobs).

    Even on occasions where it was wrong, it has gotten me on the right path, and when I question its information it will most often tell me ‘you’re right!’ and then has a good chance of giving a real answer.

    The biggest failure I’ve had with it is trying to get the ffmpegthumbnailer working in Windows. Other than that, it’s annoying how many times I ask for instructions for Windows and it tells me to use blatant Linux commands (or package managers).

    • @pivot_root
      6
      10 hours ago

      The people that typically hate LLMs (AI) are the same people that don’t mind developers flooding the market with ‘free software’ (which can thwart real competition, not just reduce paying jobs).

      And “fuck the free market” while we’re at it, am I right?

      The only people who hate free software are those who either can’t compete in quality with FOSS offerings, or have something to gain through vendor lock-in. And neither of those is beneficial for anybody other than the software vendor.

      • madthumbs
        -2
        6 hours ago

        You didn’t pay for or invest your time in software development, did you?

        • @pivot_root
          1
          42 minutes ago

          You would be mistaken, then.

    • @BassTurd
      1
      10 hours ago

      Anecdotally, every AI has generations to go before it’s good enough to replace a person entirely. It’s a solid tool for people that know how to code, but it’s nothing more than hobby- or reporting-worthy to someone with limited to no programming experience. It does not make secure or efficient code, and it’s only as good as the input, which will not be good from someone that doesn’t understand how to code. Anyone that blindly trusts generated code and pushes it into production is insanely reckless and grossly unqualified to have that kind of power.

      • madthumbs
        1
        6 hours ago

        Agreed, at the moment people who understand the code are still valuable. Knowing what code is capable of is valuable as well.

    • @kitnaht
      3
      15 hours ago

      To be fair, a lot of the Windows commands ARE now Linux commands, thanks to WSL. Lots of people are using it directly from within Windows now instead of trying to make Windows-only solutions.