The first programs were written directly in binary or hexadecimal machine code. Only later did we invent programming languages, along with the compilers that translate human-readable code into machine code.

So why can’t we just do the same thing in reverse? I hear a lot about devices, from audio streaming gear to footwear, rendered useless by abandonware. Couldn’t a very smart person (or AI) just take the existing program and turn it back into source code?
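
The purely mechanical half of that reverse trip is disassembly: mapping bytes back to instruction names. Here is a minimal sketch, assuming a made-up instruction set (the `OPCODES` table and the two-byte instruction format are hypothetical, not any real CPU). Note what comes back: mnemonics and raw addresses, but none of the names or comments the original author wrote.

```python
# Hypothetical toy instruction set: every instruction is one opcode byte
# followed by one operand byte. Real instruction sets are far less regular.
OPCODES = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE", 0xFF: "HALT"}


def disassemble(machine_code: bytes) -> list[str]:
    """Translate raw bytes into readable mnemonics, two bytes at a time.

    Assumes the input length is even (whole instructions only).
    """
    lines = []
    for offset in range(0, len(machine_code), 2):
        opcode = machine_code[offset]
        operand = machine_code[offset + 1]
        mnemonic = OPCODES.get(opcode, "???")  # unknown opcode: flag it, do not guess
        lines.append(f"{offset:04X}: {mnemonic} 0x{operand:02X}")
    return lines


program = bytes.fromhex("01 10 02 11 03 12 FF 00")
for line in disassemble(program):
    print(line)
```

The first printed line is `0000: LOAD 0x10`. Whether `0x10` used to be called `price` or `player_health` is not recorded anywhere in the bytes, which is where the hard part starts.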

  • FaceDeer
    11 · 6 months ago

    As others have mentioned, it’s possible but very complicated. Decompilers produce code that isn’t very readable for humans.

    I am indeed awaiting the big news headlines, which will for some reason catch everyone by surprise, when an LLM comes along that’s trained to “translate” machine code into a nice, easily comprehensible high-level programming language. It’s going to be a really big development: even though it won’t make programs legally “open source”, it’ll make them all source-available.
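
To make the readability point concrete, here is a minimal sketch of a toy compiler and a naive decompiler. Everything in it is an assumption for illustration: the stack-machine ops (`LOAD`, `PUSH`, `ADD`, `MUL`, `STORE`) and the `v0`, `v1` naming scheme are invented, and it only handles a single assignment using `+` and `*`. The compiler replaces variable names with slot numbers, roughly the way a native compiler replaces them with registers and addresses, so the decompiler has to invent names of its own.

```python
import ast

# Supported operators for this toy; anything else raises KeyError.
BINARY_OPS = {ast.Add: "ADD", ast.Mult: "MUL"}


def compile_assignment(source: str) -> list[tuple]:
    """Compile 'name = expression' into stack ops, discarding variable names."""
    statement = ast.parse(source).body[0]  # assumes one assignment statement
    slots: dict[str, int] = {}
    ops: list[tuple] = []

    def slot(name: str) -> int:
        # Each distinct name gets the next free slot number; the name is dropped.
        return slots.setdefault(name, len(slots))

    def emit(node: ast.AST) -> None:
        if isinstance(node, ast.Name):
            ops.append(("LOAD", slot(node.id)))
        elif isinstance(node, ast.Constant):
            ops.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp):
            emit(node.left)
            emit(node.right)
            ops.append((BINARY_OPS[type(node.op)], None))
        else:
            raise ValueError("unsupported syntax in this toy compiler")

    emit(statement.value)
    ops.append(("STORE", slot(statement.targets[0].id)))
    return ops


def decompile(ops: list[tuple]) -> str:
    """Rebuild an expression from stack ops, inventing names like v0, v1."""
    stack: list[str] = []
    for op, arg in ops:
        if op == "LOAD":
            stack.append(f"v{arg}")
        elif op == "PUSH":
            stack.append(repr(arg))
        elif op in ("ADD", "MUL"):
            right, left = stack.pop(), stack.pop()
            stack.append(f"({left} {'+' if op == 'ADD' else '*'} {right})")
        elif op == "STORE":
            return f"v{arg} = {stack.pop()}"
    raise ValueError("no STORE op found")


compiled = compile_assignment("total = price * quantity + shipping")
print(decompile(compiled))
```

This prints `v3 = ((v0 * v1) + v2)`: correct, but the reader now has to work out that `v0` was a price. Scale that up to thousands of functions and you get typical decompiler output.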

    • Toes♀
      5 · 6 months ago

      I have a bunch of 16-bit applications that I would love to be able to do that with. Mostly DOS and Windows 3.1 games.

      • subignition
        4 · 6 months ago

        You might actually consider dipping your toes into trying to learn how to analyze/reverse those yourself. Relatively speaking, software that old can sometimes be easier to reverse.

        • Toes♀
          2 · 6 months ago

          Yeah, I’m not unfamiliar with the process (still a novice though) and have mostly used it to circumvent something obnoxious or tweak save files. It just takes a lot of effort when you’re only looking to spend a couple of hours playing a game before bed.

          I’m currently experiencing a frustrating bug in Dolphin and I’m tempted to learn enough to dig into it. My MIPS buddy won’t help me with it because he thinks it’s a waste of time.

          I like LLMs for the time they save you on laborious or mundane tasks. One day we’ll have general AI, fingers crossed.

          ~Love the toes pun

          • subignition
            2 · 6 months ago

            My apologies for preaching to the choir. (And I didn’t notice your username when I wrote that, LOL. Happy accident.)

    • @[email protected]
      4 · 6 months ago

      I am indeed awaiting the big news headlines that will for some reason catch everyone by surprise when an LLM comes along that’s trained to “translate” machine code into a nice easily-comprehensible high-level programming language.

      Another commenter dismissed the idea outright. WTF… What is implausible about an LLM that takes decompiled code, deals with the obfuscation BS, recognizes known libraries, and organizes the remaining code? That will totally happen, if it hasn’t already been done.
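
The "recognizes known libraries" step is the most mechanical of those, and a minimal sketch shows the idea. This assumes a hypothetical signature database (`KNOWN_SIGNATURES`) keyed by a hash of each function's bytes; the byte strings and the names `lib_identity` and `lib_noop` are placeholders, not taken from any real library.

```python
import hashlib


def fingerprint(function_bytes: bytes) -> str:
    """Hash a function body so it can be looked up in a signature table."""
    return hashlib.sha256(function_bytes).hexdigest()


# Placeholder byte strings standing in for compiled library routines.
LIB_IDENTITY = bytes.fromhex("5589e58b45085dc3")
LIB_NOOP = bytes.fromhex("5589e55dc3")

# Hypothetical signature database: fingerprint -> library function name.
KNOWN_SIGNATURES = {
    fingerprint(LIB_IDENTITY): "lib_identity",
    fingerprint(LIB_NOOP): "lib_noop",
}


def label_functions(functions: dict[int, bytes]) -> dict[int, str]:
    """Name each function: a library name on a match, else a placeholder."""
    labels = {}
    for address, body in functions.items():
        default_name = f"sub_{address:04X}"  # typical 'unknown function' style
        labels[address] = KNOWN_SIGNATURES.get(fingerprint(body), default_name)
    return labels


binary_functions = {
    0x1000: LIB_IDENTITY,                      # matches a known signature
    0x1040: bytes.fromhex("b82a000000c3"),     # application code, no match
}
print(label_functions(binary_functions))
```

Exact hashing is a simplification: a real matcher has to tolerate bytes that change between builds, such as addresses patched in at link time. Every function that matches is one the human (or LLM) no longer has to read, which shrinks the "remaining code" considerably.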

      • FaceDeer
        3 · 6 months ago

        There’s a lot of outright rejection of the possibilities of AI these days, I think because it’s turning out to be so capable. People are getting frightened of it and so they jump to denial as a coping mechanism.

        I recalled reading about an LLM, developed just a couple of weeks ago, for translating source code into intermediate representations (a step along the way to full compilation). When I went hunting for a reference to refresh my memory, I found this article from March about exactly what’s being discussed here: an LLM that translates assembly language into high-level source code. It looks like this one is just a proof of concept rather than something highly practical, but prove the concept it does.

        I wonder if there are research teams out there sitting on more advanced models right now, fretting about how big a bombshell it’ll be when this gets out.

      • @[email protected]
        2 · 6 months ago

        It’s easy to say that we should throw AI at a problem and in a few years it will solve it, but most of the time it doesn’t actually work that way. Think about the Turing Test itself, whose history goes back to the 1950s: how many decades did it take for us to get to anything that could reasonably come close to passing it? So anytime you think to yourself that one of these days AI is going to get there, remember that one of these days might actually be a half century from now.

        The other aspect, specific to this challenge, is that humans organize code in a certain way, according to reasoning that the authors know about. That organization is then compiled away, and another computer program has to work back to what the original authors might have been thinking when they designed the thing. That’s a steep hill to climb. Can it be done on a small scale? It certainly can. On a large scale? Don’t hold your breath.
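
The "compiled away" part can be checked directly. Here is a minimal sketch using Python's built-in `compile()`, chosen only because its output is easy to inspect; the two source snippets are invented for illustration. One carries the author's reasoning in a comment and explicit grouping, the other has neither, and both compile to the same instruction bytes.

```python
# Version the author wrote: a comment explains the intent, and the extra
# parentheses make the intended grouping explicit.
documented = """\
# Pricing policy: the loyalty discount applies before shipping is added.
total = (price * (1 - discount)) + shipping
"""

# Same computation with the commentary and redundant grouping removed.
# The blank first line keeps the statement on the same line number.
bare = """\

total = price * (1 - discount) + shipping
"""


def bytecode(source: str) -> bytes:
    """Compile source text and return only the raw instruction bytes."""
    return compile(source, "<example>", "exec").co_code


same = bytecode(documented) == bytecode(bare)
print("identical bytecode:", same)
```

Since both sources produce the same output, nothing working from that output alone can tell which one the author wrote, let alone recover the policy in the comment. Python bytecode is actually generous here because it keeps variable names; native compilers usually drop those too unless debug symbols are shipped. A model can still produce a plausible reconstruction, but it is a guess at intent, not a recovery of it.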