I’ve noticed some files I opened in a text editor have all kinds of crazy unrenderable chars

  • @JASN_DE
    36 hours ago

    Are those binary files by any chance?

    • @cheese_greaterOP
      6 hours ago

      I just mean like any file (pdf, jpeg, mp4, mp3, exe, etc.)

      mp4/mp3 most famously for me

      I find it so damn cool and incredible I can record something/anything right now and open the audio in a text editor and it’s all right there, in an incomprehensible format, but there all the same.

      It’s like a thinking rock etching sound into stone

      • Admiral Patrick
        6 hours ago

        If you’re on Linux, you can convert that to plain printable text (no unrenderable characters) by piping it to base64. It works with any file, but I’ll use an image here:

        cat image.webp | base64

        Which yields:

        UklGRroEAABXRUJQVlA4WAoAAAAgAAAAYwAAQgAASUNDUKACAAAAAAKgbGNtcwRAAABtbnRyUkdC
        IFhZWiAH6AAIABoADgAJACBhY3NwQVBQTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA9tYAAQAA
        AADTLWxjbXMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA1k
        ZXNjAAABIAAAAEBjcHJ0AAABYAAAADZ3dHB0AAABmAAAABRjaGFkAAABrAAAACxyWFlaAAAB2AAA
        ABRiWFlaAAAB7AAAABRnWFlaAAACAAAAABRyVFJDAAACFAAAACBnVFJDAAACFAAAACBiVFJDAAAC
        FAAAACBjaHJtAAACNAAAACRkbW5kAAACWAAAACRkbWRkAAACfAAAACRtbHVjAAAAAAAAAAEAAAAM
        ZW5VUwAAACQAAAAcAEcASQBNAFAAIABiAHUAaQBsAHQALQBpAG4AIABzAFIARwBCbWx1YwAAAAAA
        AAABAAAADGVuVVMAAAAaAAAAHABQAHUAYgBsAGkAYwAgAEQAbwBtAGEAaQBuAABYWVogAAAAAAAA
        9tYAAQAAAADTLXNmMzIAAAAAAAEMQgAABd7///MlAAAHkwAA/ZD///uh///9ogAAA9wAAMBuWFla
        IAAAAAAAAG+gAAA49QAAA5BYWVogAAAAAAAAJJ8AAA+EAAC2xFhZWiAAAAAAAABilwAAt4cAABjZ
        cGFyYQAAAAAAAwAAAAJmZgAA8qcAAA1ZAAAT0AAACltjaHJtAAAAAAADAAAAAKPXAABUfAAATM0A
        AJmaAAAmZwAAD1xtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAEcASQBNAFBtbHVjAAAAAAAA
        AAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJWUDgg9AEAALAQAJ0BKmQAQwA+8WSmTqmlKCYvmWqp
        MB4JZQDLnNaF2NMD2L3xQGb5nmLiGhGWxQuD8kwUSXF0u2UTgX0YrR3MY2SsRCNEQ8hZ6WkCUTih
        LdmsElHZVzoMwO/fj4X/ZSNT2R9qgxwqgEed891j4KCNRLK/tUbG3hZ3Mw2kixguSFIEcAgBtv8w
        eAu0PwAA/upMzBqq+dcN8viO7FpqpV6GvPcRILm+HsOQblnpHx03lASjGlSyGbkKUD3xA5KOqgq/
        VEUJ4qF9VoAYFbFhQRAgkvmREk5umMj8sr9Np95+n/oP2Aq2VW5xU4F1xpD8Vd4Dp7Phwm9w/Dnf
        94djRROFRYPZeg/1Q/qiROFRVRu2nBcgndbhc0x0h+kgvT/naeJOEqwNjYPlIiw/DGuxav7+x09R
        mf2mJto3ineDqfyMWUN83PmKqzGHkYGhZrTU478qjlQucDzWkwobnUmzhE6I+mDYkfiUVPcHyXbf
        xXRStyPiPZAkJZrE9OrjFNUeljRQdVTQqeBsy+O9VwDLU5GcKhBQHa4cj+/DGqUhi74WH0EuHsb3
        EgZVNc1FbRm5QFOpjDSprGIRYxe6sFFDrDOg4DhWZRnOa7s68pGaDDpbqrORxzPHXPbs55/1HTas
        DDGzKFmTG4hJ2GUZKqjPcQ+MAAAA
        

        Copy that into a text file (data.txt here) and pass it to base64 with the decode flag, and you’ll get the original binary:

        cat data.txt | base64 -d > data.bin

        Inspect it to see what kind of file it is:

        file data.bin
        data.bin: RIFF (little-endian) data, Web/P image
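
For the curious, here is roughly why `file` can tell: it looks for known byte patterns near the start of the file. This is only a sketch, assuming GNU coreutils (`base64 -d`, `head`, `tail`, `od`). The `header` variable is a made-up name holding the first 16 characters of the base64 text above, which decode to the first 12 bytes of the image.

```shell
# First 16 base64 characters of the output above; they decode to 12 bytes.
header='UklGRroEAABXRUJQ'

# Bytes 1-4: the RIFF container tag.
printf '%s' "$header" | base64 -d | head -c 4; echo      # prints RIFF

# Bytes 9-12: the format tag inside the container.
printf '%s' "$header" | base64 -d | tail -c 4; echo      # prints WEBP

# Dump all 12 bytes; bytes 5-8 in the middle are a little-endian size field.
printf '%s' "$header" | base64 -d | od -An -c
```

So the "RIFF ... WEBP" that `file` reports is sitting right there in the first line of the base64 text, just in encoded form.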

        Rename it so you can just double-click it to open it:

        mv data.bin data.webp

        Enjoy the surprise.

        You can also print files like that, scan them using OCR, and then restore them. A very inefficient way to do backups, but it works.
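
The encode/decode round trip from this comment can be checked end to end without needing an image on hand. A minimal sketch, assuming GNU coreutils; `sample.bin`, `sample.txt` and `restored.bin` are made-up file names used only for illustration.

```shell
# Sketch of the round trip, assuming GNU coreutils (base64, head, cmp, wc).
# sample.bin, sample.txt and restored.bin are made-up names for illustration.
head -c 1024 /dev/urandom > sample.bin      # 1 KiB of arbitrary binary data
base64 < sample.bin > sample.txt            # binary -> printable text
base64 -d < sample.txt > restored.bin       # text -> binary again

# cmp exits with status 0 only when the two files match byte for byte.
if cmp -s sample.bin restored.bin; then
    echo "round trip OK: files are identical"
else
    echo "round trip FAILED: files differ"
fi

# Compare sizes: the text version is roughly a third larger than the binary.
wc -c sample.bin sample.txt
```

The size difference is part of why this makes for an inefficient backup format: every 3 bytes of input cost 4 characters of text, plus line breaks.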

        • @cheese_greaterOP
          26 hours ago

          How is it representing it tho? Like, does it have woven in there an array of hex color codes for every microscopic pixel that makes up the picture?

          Are images and audio files just arrays of frames which are arrays of pixels and sound units?

          • Admiral Patrick
            6 hours ago

            It just re-encodes the raw bytes as printable text characters, so it doesn’t matter what the source is (image, video, database file, etc). It doesn’t know or care about pixels or audio samples. The source data is taken 6 bits at a time, and each group of 6 bits is mapped to one of 64 characters (A-Z, a-z, 0-9, + and /), so every 3 bytes of input become 4 characters of output.

            The decoding process is just the reverse of that: mapping the data back to binary form.

            https://en.wikipedia.org/wiki/Base64
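
The 6-bits-at-a-time mapping can be worked through by hand for a tiny input. This is a sketch rather than how the `base64` tool is actually implemented: it encodes the three bytes of the text "Man" (ASCII codes 77, 97, 110) using plain shell arithmetic, and the variable names are made up for illustration. It ignores the `=` padding that real base64 adds when the input length is not a multiple of 3.

```shell
# The standard base64 alphabet: index 0..63 -> character.
alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# ASCII codes of the three input bytes "M", "a", "n".
b1=77; b2=97; b3=110

# Glue the three bytes into one 24-bit number.
bits=$(( (b1 << 16) | (b2 << 8) | b3 ))

encoded=''
for offset in 18 12 6 0; do
    # Pull out one 6-bit group, a value from 0 to 63.
    index=$(( (bits >> offset) & 63 ))
    # Look that value up in the alphabet (cut counts characters from 1).
    char=$(printf '%s' "$alphabet" | cut -c "$(( index + 1 ))")
    echo "6-bit value $index -> $char"
    encoded="$encoded$char"
done

echo "by hand:     $encoded"
echo "base64 tool: $(printf 'Man' | base64)"
```

Both of the last two lines should show `TWFu`: three bytes in, four characters out.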