(Skeletor is leading by example by adding that unnecessary apostrophe…)

    • Toes♀
      162
      10 months ago

      To fuck with computers that don’t know how to do UTF8, add a few emoji.

      I once set a WiFi SSID to 🌻 and I was amazed at how many problems it seemed to cause. I had people showing me that their network manager was dumping random characters. Some other routers’ web interfaces became corrupted when trying to show the neighborhood. Some clients refused to connect. There was even a BSOD on a Windows XP box.

      • @Potatos_are_not_friends
        52
        10 months ago

        One of my projects was validation for form submission and emojis melted me. I gave up trying to do it from scratch and trusted a library.

        • AggressivelyPassive
          42
          10 months ago

          I’m currently in a project where the client has a custom, but not entirely consistent or known, subset of UTF-8.

          They want us to keep the form content as it is, but remove the “bad” characters. Our current approach is to just forward everything as it is and wait for someone to complain. How TF am I supposed to remove a character without changing the message?

          • Toes♀
            14
            10 months ago

            Yeah I had a backend with poor support for anything that wasn’t ASCII. So my solution was turning everything into hex before storing it. I wonder if people are still using it.
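
The hex workaround described in the comment above could look roughly like this. It is only a sketch in Python, not the poster's actual backend, and the helper names `to_hex` and `from_hex` are made up for the example.

```python
def to_hex(text: str) -> str:
    """Encode text as UTF-8 bytes, then as an ASCII-only hex string."""
    return text.encode("utf-8").hex()


def from_hex(stored: str) -> str:
    """Reverse to_hex: hex string -> bytes -> text."""
    return bytes.fromhex(stored).decode("utf-8")


stored = to_hex("café 🌻")
print(stored.isascii())   # the stored value only contains 0-9 and a-f
print(from_hex(stored))   # round-trips back to the original text
```

The trade-off is that the stored value takes two ASCII characters per byte and can no longer be searched or sorted as text by the backend.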

            • @[email protected]
              8
              10 months ago

              Yeah I had a backend with poor support for anything that wasn’t ASCII

              PHP is like this. Poor Unicode support, but it treats strings as raw bytes so it usually works well enough. It turns out a programming language can take data from a form, save it to a database, then later load and render it, without having to know what those bytes actually mean, as long as the app or browser knows it’s UTF-8, for example through a Content-Type header or meta tag.

              The tricky thing is that all the standard string manipulation functions (strlen, substr, etc.) don’t handle Unicode properly at all: they deal with the number of bytes rather than the number of characters. You need to use the “multibyte” (Unicode-ready) equivalents like mb_substr, but a lot of PHP developers forget to do this and end up with string truncation code that cuts UTF-8 characters in half (e.g. if it’s truncating a long title with Emoji in it, it might cut off the title in the middle of the three or four bytes that represent the Emoji and only leave some of them).
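
To illustrate the truncation problem from the comment above, here is a small sketch in Python rather than PHP. The function names are hypothetical. The first function cuts at a byte offset, the way a byte-based substr would; the second drops any partial character left at the end.

```python
def truncate_bytes_naive(text: str, max_bytes: int) -> bytes:
    """Byte-oriented cut; may end in the middle of a multi-byte character."""
    return text.encode("utf-8")[:max_bytes]


def truncate_bytes_safe(text: str, max_bytes: int) -> str:
    """Cut to at most max_bytes, dropping a partial trailing character.

    errors="ignore" is acceptable here only because the input was a valid
    str, so the only invalid bytes can be the cut-off tail.
    """
    cut = text.encode("utf-8")[:max_bytes]
    return cut.decode("utf-8", errors="ignore")


title = "Sunflower 🌻 incident"  # the emoji occupies bytes 10 to 13

broken = truncate_bytes_naive(title, 12)  # ends with half of the emoji
fixed = truncate_bytes_safe(title, 12)    # the half emoji is dropped
print(broken, repr(fixed))
```

This only protects individual code points. Emoji built from several code points (skin tones, ZWJ sequences) can still be split unless the cut is made on grapheme boundaries.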

        • @[email protected]
          5
          10 months ago

          You just need to ensure you validate character by character (NOT byte by byte) and allow characters in the Emoji Unicode ranges (which are well-defined in the Unicode standard). Using a library is a great idea though.
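
A minimal sketch of the character-by-character approach from the comment above, in Python. The ranges below are illustrative and not exhaustive, and the names `ALLOWED_RANGES`, `is_allowed` and `rejected_characters` are invented for this example. A real allow-list should be derived from the Unicode emoji data files or left to a library.

```python
# Illustrative, non-exhaustive allow-list of code point ranges (assumption).
ALLOWED_RANGES = [
    (0x0020, 0x007E),    # printable ASCII
    (0x00A0, 0x024F),    # Latin-1 Supplement, Latin Extended-A and -B
    (0x2600, 0x27BF),    # Miscellaneous Symbols and Dingbats
    (0x1F300, 0x1FAFF),  # span covering the main emoji blocks
                         # (it also contains some non-emoji symbols)
]
ALLOWED_SINGLES = {0x200D, 0xFE0F}  # zero-width joiner, variation selector-16


def is_allowed(ch: str) -> bool:
    """Check one code point (not one byte) against the allow-list."""
    cp = ord(ch)
    return cp in ALLOWED_SINGLES or any(lo <= cp <= hi for lo, hi in ALLOWED_RANGES)


def rejected_characters(text: str) -> list[str]:
    """Return the code points in text that fall outside the allow-list."""
    return [ch for ch in text if not is_allowed(ch)]


print(rejected_characters("Hello 🌻"))   # nothing rejected
print(rejected_characters("tab\there"))  # the tab character is rejected
```

Iterating over a Python `str` yields code points, which is what makes this a per-character check and not a per-byte one.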

      • @aeronmelon
        36
        10 months ago

        They called it “The Sunflower Incident.”

          • @[email protected]
            2
            10 months ago

            I believe it’s 32 bytes, but it depends on the AP, some use a null terminator as the final byte.
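
Assuming the limit being discussed here is the SSID length (the 802.11 SSID field holds at most 32 octets), the relevant measure is bytes, not characters. A quick sketch in Python, with `ssid_fits` as a made-up helper name:

```python
MAX_SSID_BYTES = 32  # SSID field limit in octets


def ssid_fits(ssid: str) -> bool:
    """True if the UTF-8 encoding of the SSID fits in the SSID field."""
    return len(ssid.encode("utf-8")) <= MAX_SSID_BYTES


print(ssid_fits("🌻"))       # 1 character, 4 bytes
print(ssid_fits("🌻" * 9))   # 9 characters, 36 bytes
```

An access point that reserves the final byte for a null terminator, as mentioned above, would effectively lower the limit to 31.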

          • Spaz
            2
            10 months ago

            64 characters long is the WiFi spec IIRC, but some routers don’t follow the spec, so I wouldn’t go higher than 60. Idk if this helps answer your question.

      • @[email protected]
        2
        10 months ago

        I had an emoji in my phone hotspot a while ago. Unfortunately I had to remove it after a while because some devices refused to connect.

    • @Ottomateeverything
      72
      10 months ago

      To make sure millenials can’t read your password, 𝔀𝓻𝓲𝓽𝓮 𝓹𝓪𝓻𝓽 𝓸𝓯 𝓲𝓽 𝓲𝓷 𝓬𝓾𝓻𝓼𝓲𝓿𝓮.

      How would this mess with millennials? I think you mean gen z.

      • Xhieron
        English
        81
        10 months ago

        Common mistake: When you’re ascribing a bad quality to them, “millennials” means everyone born after 1960. If you’re ascribing a good quality to them, it only means people born between December 12, 1989, and December 14, 1989.

        • @tool
          English
          5
          10 months ago

          We were told our assignments in high school would get an automatic zero if we didn’t turn them in in cursive, even…

          • @[email protected]
            7
            10 months ago

            I knew someone who did physics in cursive. It was impossible to read (not because it was sloppy, but because seeing Greek letters in cursive threw me for a loop).

        • @Bytemeister
          Ελληνικά
          3
          10 months ago

          Yeah! Most of us can read analog clocks too!

          • @[email protected]
            1
            10 months ago

            I actually work in an after-school program and I’ve been teaching kids how to read analog clocks. It is interesting, to say the least.

      • @proudblond
        English
        7
        10 months ago

        Even my gen alpha kid was learning cursive in third grade last year. I don’t expect him to write using it much but at least he knows how to read it.

      • slazer2au
        English
        3
        10 months ago

        𝔒𝔯 𝔶𝔢𝔬𝔩𝔡 𝔢𝔫𝔤𝔩𝔦𝔰𝔥 𝔱𝔬 𝔰𝔠𝔯𝔢𝔴 𝔴𝔦𝔱𝔥 𝔢𝔳𝔢𝔯𝔶𝔬𝔫𝔢.

    • The Picard ManeuverOP
      30
      10 months ago

      To make sure millenials can’t read your password, 𝔀𝓻𝓲𝓽𝓮 𝓹𝓪𝓻𝓽 𝓸𝓯 𝓲𝓽 𝓲𝓷 𝓬𝓾𝓻𝓼𝓲𝓿𝓮.

      Hey, millennials know cursive!

      • @[email protected]
        30
        10 months ago

        Forced to learn it in elementary school because “high school and college require it!” by Boomers who didn’t recognize the tech revolution, only to get to college and be told by those same boomers to never turn in a handwritten paper unless you wanted an auto fail.

        • AwkwardLookMonkeyPuppet
          English
          7
          10 months ago

          told by those same boomers

          Your elementary school teachers were also your college professors?

    • @nezbyte
      15
      10 months ago

      CSVs are supposed to be comma-separated files. Microsoft deviated from the specification and decided some languages would use semicolons for CSVs.

      Source: StackOverflow

      • @[email protected]
        6
        10 months ago

        Microsoft deviated from the specification

        There is no specification for CSV, which is why it’s such a mess and different parsers and renderers have wildly different features. The closest thing to a spec is RFC 4180, but that RFC simply describes the most common features across several CSV implementations and is not actually a spec.

        I agree that it should be comma separated though. My understanding is that it caused issues in countries that use a comma as a decimal point.

        Also, Excel sometimes uses tabs rather than commas or semicolons.
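
As a rough illustration of the decimal-comma issue mentioned above, here is a Python sketch that reads a semicolon-delimited file. The sample data and column names are invented for the example.

```python
import csv
import io

# Hypothetical export from a locale that uses "," as the decimal separator,
# which is why ";" is used to separate the fields.
data = "item;price\nwidget;1,50\ngadget;2,75\n"

rows = list(csv.DictReader(io.StringIO(data), delimiter=";"))

# Convert the decimal commas by hand; the csv module does not do this.
prices = {row["item"]: float(row["price"].replace(",", ".")) for row in rows}
print(prices)
```

The delimiter has to be known in advance here. Since nothing in the file itself declares it, a parser that guesses can guess wrong.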

      • @[email protected]
        5
        10 months ago

        Using a comma would probably have caused more problems, as it is the decimal separator for those languages. My Excel also uses semicolons instead of commas to separate parameters in formulas. Some VBA scripts break when using different language settings, and some formulas don’t translate automatically to a different locale, so they just give an error. Overall, using Excel in different locale setups is annoying.

        The best separator I have used is |, as I have never seen it in the data as an input. Commas and semicolons have both caused issues for me in the past, as they might pop up in the wrong places.
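
A pipe-delimited variant can be sketched with Python's standard `csv` module. The rows are made up; the point is that the writer quotes any field that happens to contain the delimiter, so even a stray | in the data survives a round trip.

```python
import csv
import io

rows = [
    ["id", "note"],
    ["1", "contains | a pipe"],
    ["2", "plain, with comma; and semicolon"],
]

# Write with "|" as the delimiter; fields containing "|" get quoted.
buffer = io.StringIO()
csv.writer(buffer, delimiter="|").writerows(rows)
text = buffer.getvalue()

# Read it back with the same delimiter.
parsed = list(csv.reader(io.StringIO(text), delimiter="|"))
print(parsed == rows)
```

Commas and semicolons inside the data need no quoting at all in this setup, which matches the experience described above.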

    • jawa21
      13
      10 months ago

      Here’s my confusion: as soon as it is no longer separated by commas, it is by definition no longer a CSV. Is it an SCSV now?

      • MrPasty
        12
        10 months ago

        It turns into a CSV where the C stands for character.

    • @rtxn
      English
      12
      10 months ago

      Z̵̫̖͚̳̖̖̰̩̀̆͐͒͝ä̸̛̻́̈́̌͂̽̈́l̷̤̥̖̝͙̅g̵̱̤͙͕̥̮͌̽o̸̡̦̙̬̘͎̪̥̔ ̴͔̙̞̱̗͒͊͊̽̀̑͌ẏ̵̛̻̾o̸̡͍̤͔͌ų̶̠͔̯̲̖͇̯̅̒̓̃̏̓͊r̷͎̪̗̤̄̊̃̚͝ ̵̢̰͔̀t̵̡̘̤̙͕͎̅͂͛̀̚ȩ̷͙̙̖̲̟͍̉̎͝x̷͇̦̝̼͗͋̊t̶̫̹̳̩͇̼̠͚̿͆̅̋̔̃͐͗!̶̧̛͕̮̻̞͎͇̹͆͛͘̕̚͠

    • Thomas
      9
      10 months ago

      To fuck with computers that don’t know how to do UTF8, add a few emoji.

      Even better, add some byte sequences that are invalid UTF-8.
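
For reference, this is roughly what invalid UTF-8 looks like to a strict decoder, sketched in Python. The sample byte strings and the helper name `is_valid_utf8` are chosen just for this example.

```python
truncated = b"sunflower \xf0\x9f\x8c"  # a 4-byte emoji missing its last byte
stray = b"abc\xffdef"                  # 0xFF never appears in valid UTF-8


def is_valid_utf8(data: bytes) -> bool:
    """Strict check: any malformed sequence makes the whole input invalid."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(truncated), is_valid_utf8(stray))

# A lenient decoder substitutes U+FFFD (the replacement character) instead.
lenient = stray.decode("utf-8", errors="replace")
print(lenient)
```

Software that neither rejects nor replaces such bytes is the kind most likely to misbehave on this input.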