I’ve been running into several problems restoring MySQL backups. The backups come from a different environment than the one I’m working in, so I have to strip out the commands in them that require superuser privileges.

The problem is that when I try to remove those commands, I constantly get UTF-8 encoding errors because the files contain loads of invalid character sequences.

Why would MySQL encode a backup as UTF-8 if the data isn’t actually UTF-8? This feels like bad design to me.

  • @folekaule
    54 days ago

    Character encoding and type coercion errors are so common. But a lot of bugs also come from programs trying to do “the right thing”. Take OP’s case: they are just trying to import some data, and maybe the data was never even intended to be interpreted as UTF-8, but the tool they are using to remove the commands insists on treating it that way. Sometimes the safest thing to do is to just assume data is binary until you have a reason to care otherwise.
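
To illustrate the "assume data is binary" suggestion, here is a minimal Python sketch that filters a dump without ever decoding it. The file is opened in binary mode and matched with a bytes regex, so invalid UTF-8 sequences pass through untouched instead of raising errors. The pattern and the function names (`PRIVILEGED`, `strip_privileged_lines`, `filter_dump`) are hypothetical placeholders: the thread does not say which statements OP needs to remove, so the regex would have to be adapted to the actual dump. It is also purely line-based, so a statement that spans several lines would need extra handling.

```python
import re

# Hypothetical example pattern: the statements that actually need
# elevated privileges depend on the dump, so adjust this as needed.
PRIVILEGED = re.compile(rb"^\s*SET\s+@@SESSION\.SQL_LOG_BIN\b", re.IGNORECASE)


def strip_privileged_lines(lines):
    """Yield only the byte lines that do not match PRIVILEGED."""
    for line in lines:
        if not PRIVILEGED.match(line):
            yield line


def filter_dump(src_path, dst_path):
    """Copy src_path to dst_path, dropping matching lines.

    Both files are opened in binary mode, so no decoding happens and
    bytes that are not valid UTF-8 are written back unchanged.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.writelines(strip_privileged_lines(src))
```

Calling `filter_dump("backup.sql", "backup.filtered.sql")` (both paths are placeholders) would then produce a copy that can be inspected or imported, with everything except the matched lines preserved byte for byte.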