• Sheridan
    English
    115 hours ago

    deleted by creator

    • CodexArcanum
      English
      95 hours ago

      I’m hardly the king of databases, but always using a surrogate key (either an auto-incremented integer or a random UUID) has done me pretty well over the years. I had to engineer a combination of sequential timestamp with a hash extension as a key for one legacy system (keys had to be unique but mostly sequential), and an append-only log store would have been a better choice than an RDBMS, but sometimes you make it work with what you have.
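
      To make the timestamp-plus-hash idea concrete, here is a rough sketch, not the actual legacy implementation. The function name `make_key`, the 20-digit padding, and the 8-character SHA-256 suffix are all assumptions picked for illustration.

```python
import hashlib
import time
from typing import Optional


def make_key(payload: bytes, timestamp_ns: Optional[int] = None) -> str:
    """Build a key that sorts by creation time, with a hash suffix to tell apart
    rows that share a timestamp. (Hypothetical helper, for illustration only.)"""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    # Zero-pad so that plain string ordering matches numeric ordering.
    prefix = f"{timestamp_ns:020d}"
    # Hash the timestamp together with the row content; 8 hex chars is arbitrary.
    digest = hashlib.sha256(prefix.encode() + payload).hexdigest()[:8]
    return f"{prefix}-{digest}"


earlier = make_key(b"first row", timestamp_ns=1)
later = make_key(b"second row", timestamp_ns=2)
```

      Keys built this way sort by timestamp first, so they are "mostly sequential". Two rows with the same timestamp and identical content would still collide, so a real system would need a counter or random salt on top.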

      Natural keys are almost always a bad idea though. SSNs aren’t natural, which is one pitfall: implicitly relying on someone else’s data practices by assuming their keys are natural. But also, nature is usually both more unique than you want (every snowflake is technically unique) and less than you’d hoped (all living things share quite a lot of DNA). Which means you end up relying on how good your taxonomy is for uniqueness. As opposed to surrogate keys, which you can assure the uniqueness of, by definition, for your needs.
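
      As a small sketch of what that looks like in practice (using SQLite only because it ships with Python; the `person` table, its columns, and `add_person` are made up for the example), the keys are generated by the database and by us, and the SSN is just another attribute:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, generated by the database
        public_id TEXT NOT NULL UNIQUE,        -- random UUID, generated by us
        name TEXT NOT NULL,
        ssn TEXT                               -- plain attribute: may be missing or duplicated
    )
    """
)


def add_person(name, ssn=None):
    """Insert a row and return its surrogate key (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO person (public_id, name, ssn) VALUES (?, ?, ?)",
        (str(uuid.uuid4()), name, ssn),
    )
    conn.commit()
    return cur.lastrowid


# "000-00-0000" is a placeholder, not a real SSN.
first = add_person("Alex Doe", "000-00-0000")
second = add_person("Alex Doe", "000-00-0000")  # same "natural" data, still a distinct row
```

      Both inserts succeed and get different keys, because uniqueness comes from values we generate rather than from someone else's data.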

      • Sheridan
        English
        4
        4 hours ago

        deleted by creator

        • CodexArcanum
          English
          13 hours ago

          For a minute I thought you were talking about Elon and was completely following.

          Ah, so data exchange between databases is a slightly different matter. First, I would recommend against using SSN as a number you exchange back and forth to sync data, as it exposes you to a greater risk of data theft. However, forcing your internal DB keys onto someone else is a synchronization nightmare. Your internal data schema might look totally different even for the data overlap you might have.

          My usual suggestion would be to assign a random UUID to each person and then just agree with each other on either which system creates new records (originates IDs) or which system has priority (is the system of record) for people if there’s a collision. Ultimately, you’d end up needing to compare names and addresses or phone numbers or birthdays etc. to unify records anyway. SSN is an easy cheat that gets you a lot of the way there (for Americans born after 1935), but like I said, it’s a security/legality risk and you still actually need to check that other stuff to verify anyway.
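
          A very simplified sketch of that flow is below. The `Person` fields, `match_key`, and `unify` are all invented for illustration, and exact matching on normalized fields stands in for the much fuzzier comparison a real system needs.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    birthday: str  # ISO date string, e.g. "1815-12-10"
    phone: str
    # Random UUID used as the shared identifier between the two systems.
    shared_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def match_key(p):
    """Normalize the fields we compare on (a naive stand-in for real matching)."""
    name = " ".join(p.name.lower().split())
    digits = "".join(ch for ch in p.phone if ch.isdigit())
    return (name, p.birthday, digits)


def unify(system_of_record, other):
    """Copy shared_ids from the system of record onto matching records in
    `other`; return the records that found no match."""
    by_key = {match_key(p): p for p in system_of_record}
    unmatched = []
    for p in other:
        hit = by_key.get(match_key(p))
        if hit is not None:
            p.shared_id = hit.shared_id  # the system of record wins
        else:
            unmatched.append(p)
    return unmatched


ours = [Person("Ada Lovelace", "1815-12-10", "555-0100")]
theirs = [
    Person("ada  lovelace", "1815-12-10", "(555) 0100"),
    Person("Grace Hopper", "1906-12-09", "555-0199"),
]
leftover = unify(ours, theirs)
```

          Here the first record in `theirs` picks up the UUID from `ours`, and the second lands in `leftover` for manual review or creation as a new record.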

          There’s a reason why systems that join person records together are big business (mostly for advertising!). It’s tricky stuff.