I have recently become interested in mini PCs, but one thing that is stopping me is a feeling that bit rot could cause me to lose data.

Is bit rot something to worry about when storing data for services such as Git or Samba? I have another PC right now that is set up with btrfs raid1 and backups locally and to the cloud, but I was thinking about downsizing for the benefit of size and power usage.

I know many people use mini PCs such as ThinkCentres, OptiPlexes, EliteDesks and others. Should I be worried about losing data due to bit rot, or is bit rot a really rare occurrence?

Let’s say I have backups with a year of retention. Wouldn’t it be possible for data to become corrupt without anyone noticing until more than a year later? For example, archived data that I don’t look at often but might need in the future.

  • @markstos
    English
    3
    1 year ago

    You don’t define bitrot. If you leave software alone with no updates for long enough, yes, there will be problems.

    There will eventually be a security issue with no fix, or a new OS or hardware it doesn’t work on.

    Backups can also fail over time if restores are not tested periodically.

    This recently happened to me. A server wouldn’t boot anymore, so we restored from backup, but it still wouldn’t boot. The issue was that we’d introduced a change that caused a boot failure. To fix that by restoring from a backup, we’d need a backup from before that change. It turns out we had one, but at the time we didn’t realize what the issue was.

    The other moral is to reboot frequently if only to confirm the system can still boot.

    • @dragontamer
      English
      11
      edit-2
      1 year ago

      That’s not what storage engineers mean when they say “bitrot”.

      “Bitrot”, in the context of ZFS and Btrfs, means the situation where a stored “0” on a hard drive gets randomly flipped to “1” (or vice versa) while the data is at rest. It is a well-known problem and can happen within months. A 20 TB drive these days is a collection of 160 trillion bits, so there’s a high chance that at least some of those bits go bad over a period of double-digit months.
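
The 160 trillion figure is just 20 TB (decimal) times 8 bits per byte. A small sketch of that arithmetic follows, along with the standard formula for the chance of at least one flip among n independent bits. The function names and the per-bit probability used in the example are hypothetical and chosen only to illustrate the calculation; they are not a measured error rate for any real drive.

```python
import math


def drive_bits(capacity_tb: float) -> int:
    """Bits on a drive, using decimal terabytes (1 TB = 10**12 bytes)."""
    return int(capacity_tb * 10**12 * 8)


def prob_at_least_one_flip(n_bits: int, p_flip: float) -> float:
    """P(at least one flip) = 1 - (1 - p)^n, assuming independent bits.

    Computed with log1p/expm1 so tiny probabilities do not round to zero.
    """
    return -math.expm1(n_bits * math.log1p(-p_flip))


bits = drive_bits(20)
print(f"{bits:,} bits on a 20 TB drive")

# ASSUMPTION: 1e-15 is an illustrative per-bit probability over some
# period, not a vendor specification or a measurement.
print(f"{prob_at_least_one_flip(bits, 1e-15):.3f}")
```

The point is only that a very small per-bit probability multiplied by a very large number of bits stops being negligible.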

      Each problem has a solution. In this case, bitrot is “solved” by running regular scrubs on a checksumming filesystem because:

      1. Bitrot usually doesn’t happen within single-digit months, so scrubbing regularly, roughly every 6 months, nearly guarantees that any bitrot you find will be limited in scope: just a few bits at most.

      2. Filesystems like ZFS or Btrfs are designed to handle many, many bits of bitrot safely.

      3. Scrubbing is a process where you read, and if necessary restore, any files where bitrot has been detected.
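
For rarely touched archive data on a filesystem without built-in checksums, the detection half of a scrub can be approximated in userland. The sketch below is not how ZFS or Btrfs implement scrubbing; it is a minimal stand-in that records a SHA-256 hash per file and later re-reads every file to compare. The names `build_manifest` and `verify_manifest` are hypothetical, and repair is assumed to come from a separate backup.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def verify_manifest(root, manifest):
    """Re-read every file; return paths that are missing or whose hash changed."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            bad.append(rel)
    return bad
```

A hash mismatch cannot tell a flipped bit from a legitimate edit, so this only makes sense for data that is not supposed to change. Running the check more often than the backup retention period means a damaged file is noticed while a good copy still exists.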

      Of course, if your hard drives are of noticeably worse quality than expected (e.g. you see a large number of failures in a shorter time frame), or if you’re not using the right filesystem, or if you go too long between checks (e.g. taking 25 months to scrub for bitrot instead of just 6), then you might lose data. But we can only plan for the “expected” kinds of bitrot: the kinds that happen within 25 months, or 50 months, or so.

      If you’ve gotten screwed by a hard drive (or SSD) that bitrots away in like 5 days or something awful (maybe someone dropped the hard drive and the head scratched a ton of the data away), then there’s nothing you can really do about that.