cross-posted from: https://programming.dev/post/9319044

Hey,

I am planning to implement authenticated boot inspired by Pid Eins’ blog. I’ll be using pam_mount for /home/user. I need to check the integrity of all partitions.

I have been using luks+ext4 till now. I am hesitant to switch to zfs/btrfs, afraid I might fuck up. A while back I accidentally purged ‘/’ trying out timeshift, which was my fault.

Should I use zfs/btrfs for /home/user? As for root, I’m considering luks+(zfs/btrfs) to be restorable to blank state.

    • @[email protected]
      12
      10 months ago

      Same here, but for only 1 year on my main machine and 6 years on my laptop. I looove snapper. It saved my ass so many times

      • @[email protected]
        8
        10 months ago

        Yes it is great. For me snapper rollback was an awesome onboarding experience to Linux. Being eager to try things I read online for tweaks and general exploration, it brought me back to a working system after some custom kernel compiling gone awry, or deleting the wrong file etc.

        • 𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍
          8
          10 months ago

          I’ve been on btrfs for so many years, with nightly backups with restic, so I’ve been dragging my feet on snapper. Finally installed it a couple weeks ago, and while I opened the config, I don’t think I changed anything. It’s worked so well, and the Arch package was so well done, that I’d forgotten I had it installed until a few days later I noticed that it was taking snapshots every time before I installed something. It’s shockingly good, and I don’t understand why btrfs+snapper(+grub-btrfs) isn’t the default on installs now.

  • @Samueru
    18
    10 months ago

    Been using Btrfs for a year. I once had an issue where my filesystem went read-only. I went to the Btrfs subreddit, and after some troubleshooting it turned out that my SSD was dying. I couldn’t believe it at first because my SMART report was perfectly clean and the SSD was only 2 years old; then a few hours later SMART began reporting thousands of dead sectors.

    The bloody thing was better than SMART at detecting a dying SSD lol.

  • @[email protected]
    16
    10 months ago

    Luks+btrfs with Arch as daily driver for 3 years now, mostly coding and browsing. Not a single problem so far :D

  • @[email protected]
    15
    10 months ago

    After 4 years on btrfs I haven’t had a single issue, I never think about it really. Granted, I have a very basic setup. Snapper snapshots have saved me a couple of times, that aspect of it is really useful.

  • @[email protected]
    9
    10 months ago

    Can’t vouch for ZFS, but btrfs is great!

    You can mount root, log, and home on different subvolumes; they behave practically like separate partitions while still sharing the same pool of free space.

    I would also take system snapshots while the system is still running with one command. No need to exclude the home or log directories, nor the pseudo fs (e.g. proc, sys, tmp, dev).

  • @[email protected]
    9
    10 months ago

    My experiences:

    ZFS: never even tried because it’s not integrated (license).

    Btrfs: iirc I’ve tried it three times. Several years ago now. On at least two of those tries, after maybe a month or so of daily driving, suddenly the fs goes totally unresponsive and because it’s the entire system, could only reboot. FS is corrupted and won’t recover. There is no fsck. There is no recovery. Total data loss. Start again from last backup. Haven’t seen that since reiserfs around 2000. Found lots of posts with similar error message. Took btrfs off the list of things I’ll be using in production.

    I like both from a distance, but still use ext*. Never had total data loss that wasn’t a completely electrically dead drive with any version I’ve used since 1995.

    • Possibly linux
      English
      7
      10 months ago

      Btrfs has come a long way in the last few years. I have been using it for a little over 5 years and it’s rock solid. It now powers all my bare metal machines and I use RAID 1 on my servers.

      There was one time I had a disk unexpectedly go bad (it started returning bad data on read), which led to the system going read-only. It took me about 5 minutes to swap disks and it was fine. Needless to say I was impressed that no data was lost.

      Btrfs normally won’t get corrupted unless you have a hardware issue. It uses CoW, so writes can never be half completed. If you do manage to get corruption you can use btrfs check.

      • @TCB13
        English
        3
        10 months ago

        Btrfs normally won’t get corrupted unless you have a hardware issue. It uses CoW, so writes can never be half completed. If you do manage to get corruption you can use btrfs check.

        From my experience BTRFS is way more reliable against hardware failure than ext4 ever was. Ext* filesystems tend to get corrupted at the first and smallest power loss or hardware failure.

    • 𝒍𝒆𝒎𝒂𝒏𝒏
      7
      10 months ago

      Ouch, that must have been a pain to recover from…

      I’ve had almost the opposite experience to yours funnily. Several years ago my HDDs would drop out at random during heavy write loads, after a while I narrowed down the cause to some dodgy SATA power cables, which sadly I could not replace at the time. Due to the hardware issue I could not scrub the filesystem successfully either. However I managed to recover all my data to a separate BTRFS filesystem, using some “restore” utility that was mentioned in the docs, and to the best of my knowledge all the recovered data was intact.

      While that past error required a separate filesystem to perform the recovery, my most recent hardware issue with drives dropping out didn’t need any recovery at all - after resolving the hardware issue (a loose power connection) BTRFS pretty much fixed itself during a scheduled scrub and spat out all the repairs in dmesg.

      I would suggest enabling some kind of monitoring on BTRFS’s counters if you haven’t, because the fs will do whatever it can to prevent interruption to operations. In my previous two cases, performance was pretty much unaffected, and I only noticed the hardware problems due to the scheduled scrub & balance taking longer or failing.
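
      A minimal sketch of that kind of monitoring, assuming the counters come from `btrfs device stats <mountpoint>`, which prints one `[device].counter value` line per counter. The helper name `nonzero_counters` and the sample text are made up for illustration; a real setup would feed in the command's actual output (for example via `subprocess`) from a cron job or timer.

```python
import re

# Matches lines such as "[/dev/sda1].corruption_errs  0"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_counters(stats_output):
    """Return {(device, counter): value} for every counter above zero."""
    problems = {}
    for line in stats_output.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match["value"]) > 0:
            problems[(match["dev"], match["counter"])] = int(match["value"])
    return problems


# Hypothetical sample in the format printed by `btrfs device stats`.
SAMPLE = """\
[/dev/sda1].write_io_errs    0
[/dev/sda1].read_io_errs     3
[/dev/sda1].flush_io_errs    0
[/dev/sda1].corruption_errs  12
[/dev/sda1].generation_errs  0
"""

if __name__ == "__main__":
    for (dev, counter), value in nonzero_counters(SAMPLE).items():
        print(f"{dev}: {counter} = {value}")
```

      Anything this reports is worth a look well before the next scheduled scrub.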

      Don’t run a fsck - BTRFS essentially does this to itself during filesystem operations, such as a scrub or a file read. The provided btrfs check tool (fsck) is for the internal B-tree structure specifically AFAIK, and irreversibly modifies the filesystem internally in a way that can cause unrecoverable data loss if the user does not know what they are doing. Instead of running fsck, run a scrub - it’s an online operation that can be done while the filesystem is still mounted.

      • Possibly linux
        English
        4
        10 months ago

        DO NOT RUN A SCRUB IF YOU SUSPECT HARDWARE FAILURE.

        No seriously. If you are having hardware issues a scrub could make the corruption much worse. You should first make a complete copy of your data and then run btrfs check. Sorry for shouting but it is really important you don’t scrub a bad disk.

    • @waigl
      English
      3
      10 months ago

      Several years ago now. On at least two of those tries, after maybe a month or so of daily driving, suddenly the fs goes totally unresponsive and because it’s the entire system, could only reboot. FS is corrupted and won’t recover. There is no fsck. There is no recovery. Total data loss.

      Could you narrow it down to just how long ago? BTRFS took a very long time to stabilise, so that could possibly make a difference here. Also, do you remember if you were using any special features, especially RAID, and if RAID, which level?

      • @[email protected]
        2
        10 months ago

        I could see if there’s notes somewhere. Very plain desktop and laptop. Probably encrypted LVM. At least one was doing a lot of software builds with big system image trees and snapshots.

      • Chewy
        4
        10 months ago

        https://www.suse.com/support/kb/doc/?id=000018769

        WARNING: Using ‘--repair’ can further damage a filesystem instead of helping if it can’t fix your particular issue.

        Edit:

        It is extremely important that you ensure a backup has been created before invoking ‘--repair’.

        • @[email protected]
          4
          10 months ago

          That is a caveat with OS disk tools. Even partition resizing gives this warning, as does Windows checkdisk… something about unnecessary disk checks should be avoided as they can create issues where none might have existed, so only run when you suspect a problem.

          But as lemann pointed out in this thread, btrfs scrub is less risky.

  • @chili1553
    7
    10 months ago

    I think zfs is a pretty cool guy. Eh copy on write and doesn’t afraid of anything

  • rhys the great
    7
    10 months ago

    @unhinge I run a simple 48TiB zpool, and I found it easier to set up than many suggest and trivial to work with. I don’t do anything funky with it though, outside of some playing with snapshots and send/receive when I first built it.

    I think I recall reading about some nuance around using LUKS vs ZFS’s own encryption back then. Might be worth having a read around comparing them for your use case.

    • @[email protected] OP
      2
      10 months ago

      afaik openzfs provides authenticated encryption while luks integrity is marked experimental (as of now in man page).

      openzfs also doesn’t reencrypt dedup blocks if dedup is enabled (see Tom Caputi’s talk), but dedup can just be disabled

  • @rtxn
    English
    6
    10 months ago

    My experience with btrfs is “oh shit I forgot to set up subvolumes”. Other than that, it just works. No issues whatsoever.

    • @[email protected] OP
      3
      10 months ago

      oh shit I forgot to set up subvolumes

      lol

      I’m also planning on using its subvolume and snapshot features. Since zfs also supports native encryption, it’ll be easier to manage subvolumes for backups

  • @[email protected]
    6
    10 months ago

    At some point, long ago, the Ubuntu installer was offering to use zfs for the boot and root partitions. That sounded like a good idea and worked great for a long time: automatic snapshots, options to restore state at boot etc.

    Until my generous boot partition started to run out of space with all the snapshots (which were set up automatically, with no obvious way to configure them). OK, no big deal: write a bash script that finds the old snapshots and deletes them, and run it manually whenever boot is full again.

    Then one day recently my laptop wouldn’t boot anymore; Grub could no longer read the zfs on boot. Managed to boot with a USB installation image, read the zfs and chroot. Tried a lot of things but in the end killed zfs and replaced it with ext4. Then made it boot again.

    Apparently I’m not the only one with this issue.
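
    For anyone in the same spot, the selection logic of such a cleanup script might look roughly like this. It is a Python sketch rather than the bash script mentioned above, and the snapshot names, `keep_latest` and `max_age_days` are hypothetical. It only decides what to remove; the actual deletion (e.g. `zfs destroy <snapshot>`) is deliberately left out.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots, now, keep_latest=3, max_age_days=30):
    """Pick snapshot names that are safe to prune.

    snapshots: list of (name, created) tuples, created being a datetime.
    The `keep_latest` newest snapshots are always kept; of the remainder,
    only those older than `max_age_days` are returned, newest first.
    """
    newest_first = sorted(snapshots, key=lambda snap: snap[1], reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    return [name for name, created in newest_first[keep_latest:] if created < cutoff]


if __name__ == "__main__":
    # Hypothetical snapshot list; real names and dates would come from the pool.
    example = [
        ("bpool/BOOT/ubuntu@snap-a", datetime(2023, 12, 1)),
        ("bpool/BOOT/ubuntu@snap-b", datetime(2024, 1, 1)),
        ("bpool/BOOT/ubuntu@snap-c", datetime(2024, 2, 20)),
        ("bpool/BOOT/ubuntu@snap-d", datetime(2024, 2, 25)),
        ("bpool/BOOT/ubuntu@snap-e", datetime(2024, 2, 28)),
    ]
    for name in snapshots_to_delete(example, now=datetime(2024, 3, 1)):
        print("would delete", name)
```

    Keeping a few recent snapshots unconditionally means a long gap between updates never leaves the system with nothing to roll back to.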

  • 0x0
    6
    10 months ago

    I did my first BTRFS setup over the weekend. I followed the Arch wiki to set up what I thought was RAID 1 only to find out nearly a TB of copying later that it was splitting the data between the drives, not mirroring them (only the metadata was in R1.) One command later and I’d converted the filesystem to true RAID 1. I feel like any other system would require a total redo of the entire FS, but BTRFS did it flawlessly.

    I’m still confused, however, as it seems RAID 1 only works with two drives from what I’ve read. Is that true? Why?
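
    As far as I can tell from the btrfs documentation, the raid1 profile keeps exactly two copies of each chunk on two different devices, however many devices are in the filesystem, so it is not limited to two drives. Under that assumption, a rough capacity estimate looks like the sketch below; the function name is made up, and it ignores metadata overhead and chunk granularity.

```python
def raid1_usable(sizes):
    """Estimate usable space for btrfs raid1 given a list of device sizes.

    Every chunk is stored twice, on two different devices, so capacity is
    capped both by half the total and by what the other devices can mirror
    for the largest one.
    """
    if len(sizes) < 2:
        raise ValueError("raid1 needs at least two devices")
    total = sum(sizes)
    return min(total / 2, total - max(sizes))


if __name__ == "__main__":
    # Sizes in TB, purely illustrative.
    for layout in ([4, 4], [4, 4, 4], [8, 2, 2]):
        print(layout, "->", raid1_usable(layout), "TB usable")
```

    With three equal drives the estimate is half the total, while one oversized drive is limited by how much the smaller ones can mirror.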

  • @[email protected]
    English
    3
    10 months ago

    I haven’t used them professionally but I’ve been using ZFS on my home router (OPNsense) and NAS (TrueNAS with RAID-Z2) for many years without problem. I’ve used Btrfs on laptops and desktops with OpenSUSE Tumbleweed for the past year and a bit, also without problem. Btrfs snapshots have saved me a couple of times when I messed something up. Both seem like solid filesystems for everyday use.

  • @0000
    3
    10 months ago

    Been using BTRFS on a couple NAS servers for 4+ years. Also did raid1 BTRFS over two USB hard drives connected to a Pi4 (yes this should be absolutely illegal).

    The USB raid1 had a couple checksum errors that were easily fixed via scrub last year and the other two servers have been running without any issues. I assume it’s been fine since they’re all connected to a UPS and since I run weekly scrubs.

    I enjoyed CoW and snapshots so much that I’ve been using it on my main Arch install’s (I use Arch btw) root drive and storage drives (in BTRFS raid1) for the last 4 months without issue.

  • SavvyWolf
    English
    3
    10 months ago

    Many many years ago I set up btrfs for the disks I write my backups to with a raid 1 config for them. Unfortunately one of those disks went bad and ended up corrupting the whole array. Makes me wonder if I set it up correctly or not.

    Nowadays, I have the following disks in my system set up as btrfs:

    • My backups disk because of compression.
    • My OS drive because of Timeshift.
    • My home folder because it feels safer. COW feels like it’ll handle power failures better, whilst there’s also checksumming so I can identify corrupted files.
    • My SSD Steam library over two drives because life is short and I cba managing the two ssds independently.

    It’s going fine, but it feels like I need to manually run a balance every once in a while when the disk fills up.

    I also like btrfs-assistant for managing the devices.

    Out of interest, since I’ve not used the “recommended partition setup” for any install for a while now, is ext4 still the default on most distros?

    • Quazatron
      4
      10 months ago

      My SSD Steam library over two drives because life is short and I cba managing the two ssds independently.

      You do know that Steam handles multiple libraries transparently, even on removable drives?

      • SavvyWolf
        English
        1
        10 months ago

        I know they all show up in the same interface and I can move games between drives in the storage interface.

        But I don’t want to deal with having to shuffle things around to install a 40GiB game where both drives only have 30GiB free. Or having to remember which of the two drives has a specific game on when I want to find their files.

        It also gives a possibly-insignificant speed boost and extra cool points.

        • Quazatron
          2
          10 months ago

          Can’t argue with cool points.

    • @waigl
      English
      3
      10 months ago

      Out of interest, since I’ve not used the “recommended partition setup” for any install for a while now, is ext4 still the default on most distros?

      I recently installed Nobara Linux on an additional drive, because after 20 years, I wanted to give Linux gaming another shot (works a lot better than I had hoped for, btw), and it defaulted to btrfs. I’ll assume so does Fedora, because I cannot imagine Nobara changed that part over the Fedora base for gaming purposes.

      • @[email protected]
        English
        4
        10 months ago

        Fedora does, with compression enabled. It’s one of the largest divergences from Red Hat since Red Hat doesn’t support it at all. openSUSE does also.