I’m currently watching the progress of a 4TB rsync file transfer, and I’m curious why the speeds are less than the theoretical maximum read/write speeds of the drives involved. I know there’s a lot that can affect transfer speeds, so I guess I’m not asking why my transfer itself isn’t going faster. I’m more just curious what the typical bottlenecks could be.

Assuming a file transfer between 2 physical drives, and:

  • Both drives are internal SATA III drives with 210MB/s read/write (this was the mistake: I originally wrote 5.0Gb/s, reading the SATA III protocol speed as the disk speed)
  • files are being transferred using a simple rsync command
  • there are no other processes running

What would be the likely bottlenecks? Could the motherboard/processor likely limit the speed? The available memory? Or the file structure of the files themselves (whether they are fragmented on the volumes or not)?

  • Max-P
    16
    10 months ago

    SATA III speeds are quoted in gigabits (6Gb/s), not gigabytes, so the max speed is actually 600MB/s.
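To make the 600MB/s figure concrete, here is a small arithmetic sketch. It relies on SATA using 8b/10b line encoding (10 bits on the wire per 8 bits of data); the function name is made up for illustration.

```python
def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Usable payload rate in MB/s for a SATA link with 8b/10b encoding."""
    line_bits_per_second = line_rate_gbps * 1_000_000_000
    # 8b/10b encoding: only 8 of every 10 transmitted bits carry data.
    payload_bits_per_second = line_bits_per_second * 8 / 10
    # 8 bits per byte, then bytes -> decimal megabytes.
    return payload_bits_per_second / 8 / 1_000_000


for name, gbps in [("SATA I", 1.5), ("SATA II", 3.0), ("SATA III", 6.0)]:
    rate = sata_payload_mb_per_s(gbps)
    print(f"{name}: {gbps} Gb/s line rate -> {rate:.0f} MB/s payload")
```

This is the ceiling of the interface only. It says nothing about what the drive behind it can sustain.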

    What filesystem? For example, on my ZFS pool I had to let ZFS use a good chunk of my RAM for it to be able to cache things enough that rsync would max out the throughput.

    Rsync doesn’t process files in parallel, so at such speeds the cycle of opening files, reading chunks, writing chunks, closing files, and repeating can add up. So you want the kernel to buffer as much of it as possible.
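The open/read/write/close cycle described above can be sketched as a plain serial copy loop. This is a simplified illustration, not rsync's actual implementation; `copy_tree_serial` and the demo layout are made up for this example.

```python
import tempfile
import time
from pathlib import Path


def copy_tree_serial(src: Path, dst: Path, chunk_size: int = 1024 * 1024) -> int:
    """Copy every file under src into dst, one file at a time; return bytes copied."""
    total = 0
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        # One open/read/write/close cycle per file. Nothing overlaps:
        # while we write, the source disk is idle, and vice versa,
        # unless the kernel's caches smooth it out.
        with open(path, "rb") as fin, open(target, "wb") as fout:
            while True:
                chunk = fin.read(chunk_size)
                if not chunk:
                    break
                fout.write(chunk)
                total += len(chunk)
    return total


def demo() -> None:
    """Copy 200 small files and report how long the serial loop took."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        src, dst = Path(a), Path(b)
        for i in range(200):
            (src / f"file_{i:03d}.bin").write_bytes(b"x" * 4096)
        start = time.perf_counter()
        copied = copy_tree_serial(src, dst)
        print(f"copied {copied} bytes in {time.perf_counter() - start:.4f}s")


if __name__ == "__main__":
    demo()
```

With many small files, the per-file bookkeeping dominates rather than raw disk throughput, which is one reason a transfer can sit well below the drive's sequential speed.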

    If you look at the disk graphs of both disks, you’ll probably see a read spike on the source followed by a write spike on the target, instead of a smooth maxed-out curve. If so, the solution is increasing buffers and caching. Depending on the distro, there’s a sysctl that may be set by default to limit the size of caches, to prevent the “I wrote a 4GB file to my USB stick and now there’s 4GB of RAM used for it and it takes hours after finishing the transfer before it’s flushed to the stick” problem.
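The comment doesn't name the sysctl. As an assumption, the sketch below reads the Linux `vm.dirty_*` writeback settings, which control how much unwritten ("dirty") data the kernel will hold in RAM before flushing it. Whether these are the settings meant above is a guess; `read_dirty_limits` is a made-up helper.

```python
from pathlib import Path

# Assumption: these are the writeback knobs relevant to the comment above.
DIRTY_KNOBS = (
    "dirty_background_ratio",
    "dirty_background_bytes",
    "dirty_ratio",
    "dirty_bytes",
)


def read_dirty_limits(base: str = "/proc/sys/vm") -> dict:
    """Return the current vm.dirty_* values as integers, skipping any that can't be read."""
    limits = {}
    for name in DIRTY_KNOBS:
        try:
            limits[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Not Linux, or the knob is missing or unreadable: skip it.
            continue
    return limits


for name, value in read_dirty_limits().items():
    print(f"vm.{name} = {value}")
```

The `*_ratio` values are percentages of memory and the `*_bytes` values are absolute sizes. In each pair, a value of 0 means the other form is the one in effect.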

    • archomrade [he/him]OP
      4
      10 months ago

      SATA III is gigabit, so the max speed is actually 600MB/s.

      My mistake, though still, a 4TB transfer should take less than 2hr at 5Gb/s (IN THEORY). Thank you @[email protected] for pointing this out a second time elsewhere: 6Gb/s is what the SATA III interface is capable of, NOT what the DRIVE is capable of. The marketing material for this drive clearly psyched me out; the actual transfer speed is 210MB/s.
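As a rough check on those numbers, here is a best-case estimate of how long 4TB takes at each rate. It assumes decimal units (4TB = 4 × 10^12 bytes), that the drive's 210 figure is in megabytes per second, and that the rate is sustained for the whole transfer.

```python
def transfer_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Best-case transfer time in hours at a constant rate."""
    return size_bytes / rate_bytes_per_s / 3600


SIZE_BYTES = 4 * 10**12  # 4TB, decimal

rates = {
    "5Gb/s (protocol figure misread as drive speed)": 5e9 / 8,
    "600MB/s (SATA III payload ceiling)": 600e6,
    "210MB/s (drive's advertised rate)": 210e6,
}

for label, rate in rates.items():
    print(f"{label}: {transfer_hours(SIZE_BYTES, rate):.1f} h")
```

So "less than 2hr" holds for the interface figure, while the drive's own rate puts the floor at a bit over five hours before any other overhead.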

      The filesystem is ext4, shared over SMB… OMV has a fair amount of RAM allocated to it, like 16GB or something gratuitous. I’m guessing the way rsync does its transfers is the culprit, and I honestly can’t complain because the integrity of the transfer is crucial.

        • archomrade [he/him]OP
          2
          10 months ago

          Thanks, corrected my comment above.

          I’m interested in ksmbd… I chose SMB simply because I was using it across Linux/Windows/Mac devices and I was using OMV for managing it, but that doesn’t mean I couldn’t switch to something better.

          Honestly though, I don’t need faster transfers typically, I just happen to be switching out a drive right now. SMB through OMV has been perfectly sufficient otherwise.

          • @[email protected]M
            4
            10 months ago

            ksmbd is still SMB, except it’s implemented within the Linux kernel. As a result, file transfer speeds are improved greatly compared to pure Samba, which runs only in userspace.

            The second thing is, you need to check which SMB protocol version you’re using. Ideally you’d want to use at least SMB 3; anything older than that will be painfully slow.

            Finally, I read in your other comment that you’re using spinning disks and a USB dock. That adds significant overhead.

            The Ironwolf drive benchmarks at 250MB/s at the start and slows down to 100MB/s as it reaches the end of the drive (spinning disks gradually become slower the fuller they get). Now add file fragmentation plus filesystem overheads (buffers, cluster size allocation, etc.) and the speeds could go down considerably.

            Then there’s your SATA > USB dock - no dock would ever reach 5Gbps, that’s just false advertising - it’s only quoting the theoretical protocol speed. In reality, you’d see write speeds somewhere below 100MB/s for 128k sequential writes, and if your block size is smaller, expect far slower writes.
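To see the block-size effect on your own hardware, a rough sequential-write timing can be sketched as below. It is a minimal sketch, not a proper benchmark: `sequential_write_mb_per_s` is a made-up helper, the demo writes to the system temp directory (which is probably not the docked drive, so point `directory` at the drive's mount point instead), and small totals mostly measure cache and `fsync` behaviour.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory: str, block_size: int, total_bytes: int) -> float:
    """Write about total_bytes sequentially in block_size writes; return MB/s including fsync."""
    block = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        # buffering=0 so each write() call reaches the OS at the chosen block size.
        with os.fdopen(fd, "wb", buffering=0) as f:
            written = 0
            while written < total_bytes:
                written += f.write(block)
            os.fsync(f.fileno())  # force the data out of the page cache
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    return written / elapsed / 1_000_000


if __name__ == "__main__":
    target_dir = tempfile.gettempdir()  # assumption: replace with the drive's mount point
    for block_size in (4 * 1024, 128 * 1024, 1024 * 1024):
        rate = sequential_write_mb_per_s(target_dir, block_size, 16 * 1024 * 1024)
        print(f"{block_size // 1024:>5} KiB blocks: {rate:.1f} MB/s")
```

For numbers comparable to published benchmarks, a dedicated tool with a much larger test size would be more trustworthy than this loop.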

            Combine all of the above and you can imagine just how much slower this whole thing can be.

            For reference, see this benchmark as an example, to see what’s “normal” for a simple file transfer to a blank drive with no fragmentation: https://www.anandtech.com/show/6014/startechcom-usb-30-to-sata-ide-hdd-docking-station-review/3