I’ve noticed recently that my network speed isn’t what I would expect from a 10Gb network. For reference, I have a Proxmox server and a TrueNAS server, both connected to my primary switch with DAC. I’ve tested the speed by transferring files from the NAS with SMB and by using OpenSpeedTest running on a VM in Proxmox.

So far, this is what my testing has shown:

  • Using a Windows PC connected directly to my primary switch with CAT6: OpenSpeedTest shows around 2.5-3Gb/s to Proxmox, which is much slower than I’d expect. Transferring a file from my NAS hits a max of around 700-800MB/s (bytes, not bits), which is about what I’d expect given hard drive speed and overhead.
  • Using a Windows VM on Proxmox: OpenSpeedTest shows around 1.5-2Gb/s, which is much slower than I would expect. I’m using VirtIO network drivers, so I should realistically only be limited by CPU; it’s all running internally in Proxmox. Transferring a file from my NAS hits a max of around 200-300MB/s, which is still unacceptably slow, even given the HDD bottleneck and SMB overhead.
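Since the numbers above mix megabytes per second (file copies) with gigabits per second (OpenSpeedTest), here is a small sketch that puts them on one scale. It assumes decimal units (1 MB = 8,000,000 bits) and ignores protocol overhead; the function names are made up for illustration.

```python
# Convert between megabytes per second (file-copy dialogs) and
# gigabits per second (link speeds, OpenSpeedTest).

LINK_GBPS = 10.0  # nominal speed of the 10Gb link from the post


def mbytes_to_gbits(mbytes_per_s: float) -> float:
    """MB/s -> Gb/s using decimal units (1 MB = 8,000,000 bits)."""
    return mbytes_per_s * 8 / 1000


def share_of_link(gbits_per_s: float, link_gbps: float = LINK_GBPS) -> float:
    """Fraction of the nominal link speed, ignoring protocol overhead."""
    return gbits_per_s / link_gbps


if __name__ == "__main__":
    # SMB figures quoted in the post, as (low, high) ranges in MB/s.
    smb_results = {
        "NAS -> bare-metal PC": (700, 800),
        "NAS -> Windows VM": (200, 300),
    }
    for label, (low, high) in smb_results.items():
        low_g, high_g = mbytes_to_gbits(low), mbytes_to_gbits(high)
        print(
            f"{label}: {low}-{high} MB/s = {low_g:.1f}-{high_g:.1f} Gb/s "
            f"({share_of_link(low_g):.0%}-{share_of_link(high_g):.0%} of 10Gb)"
        )
```

By this arithmetic the SMB copy to the bare-metal PC (5.6-6.4 Gb/s) moves more data than the 2.5-3Gb OpenSpeedTest result from the same PC, and the SMB copy to the VM (1.6-2.4 Gb/s) lands in the same range as the VM's OpenSpeedTest result.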

The summary I get from this is:

  • The slowest transfer rate is between two VMs on my Proxmox server. This should be the fastest transfer rate.
  • Transferring from a VM to a bare-metal PC is significantly slower than expected, but better than between VMs.
  • Transferring from my NAS to a VM is faster than between two VMs, but still slower than it should be.
  • Transferring from my NAS to a bare-metal PC gives me the speeds I would expect.

Ultimately, this shows that the bottleneck is Proxmox. The more VMs involved in the transfer, the slower it gets. I’m not really sure where to look next, though. Is there a setting in Proxmox I should be looking at? My server is old (two Xeon 2650v2); is it just too slow to pass the data across the Linux network bridge at an acceptable rate? CPU usage on the VMs themselves doesn’t get past 60% or so, but maybe Proxmox itself is CPU-bound?
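One way to check the "maybe Proxmox itself is CPU-bound" question is to look at per-core load on the host while a transfer is running, since an average figure can hide a single saturated core. The sketch below reads `/proc/stat` twice and prints the busiest cores; it assumes a Linux host with the usual `/proc/stat` column order, and the function names are hypothetical.

```python
import os
import time


def parse_proc_stat(text: str) -> dict:
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep "cpu0", "cpu1", ... and skip the aggregate "cpu" line and non-CPU lines.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Columns: user nice system idle iowait irq softirq steal (guest time is
        # already counted inside user, so it is left out of the sum).
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + values[4]  # idle + iowait
        total = sum(values)
        result[fields[0]] = (total - idle, total)
    return result


def utilisation(before: dict, after: dict) -> dict:
    """Percent busy per core between two parsed samples."""
    usage = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta_total = total1 - total0
        usage[cpu] = 100.0 * (busy1 - busy0) / delta_total if delta_total else 0.0
    return usage


def sample(interval: float = 1.0, path: str = "/proc/stat") -> dict:
    """Take two readings `interval` seconds apart and return per-core usage."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return utilisation(before, after)


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    # Print the eight busiest cores; run this on the host during a transfer.
    busiest = sorted(sample().items(), key=lambda kv: -kv[1])[:8]
    for cpu, pct in busiest:
        print(f"{cpu}: {pct:5.1f}% busy")
```

If one or two cores sit near 100% while the rest are idle during a VM-to-VM test, that would point toward a per-thread limit on the host rather than total CPU capacity.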

The bulk of my network traffic is coming in-and-out of the VMs on Proxmox, so it’s important that I figure this out. Any suggestions for testing or for a fix are very much appreciated.

  • @[email protected]
    4 months ago

    I’ve used VirtIO on Nutanix before, and testing with iperf rather than OpenSpeedTest, I got line rate across hosts.

    However, I also know network cards matter a lot. Some network cards, especially cheap Intel X710s, suck. They lack certain hardware offloads, so the host CPU ends up processing all the network traffic itself, which significantly slows throughput.

    Switching to Mellanox 25G cards brought all VM network performance up to the expected line rate, even between VMs on the same host.

    That was not a home lab though, that was production at a client.

    Edit: sorry, I meant to wrap up:

    • to test, use iperf (you could use UDP at 10Gbit and run it continuously; in UDP mode you need to set the rate you want to send at)
    • while testing, watch CPU usage on the host
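As a rough sketch of the iperf suggestion, the script below wraps `iperf3` and reports throughput in Gb/s. It assumes `iperf3` specifically (the comment only says "iperf"), that `iperf3 -s` is already running on the target, and that the JSON report uses the `end.sum_received` / `end.sum` keys. Check those keys against your version's `-J` output. The helper names are made up for illustration.

```python
import json
import subprocess
import sys


def build_iperf3_cmd(server, seconds=30, udp=False, bandwidth="10G", streams=1):
    """Build an iperf3 client command line that emits a JSON report (-J)."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds), "-P", str(streams), "-J"]
    if udp:
        # UDP needs an explicit target rate, otherwise iperf3 sends very slowly.
        cmd += ["-u", "-b", bandwidth]
    return cmd


def gbits_from_report(report: dict) -> float:
    """Pull the end-of-test throughput out of a parsed report, in Gb/s.

    Assumption: TCP reports carry end.sum_received, UDP reports carry end.sum.
    """
    end = report["end"]
    summary = end.get("sum_received") or end["sum"]
    return summary["bits_per_second"] / 1e9


def run_test(server, **kwargs) -> float:
    """Run one client test against `server` and return Gb/s."""
    cmd = build_iperf3_cmd(server, **kwargs)
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return gbits_from_report(json.loads(out))


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]  # address of the machine running "iperf3 -s"
    print(f"TCP, 1 stream:  {run_test(target):.2f} Gb/s")
    print(f"TCP, 4 streams: {run_test(target, streams=4):.2f} Gb/s")
    print(f"UDP at 10G:     {run_test(target, udp=True):.2f} Gb/s")
```

Running it VM to VM, VM to host, and PC to host while watching host CPU should show which hop loses the bandwidth. Comparing one stream against four is a quick hint as to whether a single thread is the limit.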

    If you want to exclude Proxmox, you could live-boot another Linux from USB and test iperf over the LAN to another device.

    • @corroded (OP)
      4 months ago

      Every VM is using VirtIO as the network card; they’re all on the same bridge to the physical 10Gb NIC. As far as I understand, any traffic between VMs should not be leaving the Proxmox server.

      • chiisana
        4 months ago

        Sorry I missed that part. Reading in bed is not my forte. I’ve deleted my comment because of my mistake.

        • @corroded (OP)
          4 months ago

          It was a good suggestion. That’s one of the first things I checked, and I was honestly hoping it would be as easy as changing the NIC type. I know that the Intel E1000 and Realtek RTL8139 options would limit me to 1Gb, but I haven’t tried the VMware vmxnet3 option. I don’t imagine that would be an improvement over the VirtIO NIC, though.

        • @corroded (OP)
          4 months ago

          What do you mean specifically? If I’m already testing between two VMs, doesn’t that already isolate any issues to Proxmox? Is there another performance metric you think I should be looking at?

          • Possibly linux
            4 months ago

            It will tell you if the virtualization is the bottleneck. It is actually pretty easy to mount an SMB share in Proxmox; you just need to open up the shell and mount it. You can use dd to test sequential speed.
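For anyone who would rather script the sequential test than use dd, here is a minimal Python sketch of the same idea: read a file from the mounted share in large blocks and report MB/s. The block size and function name are assumptions, and the path is whatever file you pick on your own mount. Use a file that has not been read recently (or one larger than RAM), otherwise the page cache will inflate the number.

```python
import sys
import time


def sequential_read_mb_per_s(path, block_size=1024 * 1024):
    """Read `path` front to back in fixed-size blocks.

    Returns (bytes_read, megabytes_per_second), using decimal megabytes.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / 1e6 / elapsed if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of a large file on the mounted SMB share.
    size, rate = sequential_read_mb_per_s(sys.argv[1])
    print(f"read {size / 1e9:.2f} GB at {rate:.0f} MB/s")
```

Running the same script on the Proxmox host and inside a VM against the same file gives a like-for-like comparison of the two paths.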

  • Possibly linux
    4 months ago

    My guess is there is a “glitch” somewhere in the middle. If not then it might be SMB or your drive speeds.

    Can you try doing a speed check in between hosts? Also, I would make sure that the networking is paravirtualized properly. You also could try swapping out your network cables.

    • @corroded (OP)
      4 months ago

      When I use OpenSpeedTest to test against another VM, it doesn’t read or write from the HDD, and it doesn’t leave the Proxmox NIC. It’s all direct from one VM to another. The only limitations are CPU and perhaps RAM. Network cables wouldn’t have any effect on this.

      I’m using VirtIO (paravirtualized) for the NICs on all my VMs. Are there other paravirtualization options I need to be looking into?

      • Possibly linux
        4 months ago

        I don’t have a lot of experience with high-speed networking, but as you go faster the overhead tends to climb steeply. I think you should try mounting the network share on the Proxmox host to test speed without the complexity of the VMs. If you get the results you are looking for, then you are good, but if it is bottlenecked there, the bottleneck is on the NAS or SMB. SMB is particularly hard to overcome, as it seems to be slow no matter what you do.