I have two subnets and am experiencing some pretty weird (to me) behaviour - could you help me understand what’s going on?


Scenario 1

PC:                        192.168.11.101/24
Server: 192.168.10.102/24, 192.168.11.102/24

From my PC I can connect to .11.102, but not to .10.102:

ping -c 10 192.168.11.102 # works fine
ping -c 10 192.168.10.102 # 100% packet loss

Scenario 2

Now, if I disable .11.102 on the server (ip link set <dev> down) so that it only has an ip on the .10 subnet, the previously failing ping works fine.

PC:                        192.168.11.101/24
Server: 192.168.10.102/24

From my PC:

ping -c 10 192.168.10.102 # now works fine

This is baffling to me… any idea why it might be?


Here’s some additional information:

  • The two subnets are on different vlans (.10/24 is untagged and .11/24 is tagged 11).

  • The PC and Server are connected to the same managed switch, which however does nothing “strange” (it just leaves tags as they are on all ports).

  • The router is connected to the aformentioned switch and set to forward packets between the two subnets (I’m pretty sure how I’ve configured it so, plus IIUC the second scenario ping wouldn’t work without forwarding).

  • The router also has the same vlan setup, and I can ping both .10.1 and .11.1 with no issue in both scenarios 1 and 2.

  • In case it may matter, machine 1 has the following routes, setup by networkmanager from dhcp:

default via 192.168.11.1 dev eth1 proto dhcp              src 192.168.11.101 metric 410
192.168.11.0/24          dev eth1 proto kernel scope link src 192.168.11.101 metric 410
  • In case it may matter, Machine 2 uses systemd-networkd and the routes generated from DHCP are slightly different (after dropping the .11.102 address for scenario 2, of course the relevant routes disappear):
default via 192.168.10.1 dev eth0 proto dhcp              src 192.168.10.102 metric 100
192.168.10.0/24          dev eth0 proto kernel scope link src 192.168.10.102 metric 100
192.168.10.1             dev eth0 proto dhcp   scope link src 192.168.10.102 metric 100
default via 192.168.11.1 dev eth1 proto dhcp              src 192.168.11.102 metric 101
192.168.11.0/24          dev eth1 proto kernel scope link src 192.168.11.102 metric 101
192.168.11.1             dev eth1 proto dhcp   scope link src 192.168.11.102 metric 101

solution

(please do comment if something here is wrong or needs clarifications - hopefully someone will find this discussion in the future and find it useful)

In scenario 1, packets from the PC to the server are routed through .11.1.

Since the server also has an .11/24 address, packets from the server to the PC (including replies) are not routed and instead just sent directly over ethernet.

Since the PC does not expect replies from a different machine that the one it contacted, they are discarded on arrival.

The solution to this (if one still thinks the whole thing is a good idea), is to route traffic originating from the server and directed to .11/24 via the router.

This could be accomplished with ip route del 192.168.11.0/24, which would however break connectivity with .11/24 adresses (similar reason as above: incoming traffic would not be routed but replies would)…

The more general solution (which, IDK, may still have drawbacks?) is to setup a secondary routing table:

echo 50 mytable >> /etc/iproute2/rt_tables # this defines the routing table
                                           # (see "ip rule" and "ip route show table <table>")
ip rule add from 192.168.10/24 iif lo table mytable priority 1 # "iff lo" selects only 
                                                               # packets originating
                                                               # from the machine itself
ip route add default via 192.168.10.1 dev eth0 table mytable # "dev eth0" is the interface
                                                             # with the .10/24 address,
                                                             # and might be superfluous

Now, in my mind, that should break connectivity with .10/24 addresses just like ip route del above, but in practice it does not seem to (if I remember I’ll come back and explain why after studying some more)

  • Possibly linux
    link
    fedilink
    English
    1
    edit-2
    2 months ago

    Why is your network like this? Wouldn’t is be easier to put the server and the devices that need to access server on the same subnet? It is generally a bad practice to have multiple subnets on the same device and nic.

    Anyway I suspect the treason is that ping is using the wrong source IP. If you run ping --help there should be an option to change the source address.

    • @talkingpumpkinOP
      link
      English
      22 months ago

      why is your network like this?

      Well, at the moment my network is actually flat :)

      This is an experiment I’m doing because I wanted to have all the management stuff on a different subnet (eg. adguard dns is on the “regular” subnet everyone uses, but its web interface is on the special subnet only select devices can talk to).

      Of course (like with most stuff in my homelab), it’s not like I really have a super-compelling security reason to that, it’s mostly that I wondered “what if?” :D

      Oh. the ping option you are referring to is -I (upper case) and takes either an interface name or an ip. I did try giving a .10/24 IP to the PC and the results were consistent with scenario 1 (pings where source and destination are on the same subnet work, pings acrrss subnets don’t), so I didn’t mention that in the OP