[SOLVED] Weird (to me) networking issue - can you help?

@talkingpumpkin · edited 5 months ago

@teslasaur · 5 months ago

My guess is that the server receives the packet from the client with src .11.101 dst .10.102 and tries to respond over the interface that has .11.102 assigned. The client expects a response from src .10.102 and drops the packet. But I would run a packet sniffer on the gateway to see whether the returning traffic even passes the firewall in scenario 1.
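To make the guess above concrete, here is a rough Python sketch of how a host picks the outgoing interface for a reply: it looks only at the destination of the reply and takes the most specific matching route. The interface names `eth0` and `eth0.11` are made up for illustration; the subnets are the ones discussed in the thread. This is a simplified model, not what the kernel literally runs.

```python
import ipaddress

# Assumed view of the server's connected routes in scenario 1.
# Interface names are hypothetical; the subnets are the ones from the thread.
ROUTES = [
    (ipaddress.ip_network("192.168.10.0/24"), "eth0"),
    (ipaddress.ip_network("192.168.11.0/24"), "eth0.11"),
]

def pick_iface(dst, routes):
    """Return the interface of the most specific route containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, iface) for net, iface in routes if addr in net]
    return max(matches)[1] if matches else None

# The request arrived addressed to 192.168.10.102, but the reply goes to the PC:
print(pick_iface("192.168.11.101", ROUTES))  # eth0.11
```

So in this model the request comes in on one interface and the reply leaves on the other, without ever passing back through the router.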

@talkingpumpkin · edited 5 months ago

So the request goes through but the replies are discarded? That could actually be it!

I think there was an option to allow that… I’ll search it and give it a try. Thanks!

@teslasaur · 5 months ago

It has to do with link priority on the server. You’d imagine that a server that receives a packet that has a return address on the same subnet as itself logically would use that interface instead.

A similar thing happens in switches. For example, if you have two VLANs on a switch, both with an IP assigned, and you connect a computer to one of them, you will only be able to reach the switch on the non-routed connection, even if you are also allowed to reach the second VLAN through a router/firewall.

@[email protected] · edited 5 months ago

Do you have a route that’s configured to route between the subnets that perhaps changes when you change which interfaces are enabled on your NAS?

My $2 guess is that it’s working fine, because you really shouldn’t expect computers to talk to each other on subnets they’re not a part of without routing, and that the interface disabling you’re doing is changing something in how packets are routed/brings your router into routing packets and thus makes it work then.

@just_another_person · edited 5 months ago

You have two NICs in a machine and two networks, one untagged and one tagged? This is a mess for a number of reasons. You have two routes and two adapters that don’t route to the default gateway of each subnet because you’re also tagging one portion of the VLAN traffic, and not tagging the other. That’s your problem.

How you’re going to fix it: learn about VLANs and subnetting, then let your router do the job it’s designed to do. You’ve already defeated the purpose of the VLANs by having them bridged with this one machine anyway. There’s literally no point except this confusing setup.

@talkingpumpkin · 5 months ago

I don’t think I quite explained the situation well enough: my server only has 1 ethernet port (same as my PC), otherwise I wouldn’t have bothered with vlans (well, I would still have bothered, since my house still only has one “backbone” cable running through it, but I would have configured it on the switches only).

Anyway… a few of the things you say/imply go against my understanding of networking, so one of us had better go back and RTFM as you suggest :) (just kidding - most probably I just don’t understand what you mean)

@[email protected] · 5 months ago

This sounds familiar. Can you verify if you’ve enabled net.ipv4.ip_forward=1 in /etc/sysctl.conf? If you have to make a change, then issue sysctl --system to reload the updates.
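If you just want to see the current value without touching any config, a small Python sketch like this reads the live setting from `/proc` (that path is the standard Linux location for this sysctl; the function name is mine):

```python
from pathlib import Path

def ip_forward_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True/False for the kernel's IPv4 forwarding flag, or None if unreadable."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # Not Linux, or /proc is not available.
        return None

print(ip_forward_enabled())
```

This only reports the running value; it says nothing about what is persisted in /etc/sysctl.conf.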

@talkingpumpkin · 5 months ago

Thanks! Forwarding is disabled. I don’t want the server to steal the router’s job :)

@[email protected] · 5 months ago

If you already have a router tying these two networks together then you should NOT also have two NICs in one machine tied to both networks. Pick one or the other, you can’t have both. If you think you need both then you haven’t correctly considered your network topology.

fmstrat · 5 months ago

The PC is on .11
It pings server on .10
Since PC is on .11, server tries to respond on .11
This will fail
Disabling .11 on the server works because it can no longer try to respond in that direction
I bet it will work if you disable the .11 default route

Scott · 5 months ago

The reason is “asymmetric routing”: the ping replies are taking a different path back than the requests took on the way in.
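One place an asymmetric path can get a packet dropped is a strict reverse-path check, where a host only accepts a packet if its own route back to the source uses the interface the packet arrived on. Whether such a check is active on this server is an assumption on my part, since the thread doesn't say. The Python sketch below only models the idea, with hypothetical interface names:

```python
import ipaddress

def route_iface(addr, routes):
    """Interface of the longest-prefix route covering addr, or None."""
    ip = ipaddress.ip_address(addr)
    matches = []
    for net, iface in routes:
        network = ipaddress.ip_network(net)
        if ip in network:
            matches.append((network.prefixlen, iface))
    return max(matches)[1] if matches else None

def passes_strict_rpf(src, in_iface, routes):
    """Strict reverse-path check: accept only if the route back to src uses in_iface."""
    return route_iface(src, routes) == in_iface

# Scenario 1 (assumed): server has both subnets; names are hypothetical.
ROUTES = [("192.168.10.0/24", "eth0"), ("192.168.11.0/24", "eth0.11"), ("0.0.0.0/0", "eth0")]
# Same server with its .11 address disabled.
ROUTES_NO_11 = [("192.168.10.0/24", "eth0"), ("0.0.0.0/0", "eth0")]

# The PC's routed request arrives on the .10 interface in both cases.
print(passes_strict_rpf("192.168.11.101", "eth0", ROUTES))        # False
print(passes_strict_rpf("192.168.11.101", "eth0", ROUTES_NO_11))  # True
```

In this model, disabling the .11 address makes the path symmetric again, which would match the behaviour people describe in the thread.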

@[email protected] · 5 months ago

In scenario 1, the server is technically a router between the 11 and the 10 subnet. There is a sysctl that enables forwarding (I’m on the phone and can’t look it up right now). This must be enabled.

In your second scenario, the server appears to regard its subnet 11 address as just another address it has and replies.

Snot Flickerman · edited 5 months ago

Have you considered adding a manually configured route for each of these networks to find each other?

If the auto-generated routes aren’t able to find it, I would personally manually add the route on both ends (give 192.168.11.0/24 a path to 192.168.10.0/24 and vice versa) to see if that changes anything.

Occasionally, you just have to tell computers what to do.

EDIT: said “path” when I meant “route”

@[email protected] · 5 months ago

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

Fewer Letters	More Letters
IP	Internet Protocol
NAS	Network-Attached Storage
NAT	Network Address Translation

[Thread #934 for this sub, first seen 23rd Aug 2024, 13:05]

sj_zero · 5 months ago

Having a pair of default gateways could be an issue. On Windows (which, I know, isn’t the OS here), you have to be pretty careful: if you’re straddling two networks, you need to pick one network to be the dominant one, and that’s the one whose default gateway will get packets heading to outside networks.

@talkingpumpkin · 5 months ago

I tried dropping the default routes (one at a time) and it doesn’t make a difference, which isn’t (I think) surprising as all traffic is local as far as the server in scenario 1 is concerned. Also IIUC only the default gateway with the lowest metric actually counts.
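A quick way to convince yourself of this is a toy route lookup in Python. The gateway addresses, interface names and metrics below are made up for illustration; only the two subnets come from the thread. The rule modelled is the usual one: the longest matching prefix wins, and the metric only breaks ties between routes of the same prefix length.

```python
import ipaddress

# (network, gateway, interface, metric); gateways, names and metrics are assumptions.
ROUTES = [
    ("192.168.10.0/24", None, "eth0", 0),
    ("192.168.11.0/24", None, "eth0.11", 0),
    ("0.0.0.0/0", "192.168.10.1", "eth0", 100),
    ("0.0.0.0/0", "192.168.11.1", "eth0.11", 200),
]

def lookup(dst, routes):
    """Longest prefix wins; among equal prefixes, the lowest metric wins."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    return max(matches, key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, -r[3]))

def without_default(routes, gateway):
    """Copy of the table with the default route via the given gateway removed."""
    return [r for r in routes if r[1] != gateway]

# Local traffic uses the connected /24, whichever default route is dropped.
for table in (ROUTES, without_default(ROUTES, "192.168.10.1"), without_default(ROUTES, "192.168.11.1")):
    print(lookup("192.168.11.101", table)[2])

# Non-local traffic is where the metric decides.
print(lookup("1.1.1.1", ROUTES)[1])
```

The loop selects `eth0.11` for all three tables, and only the last lookup depends on which default route has the lower metric.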

Possibly linux · edited 5 months ago

Why is your network like this? Wouldn’t it be easier to put the server and the devices that need to access the server on the same subnet? It is generally bad practice to have multiple subnets on the same device and NIC.

Anyway, I suspect the reason is that ping is using the wrong source IP. If you run ping --help there should be an option to change the source address.

@talkingpumpkin · 5 months ago

why is your network like this?

Well, at the moment my network is actually flat :)

This is an experiment I’m doing because I wanted to have all the management stuff on a different subnet (eg. adguard dns is on the “regular” subnet everyone uses, but its web interface is on the special subnet only select devices can talk to).

Of course (like with most stuff in my homelab), it’s not like I really have a super-compelling security reason to do that, it’s mostly that I wondered “what if?” :D

Oh, the ping option you are referring to is -I (upper case) and takes either an interface name or an IP. I did try giving a .10/24 IP to the PC and the results were consistent with scenario 1 (pings where source and destination are on the same subnet work, pings across subnets don’t), so I didn’t mention that in the OP.

@hungover_pilot · 5 months ago

Another solution is to use NAT on the router: NAT all traffic from the client network 11.0/24 to the router’s IP on the server network 10.0/24.

That way, when the server sees the ICMP echoes on its 10.102 interface, they will look like they came from the router, and it will send the reply back to the router instead of out its other interface.
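Roughly, the idea looks like this in Python. The router address `192.168.10.1` is a placeholder I picked for the example, and the sketch leaves out the connection tracking a real router needs to translate the reply back for the client.

```python
import ipaddress
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str

CLIENT_NET = ipaddress.ip_network("192.168.11.0/24")
ROUTER_ON_SERVER_NET = "192.168.10.1"  # hypothetical router address on the .10 side

def snat(pkt):
    """Rewrite the source of client-network packets to the router's .10 address."""
    if ipaddress.ip_address(pkt.src) in CLIENT_NET:
        return replace(pkt, src=ROUTER_ON_SERVER_NET)
    return pkt

request = Packet(src="192.168.11.101", dst="192.168.10.102")
seen_by_server = snat(request)

# The server answers whoever the request appears to come from.
reply = Packet(src=seen_by_server.dst, dst=seen_by_server.src)
print(reply)
```

Since the reply is addressed to a .10 address, the server has no reason to send it out of its .11 interface, and the router gets to see both directions.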