Is the bitwise AND of subnet masks and IP addresses redundant?

@ricdeh · 3 months ago

macniel · 3 months ago

“But why do we need the bitwise AND for that, specifically? I understand the idea, but would it not be easier to only parse the IP address string of bits only for the first n bits and then disregard the remainder (the host identifier)?”

Essentially it boils down to:

Bit operations are stupid fast and efficient; string operations are super slow.

Also, IPv4 addresses are normally stored as 32-bit integers (IPv6 addresses as 128-bit values), so applying string operations would require them to be converted first.
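
To make the point concrete, here is a minimal sketch of the integer approach using Python's standard `ipaddress` module. The function name `network_id` and the sample address are made up for illustration; a real network stack would already hold the address as an integer and skip the parsing step entirely.

```python
import ipaddress

def network_id(ip: str, mask: str) -> str:
    """Return the network ID by ANDing the address with the subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))      # dotted quad -> 32-bit integer
    mask_int = int(ipaddress.IPv4Address(mask))  # same conversion for the mask
    return str(ipaddress.IPv4Address(ip_int & mask_int))  # one AND does the work

print(network_id("192.168.1.77", "255.255.255.0"))  # prints 192.168.1.0
```

Once both values are integers, extracting the network ID is a single `&`, with no per-bit parsing or counting.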

@ricdeh · 3 months ago

Okay, that makes sense. Thank you.

@ricdeh · 3 months ago

Though I would like to clarify that maybe my wording was a bit confusing. By “string of bits”, I did not mean the term as it is typically used in programming language environments, but rather a raw binary sequence, e.g., the first 24 bits of an IP address, therefore allocating 3 bytes of memory for storing the NID.

macniel · 3 months ago

“… but rather a raw binary sequence, e.g., the first 24 bits of an IP address, therefore allocating 3 bytes of memory for storing the NID.”

That would require dynamic memory allocation, since you can never know which CIDR prefix length your stack will encounter. It could be a nibble, a byte, a byte and a nibble, …, 4 bytes. So you would allocate an int32/int64 anyway to be on the safe side.

@ricdeh · 3 months ago

Yep, I agree. Though one could make a hypothetical argument for expanding the array dynamically when needed. Of course, due to the varying sizes of NIDs resulting from CIDR (which you correctly mentioned), you would need to have a second array that can store the length of each NID, with 5 bits per element, leaving you with 3 bits “saved” per IP address.

That can end up wasting more memory than the 32-bit per NID approach, e.g., when the host identifier is smaller than 5 bits. And there’s the slowness of memory allocation and copying from one array to another that comes on top of that.

I think that it is theoretically possible to deploy a NID-extracting and tracking program that is a tiny bit more memory efficient than the 32-bit implementation, but it would probably come with a performance overhead and depend on you knowing the range of your expected IP addresses really well. So, not useful at all, lol

Anyway, thanks for your contributions.

macniel · 3 months ago

Sure thing, buddy, and never feel discouraged from asking “stupid questions”; it’s how we learn after all :)

@[email protected] · 3 months ago

Probably because it’s only four bytes of data, and counting/extracting bits takes more CPU time than one AND operation.
Most CPUs are optimised to work with whole integers (32/64 bit) rather than individual bits.

If memory were a serious concern, you could compress the mask down to one byte as a ‘number of 1s’ counter at the cost of additional CPU operations, but because 3 extra bytes is such a small amount of data, this memory/time trade-off isn’t worth it in most systems.

It’d be useful if you wanted to compress some data logs or something with many subnet masks though.
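
As a rough sketch of that ‘number of 1s’ compression, the two helpers below convert a contiguous mask to its prefix length and back. The function names are invented for this example, and the round trip only holds for CIDR-style masks where all the 1 bits are contiguous from the most significant end.

```python
import ipaddress

def mask_to_prefix_len(mask: str) -> int:
    """Compress a contiguous subnet mask to its count of 1 bits (fits in one byte)."""
    mask_int = int(ipaddress.IPv4Address(mask))
    return bin(mask_int).count("1")

def prefix_len_to_mask(n: int) -> str:
    """Expand a prefix length (0..32) back into a dotted-quad mask."""
    mask_int = (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF  # n ones followed by zeros
    return str(ipaddress.IPv4Address(mask_int))

print(mask_to_prefix_len("255.255.255.0"))  # prints 24
print(prefix_len_to_mask(24))               # prints 255.255.255.0
```

The expansion step is the extra CPU work mentioned above: the full 32-bit mask has to be rebuilt before it can be ANDed with an address.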

@[email protected] · 3 months ago

I’ll address your question in two parts: 1) is it redundant to store both the IP subnet and its subnet mask, and 2) why doesn’t the router store only the bits necessary to make the routing decision.

Prior to the introduction of CIDR – which came with the “slash” notation, like /8 for the 10.0.0.0 RFC1918 private IPv4 subnet range – subnet masks could genuinely be any bit arrangement imaginable. The most sensible would be to have contiguous MSBit-justified subnet masks, such as 255.0.0.0. But the standard did not preclude using something unconventional like 255.0.0.1.

For those confused about what a 255.0.0.1 subnet mask would do – and to be clear, a lot of software might prove unable to handle this – it describes a subnet with 2^23 addresses, where the LSBit must match that of the IP subnet. So if your IP subnet was 10.0.0.0, then only even numbered addresses are part of that subnet. And if the IP subnet is 10.0.0.1, then that only covers odd numbered addresses.

Yes, that means two machines with addresses 10.69.3.3 and 10.69.3.4 aren’t on the same subnet. This would not be allowed when using CIDR, as contiguous set bits are required with CIDR.
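
The odd/even behaviour can be checked with a short sketch. The helper names `same_subnet` and `is_contiguous` are made up for this example, and the third address used below (10.200.7.5) is an arbitrary odd-numbered one chosen for illustration.

```python
import ipaddress

def to_int(addr: str) -> int:
    return int(ipaddress.IPv4Address(addr))

def same_subnet(a: str, b: str, mask: str) -> bool:
    """Two addresses share a subnet if they agree on every bit the mask selects."""
    m = to_int(mask)
    return (to_int(a) & m) == (to_int(b) & m)

def is_contiguous(mask: str) -> bool:
    """True if the mask is a run of 1s followed only by 0s (CIDR-style)."""
    inv = ~to_int(mask) & 0xFFFFFFFF    # host bits become 1s
    return (inv & (inv + 1)) == 0       # holds only when inv looks like 0...01...1

print(same_subnet("10.69.3.3", "10.69.3.4", "255.0.0.1"))   # prints False
print(same_subnet("10.69.3.3", "10.200.7.5", "255.0.0.1"))  # prints True
print(is_contiguous("255.0.0.1"))                           # prints False
```

Note that the bitwise AND handles the unconventional mask without any special casing, whereas a "take the first n bits" approach has no way to express it.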

So in answer to the first question, CIDR imposed a stricter (and sensible) limit on valid IP subnet/mask combinations, so if CIDR cannot be assumed, then it would be required to store both the IP subnet and the subnet mask, since mask bits might not be contiguous.

For all modern hardware in the last 15-20 years, CIDR subnets are basically assumed. So this is really a non-issue.

For the second question, the router does in fact store only the necessary bits to match the routing table entry, at least for hardware appliances. Routers use what’s known as TCAM (ternary content-addressable memory) for routing tables, where the bitwise AND operation can be performed, but with a twist.

Suppose we’re storing a route for 10.0.42.0/24. The subnet size indicates that the first 24 bits must match a prospective destination IP address. And the remaining 8 bits don’t matter. TCAMs can store 1’s and 0’s, but also X’s (aka “don’t cares”) which means those bits don’t have to match. So in this case, the TCAM entry will mirror the route’s first 24 bits, then populate the rest with X’s. And this will precisely match the intended route.
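
A software model of that ternary match might look like the sketch below. Everything here is an illustrative assumption: the `TcamEntry` type, the next-hop labels, and the two extra routes are invented, and a real TCAM compares all entries in parallel in hardware rather than looping. Each entry stores a value plus a "care" mask, where a 0 bit in the mask plays the role of an X.

```python
import ipaddress
from typing import List, NamedTuple, Optional

class TcamEntry(NamedTuple):
    value: int     # bits that must match (don't-care positions stored as 0)
    care: int      # 1 = bit must match, 0 = X (don't care)
    next_hop: str  # hypothetical label for the outgoing port

def make_entry(cidr: str, next_hop: str) -> TcamEntry:
    net = ipaddress.IPv4Network(cidr)
    return TcamEntry(int(net.network_address), int(net.netmask), next_hop)

def ternary_pattern(entry: TcamEntry) -> str:
    """Render the entry as a string of 0, 1 and X, most significant bit first."""
    bits = format(entry.value, "032b")
    care = format(entry.care, "032b")
    return "".join(b if c == "1" else "X" for b, c in zip(bits, care))

def lookup(table: List[TcamEntry], dest: str) -> Optional[str]:
    """Return the next hop of the first matching entry, or None."""
    d = int(ipaddress.IPv4Address(dest))
    for entry in table:  # table is ordered longest prefix first
        if (d & entry.care) == entry.value:
            return entry.next_hop
    return None

routes = [make_entry("10.0.0.0/8", "port-2"),
          make_entry("10.0.42.0/24", "port-1"),
          make_entry("0.0.0.0/0", "default")]
# Sort so the most specific route (most care bits) is checked first.
table = sorted(routes, key=lambda e: bin(e.care).count("1"), reverse=True)

print(ternary_pattern(table[0]))       # prints 000010100000000000101010XXXXXXXX
print(lookup(table, "10.0.42.7"))      # prints port-1
print(lookup(table, "10.9.9.9"))       # prints port-2
```

The match itself is still `(destination AND care) == value`, which is the same AND discussed in the question, just paired with a stored pattern.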

As a practical matter then, the TCAM must still be as wide as the longest possible route, which is 32 bits for IPv4 and 128 bits for IPv6. Yes, I suppose some savings could be made if a CIDR-only TCAM could conserve the X bits, but this makes little difference in practice and it’s generally easier to design the TCAM for max width anyway, even though non-CIDR isn’t supported on most routing hardware anymore.

@NeoNachtwaechter · 3 months ago

Removed by mod

@ricdeh · 3 months ago

It’s nostupidquestions after all :( I am not saying that anyone ever did anything worse, my question is aiming at the answer for why the current approach is the way that it is, on a technical level.

@NeoNachtwaechter · 3 months ago

“Because the information necessary for that is already available from the subnet mask WITHOUT the bitwise AND, e.g., with 255.255.255.0 or 1111 1111.1111 1111.1111 1111.0000 0000, you count the amount of 1s, which in this case is 24 and corresponds to that appendix in the CIDR notation. At this point, you already know that you only need to consider those first 24 bits from the IP address, making the subsequent bitwise AND redundant.”

On a technical level, the bitwise operation is all that is needed. It is one calculation of the simplest kind. A CPU can typically do it in a single clock cycle. That’s why they invented it this way.

The other way that you described is the super extra ultra lengthy complicated - and maybe redundant - thing.
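
One way to see this is to write out both approaches on a 32-bit integer. In the sketch below (function names and the sample address are invented for illustration), "keep only the first n bits" has to be implemented with shifts, and it produces exactly the same value as a single AND with the mask.

```python
def first_n_bits(ip_int: int, n: int) -> int:
    """Keep the top n bits of a 32-bit address by shifting the host bits out and back."""
    shift = 32 - n
    return (ip_int >> shift) << shift

def and_with_mask(ip_int: int, n: int) -> int:
    """Same result with one AND against the /n mask."""
    mask = (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF
    return ip_int & mask

ip_int = (192 << 24) | (168 << 16) | (1 << 8) | 77  # 192.168.1.77 as an integer

# The two approaches agree for every possible prefix length.
print(all(first_n_bits(ip_int, n) == and_with_mask(ip_int, n) for n in range(33)))  # prints True
```

So counting the 1s first does not avoid the bit operation; it only adds a counting step and then performs an equivalent mask, and it cannot represent non-contiguous masks at all.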