DWise1's Sockets Programming Pages: IP Addresses


Introduction

Imagine the situation in which you are creating a small network or connecting a host to an existing network. Furthermore, imagine that you are going to set the host's IP address manually (see Footnote 1). In order to do that, you will need to select that address correctly.

Under the TCP/IP suite of protocols, each host has its own IP address which uniquely identifies it. If you do not select a host's address properly, then the network will not operate properly, perhaps not at all. In order to select your hosts' addresses properly, you need to understand a little IP addressing theory.

Stated simply and directly:

For two hosts on the same network to be able to talk to each other, the network bits of their IP addresses must be the same.
If they are not the same, then both hosts will treat each other as if they were on different networks and they will be unable to communicate with each other.

The gory details follow.


Footnote 1:
Many networks, especially large ones, use Dynamic Host Configuration Protocol (DHCP) instead of assigning IP addresses manually. In DHCP, a new host configured to be a DHCP client broadcasts a request for an address and a DHCP server on the network assigns it one.

Obviously, for the purpose of this page, we must assume that we are not using DHCP. However, the exact same theory and warnings that apply to manually setting IP addresses also apply to DHCP.


Addressing Theory

To understand how to set the IP address of two devices so that they are on the same network, or to tell whether two IP addresses are on the same network, you need to understand how an IP address is put together.

For now, all of your addressing will be done with IPv4 -- you can read about the future scheme, IPv6, elsewhere. In IPv4, an IP address is given as a series of four numbers separated by periods in what is called "dotted-decimal" format. Each of these numbers falls in the range of 0 to 255 and is called an "octet", because it is eight bits long. The reason that they are not called "bytes" is that for most of the history of computers (about the mid-1940's to the late 1970's), the byte was a different length on just about every different computer. On the other hand, the term "octet" specifically means 8 bits.
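
To make the "four octets" idea concrete, here is a minimal sketch in Python (a language chosen only for illustration; this page does not otherwise use it). The function names dotted_to_int, int_to_dotted and to_binary are made up for this example. It shows that a dotted-decimal address is just a 32-bit number written one octet at a time.

```python
def dotted_to_int(address: str) -> int:
    """Convert a dotted-decimal IPv4 address to its 32-bit integer value."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift in one octet (8 bits) at a time
    return value


def int_to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal format."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def to_binary(address: str) -> str:
    """Show the four octets of an address as groups of eight bits."""
    return " ".join(f"{int(part):08b}" for part in address.split("."))


print(dotted_to_int("192.168.42.34"))  # 3232246306
print(to_binary("192.168.42.34"))      # 11000000 10101000 00101010 00100010
```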

Network and Host Bits

In addition, the 32 bits of an IP address are all divided up into network bits and host bits. The network bits are the high-order bits (starting on the left if all the bits were written out) and the host bits are the ones that remain on the right.

Where do the network bits end and the host bits begin? Well, that depends on the network class and the subnet mask.

Network Classes

At first, networks were classified according to the bit pattern of their first octet. Each network class supported different sized networks:

Class  1st Octet Range        Network Bits              Host Bits  Number of Networks  Hosts per Network
A      1 - 126 (0XXX XXXX)    First octet (8 bits)      24 bits    126                 16,777,214
B      128 - 191 (10XX XXXX)  First 2 octets (16 bits)  16 bits    16,384              65,534
C      192 - 223 (110X XXXX)  First 3 octets (24 bits)  8 bits     2,097,152           254

The addresses whose first octet is 224 or higher are special-purpose ones not germane to this discussion (e.g., multicast and experimental networks).
Also, addresses starting with 127 are part of a special class which refers to the local host; e.g., 127.0.0.1 for local host (try pinging either 127.0.0.1 or localhost -- interface lo on Linux).

So, from the table you see that the dividing line between the network bits and host bits is normally determined by its network class. However, it turned out that this (relatively) simple scheme had some inherent problems that have driven some refinements.
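
The class rules in the table can be sketched as a test on the leading bits of the first octet. This is only an illustration; the function name address_class and the label "special" are my own choices, not standard terms.

```python
def address_class(address: str) -> str:
    """Return the classful network class of a dotted-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if first_octet in (0, 127):
        # 127 is the local-host range; 0 is left out of class A in the table.
        return "special"
    if (first_octet & 0b10000000) == 0b00000000:  # 0XXX XXXX
        return "A"
    if (first_octet & 0b11000000) == 0b10000000:  # 10XX XXXX
        return "B"
    if (first_octet & 0b11100000) == 0b11000000:  # 110X XXXX
        return "C"
    return "special"  # 224 and up: multicast, experimental


for example in ("10.1.2.3", "172.16.0.1", "192.168.42.34", "127.0.0.1"):
    print(example, "is class", address_class(example))
```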


Subnet Masks

There is a long-standing tradition in the computer industry of underestimating future needs. The story goes that at first they expected that fewer than ten computers would more than adequately satisfy world-wide demand. And I remember a BYTE editorial in 1977 which mentioned an electrical engineer who could not imagine anyone ever being able to use more than 1 K of RAM.

Subnet Mask Octets

Binary     Decimal
0000 0000  0
1000 0000  128
1100 0000  192
1110 0000  224
1111 0000  240
1111 1000  248
1111 1100  252
1111 1110  254
1111 1111  255

Similarly, nobody anticipated the growth of the number of devices connected to the Internet, each of which would need its own unique IP address. Referring to the Network Classes table above, you can see that the three classes together limit IPv4 to roughly 3.7 billion host addresses, which turns out not to be enough. Besides, the allocation of whole networks by class leads to a lot of wasted host addresses; e.g., if you only needed 20 addresses, you would still get an entire class C network of 254 addresses no matter what, thus wasting 234 addresses.

The solution to this situation was subnetting, in which you split a larger network down into smaller networks. Now, an ISP can take one class C network and divide it down into 64 sub-networks, each containing two host addresses, one for your host and the other for the ISP's router. This way, smaller networks can be set up closer to the required size and fewer addresses will go to waste.
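
The arithmetic of that ISP example can be checked with Python's standard ipaddress module. The network 192.168.42.0/24 is just a stand-in that I picked for a class C network; a /30 subnet leaves two host bits, which gives four addresses of which two are usable by hosts.

```python
import ipaddress

# An arbitrary class C sized network, chosen only for illustration.
network = ipaddress.ip_network("192.168.42.0/24")

# Split it into subnets with 30 network bits (2 host bits) each.
subnets = list(network.subnets(new_prefix=30))
print(len(subnets))  # 64

first = subnets[0]
usable = [str(host) for host in first.hosts()]
print(first, usable)  # 192.168.42.0/30 ['192.168.42.1', '192.168.42.2']
```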

The way to define a subnet is with a subnet mask. A subnet mask looks very much like an IP address; it consists of four octets in the same "dotted-decimal" format. Only here all the network bits are set to one (1) and all the host bits are set to zero (0). Since all the higher-order bits are network bits and all the lower-order bits are host bits, each octet of a subnet mask can only contain one of the nine values in the Subnet Mask Octets table above. More exactly, the first octets must contain "255", the last octets must contain "0", and the octet at the boundary between network bits and host bits must contain a value from that table.

Another method is with CIDR (Classless Inter-Domain Routing) notation. Here, you append the number of network bits to the IP address; e.g., 192.168.42.34/24. "/24" corresponds to the subnet mask "255.255.255.0". Although CIDR notation is more concise and easier to read, you are more likely to encounter subnet masks when setting up local networks.

For example, the default subnet masks for the three classes of networks are:

Class  Subnet Mask    CIDR
A      255.0.0.0      /8
B      255.255.0.0    /16
C      255.255.255.0  /24
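
Converting between the two notations is mechanical, as this small sketch suggests. The helper names prefix_to_mask and mask_to_prefix are hypothetical; the second one also rejects masks whose one bits are not all on the left.

```python
def prefix_to_mask(prefix: int) -> str:
    """Turn a CIDR prefix length (e.g. 24) into a dotted-decimal subnet mask."""
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF  # ones on the left
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def mask_to_prefix(mask: str) -> int:
    """Turn a dotted-decimal subnet mask into a CIDR prefix length."""
    value = 0
    for part in mask.split("."):
        value = (value << 8) | int(part)
    prefix = bin(value).count("1")
    # A valid mask is a solid run of ones followed by a solid run of zeros.
    if value != (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF:
        raise ValueError(f"not a valid subnet mask: {mask}")
    return prefix


print(prefix_to_mask(24))                 # 255.255.255.0
print(mask_to_prefix("255.255.255.240"))  # 28
```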


Two Special Addresses

On the Network Classes table above, you may have noticed that the "Hosts per Network" count is always two shy of the complete number of possible host addresses that that many host bits provide. That is true; the formula for calculating the number of host addresses available for a given number of host bits is:
hosts = 2^(host bits) - 2
Those two missing addresses are special addresses that are reserved for special use:

Name               Description                 Example
Network Address    Host bits set to all zeros  192.168.16.0/24
Broadcast Address  Host bits set to all ones   192.168.16.255/24
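
Here is a short sketch of the formula and of the two special addresses, using the example network from the table. The function name usable_hosts is made up; the network and broadcast addresses come from Python's standard ipaddress module.

```python
import ipaddress


def usable_hosts(host_bits: int) -> int:
    """Apply the formula: 2 to the power of the host bits, minus 2."""
    return 2 ** host_bits - 2


network = ipaddress.ip_network("192.168.16.0/24")
host_bits = 32 - network.prefixlen  # whatever is not a network bit

print(network.network_address)    # 192.168.16.0   (host bits all zeros)
print(network.broadcast_address)  # 192.168.16.255 (host bits all ones)
print(usable_hosts(host_bits))    # 254
```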

The network address is used mostly in setting up network configurations, especially in Linux and in routers, and conceptually in testing whether two addresses are on the same network (see below). Outside of those settings, you will rarely see it.

On the other hand, you will encounter the broadcast address. It is used to send a message to all the hosts on the local network. This has a variety of uses, most of which are built in to the network already (e.g., DHCP requests, ARP requests). Most routers are configured to not allow broadcasts to get out of the local network and on most systems you need special permission (such as being root on Linux) to send a broadcast.


Finally, How to Tell Whether Two Addresses are on the Same Network

Now you've had enough IP addressing theory to understand how to compare two IP addresses and determine whether they are on the same network. This is important because connecting two computers with IP addresses for different networks is one of the most common mistakes made. You need to know how to tell whether you have made that mistake and how to avoid it.

First, the simple answer: Two addresses are on the same network if their network addresses are the same.

Second, the basic procedure:

  1. AND the subnet mask with each of the two IP addresses.
  2. Compare the ANDed results. They should:
    1. Be equal to each other, and
    2. Be equal to the network address.
  3. If the two ANDed results are equal to each other, then they are on the same network. If they are not equal, then they are not on the same network.

Now in More Detail

To get the network address, you simply AND the subnet mask with the IP address. AND'ing is a basic binary operation. All it means here is that you:
  1. Convert both the IP address and the subnet mask to binary.
  2. Compare the corresponding bits of both binary values.
  3. Wherever there is a one in the subnet mask, write down the corresponding bit from the IP address.
  4. Wherever there is a zero in the subnet mask, write down a zero (i.e., "mask out" that bit in the IP address).
  5. Convert the resultant value back to dotted decimal.
Now you have the network address.

Since most of the octets in the subnet mask are either 255 (all ones) or 0 (all zeros) and only one octet might be something else (see the table above), the procedure is even simpler than it may seem at first:

  1. If the octet in the subnet mask is 255, then copy down the corresponding octet in the IP address.
  2. If the octet in the subnet mask is 0, then make the corresponding octet zero.
  3. If the octet in the subnet mask is another value, then AND the corresponding octets together.
If you are not comfortable with converting between decimal and binary or with performing binary operations, tools are available to do it for you.
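
For readers who would rather let a program do the AND'ing, here is a minimal sketch of the octet-by-octet procedure described above. The function names network_address and same_network are hypothetical, and the code assumes both arguments are well-formed dotted-decimal strings.

```python
def network_address(ip: str, mask: str) -> str:
    """AND each octet of the IP address with the matching octet of the mask."""
    ip_octets = [int(part) for part in ip.split(".")]
    mask_octets = [int(part) for part in mask.split(".")]
    # 255 copies the octet, 0 zeroes it, anything else masks part of it.
    return ".".join(str(i & m) for i, m in zip(ip_octets, mask_octets))


def same_network(ip_a: str, ip_b: str, mask: str) -> bool:
    """Two addresses are on the same network if their ANDed results match."""
    return network_address(ip_a, mask) == network_address(ip_b, mask)


print(network_address("192.168.42.34", "255.255.255.0"))  # 192.168.42.0
print(same_network("192.168.42.34", "192.168.42.200", "255.255.255.0"))  # True
```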


Examples

Consider the following IP addresses, all of which have the same subnet mask of 255.255.255.240.

Notice that in this subnet mask all of the first three octets and the first four bits of the fourth octet are set to one (1). This means that the first three octets of the IP address are part of the network address and that you will need to AND the fourth octet of the IP address with 240 (1111 0000 binary):

64.12.149.24
24 = 0001 1000 binary
1111 0000 AND 0001 1000 = 0001 0000 = 16
Therefore, the network address of 64.12.149.24 is 64.12.149.16

64.12.149.7
7 = 0000 0111 binary
1111 0000 AND 0000 0111 = 0000 0000 = 0
Therefore, the network address of 64.12.149.7 is 64.12.149.0

64.12.149.30
30 = 0001 1110 binary
1111 0000 AND 0001 1110 = 0001 0000 = 16
Therefore, the network address of 64.12.149.30 is 64.12.149.16

The network addresses of 64.12.149.24 and 64.12.149.7 are different (64.12.149.16 and 64.12.149.0, respectively).
Therefore, they are on different networks.

The network addresses of 64.12.149.24 and 64.12.149.30 are identical (64.12.149.16).
Therefore, they are on the same network.
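
As a cross-check on the hand calculations, the same three addresses can be run through Python's standard ipaddress module, which accepts an address and a dotted-decimal mask joined by a slash. The helper name network_of is my own.

```python
import ipaddress

MASK = "255.255.255.240"


def network_of(host: str, mask: str = MASK) -> str:
    """Return the network address for a host address and subnet mask."""
    interface = ipaddress.ip_interface(f"{host}/{mask}")
    return str(interface.network.network_address)


for host in ("64.12.149.24", "64.12.149.7", "64.12.149.30"):
    print(host, "->", network_of(host))
# 64.12.149.24 -> 64.12.149.16
# 64.12.149.7 -> 64.12.149.0
# 64.12.149.30 -> 64.12.149.16
```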


Now you know how to tell whether two IP addresses are on the same network.


So Why do They Need to be on the Same Network?

To answer that, we need to look at the basics of how TCP/IP and the Ethernet work.

Application Layer
Transport Layer
IP Layer
Data Link Layer

First, realize that we're working with layers of protocols here, each one laid on top of the other. Here's how it works, for example, when we send some data:

  1. The Application Layer creates the data to be sent and processes the data received. It also handles the overall session between the applications on the two computers.

    In our example, it puts together a block of data and passes it to the Transport Layer for transmission. Earlier, it had already told the Transport Layer the destination address and the protocol to use.

  2. The Transport Layer (AKA "TCP Layer") handles the actual connection with the destination host and the transmission and receipt of data packets.

    In our example, it wraps the Application Layer's block of data inside a TCP or UDP packet, depending on which protocol is used, puts the source and destination port numbers in the packet header, and passes it on to the IP Layer.

  3. The IP Layer handles IP addressing and the routing of packets from source to destination. It decides where to send the packet so that it will eventually reach the destination IP address. Part of that decision is based on what network the destination host resides in.

    In our example, it attaches an IP header containing the source and destination IP addresses onto the packet and determines the IP address of the next device that the packet will go to, which would be either the destination host or a router.

    Herein lies the rub! How does it know which it will be? And what are the consequences of its choice?

  4. The Data Link Layer handles the physical connection between the computers and the actual transmission of data over the wire.

    However, the Ethernet knows nothing about IP addresses. It only knows about hardware addresses, AKA "Media Access Control (MAC) addresses", a unique six-byte address that the manufacturer programs into every NIC. In order to resolve an IP address to its MAC address, it applies the Address Resolution Protocol (ARP):

    1. ARP maintains a cache table of IP addresses resolved within the past few minutes and their corresponding MAC addresses. View the arp cache with the arp command: arp -a

    2. If the IP address is in the arp cache, then the Data Link Layer uses the MAC address stored there.

    3. If the IP address is not in the arp cache, then the Data Link Layer broadcasts an ARP request message to every host on the network, basically asking everyone, "Who has this IP address?" The host with that IP address sends an ARP reply containing its MAC address back to the requesting host, who can now send the packet.
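
The three ARP steps above can be modeled with a toy cache. This is strictly a simulation for illustration: real ARP is handled inside the operating system, and the class name ArpCache, the ask_network callback (standing in for the broadcast request) and the 120-second lifetime are all assumptions of mine.

```python
class ArpCache:
    """Toy model of an ARP cache; not how any real network stack is coded."""

    def __init__(self, lifetime: float = 120.0):
        self.lifetime = lifetime  # assumed: entries expire after two minutes
        self._entries = {}        # IP address -> (MAC address, time stored)

    def resolve(self, ip, ask_network, now):
        entry = self._entries.get(ip)
        if entry is not None and now - entry[1] < self.lifetime:
            return entry[0]       # step 2: found in the cache
        mac = ask_network(ip)     # step 3: "Who has this IP address?"
        if mac is not None:
            self._entries[ip] = (mac, now)
        return mac                # None means nobody answered


# A pretend local network with one host on it.
hosts = {"192.168.1.20": "00:11:22:33:44:55"}
cache = ArpCache()
print(cache.resolve("192.168.1.20", hosts.get, now=0.0))  # asks the network
print(cache.resolve("192.168.1.20", hosts.get, now=5.0))  # served from cache
print(cache.resolve("192.168.1.99", hosts.get, now=5.0))  # None: no reply
```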


Now, there's a wrinkle in this process. Routers are normally set up to not pass broadcasts, so the broadcast ARP request is restricted to the local network. Furthermore, broadcasts eat up bandwidth and unnecessary ARP requests -- i.e., requests for IP addresses that could not possibly be on the local network -- do so unnecessarily. So the problem becomes one of determining when to ARP and when not to.

The solution is to check whether the IP address is on the local network and then to ARP for it only when it is. And if it is not on the local network, then the packet gets sent to the router, which may require an ARP but at least it won't be unnecessary. It's the IP Layer that determines whether the IP address is on the local network and then tells the Data Link Layer to send the packet to that IP address or to the router.

Here's the procedure:

  1. The IP Layer tests whether the destination IP address is on the local network:
    • If it is on the same network, then the IP Layer instructs the Data Link Layer to send the packet to the destination IP address.
    • If it is not on the same network, then the IP Layer commands the Data Link Layer to send the packet to the router -- i.e., it gives the Data Link Layer the router's IP address.

  2. The Data Link Layer resolves the IP address it has been given to a MAC address, either from the ARP cache or with a broadcast ARP request.

  3. The Data Link Layer sends the packet to the MAC address that the IP address resolved to.
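
Step 1 of that procedure, the IP Layer's decision, can be sketched in a few lines. The function name next_hop and the sample addresses are hypothetical; the idea is only that the destination is compared against the host's own network, and the router is chosen when they differ.

```python
import ipaddress


def next_hop(destination: str, own_interface: str, gateway: str) -> str:
    """Pick the IP address that the Data Link Layer should resolve with ARP.

    own_interface is this host's address with its mask, e.g. "192.168.1.10/24".
    """
    interface = ipaddress.ip_interface(own_interface)
    dest = ipaddress.ip_address(destination)
    if dest in interface.network:
        return str(dest)  # same network: deliver directly
    return gateway        # different network: hand it to the router


print(next_hop("192.168.1.20", "192.168.1.10/24", "192.168.1.1"))  # 192.168.1.20
print(next_hop("64.12.149.24", "192.168.1.10/24", "192.168.1.1"))  # 192.168.1.1
```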


OK, that procedure works fine when the IP addresses have been set correctly, but what happens when an address is wrong? Specifically, what happens when a host that's physically connected to the local network is given an IP address that is not on the network?

Let's get even more specific with what must be the most common mistake, the one that prompted me to write this section:

With a cross-over Ethernet cable, you connect two PCs together. However, you set them to IP addresses that are not on the same network.
So here's what happens in this case:
  1. The IP Layer determines that the destination IP address is not on the local network.
    NOTE: the IP Layer assumes that its own host's IP address is on the local network.

  2. The IP Layer commands the Data Link Layer to send the packet to the router.

  3. Assuming that an address was entered for the gateway router, the Data Link Layer tries to resolve the router's IP address.

  4. Since there is no router on this network, the ARP fails and the packet is never sent.
The skinny is that you can never find that peer, even though you can plainly see that it is physically connected to your computer. Your host never even tries to look for it, because it already "knows" that it's not on the local network.


This is a major "gotcha" that beginners keep falling for. When you network two PCs together, or you add a new host to an existing network, make sure that its IP address is on the local network. And if the other hosts cannot find that new one, this should be the first thing you check.
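
If you want to automate that first check, something like the following sketch would do. The function name check_pair and the wording of its messages are my own; it takes each PC's address with its mask and reports every way in which the two settings disagree.

```python
import ipaddress


def check_pair(config_a: str, config_b: str) -> list:
    """Return a list of problems found; an empty list means the pair looks fine."""
    a = ipaddress.ip_interface(config_a)
    b = ipaddress.ip_interface(config_b)
    problems = []
    if a.network.prefixlen != b.network.prefixlen:
        problems.append("subnet masks differ")
    if b.ip not in a.network:
        problems.append(f"{a.ip} treats {b.ip} as being on another network")
    if a.ip not in b.network:
        problems.append(f"{b.ip} treats {a.ip} as being on another network")
    return problems


print(check_pair("192.168.1.10/24", "192.168.1.20/24"))  # []
print(check_pair("192.168.1.10/24", "192.168.2.10/24"))  # two problems listed
```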


Private Addresses and Network Address Translation (NAT)

These are a couple of extra topics that are of great use in setting up a home network.


Remember I mentioned that there are a limited number of IP addresses? Besides the long-term solution of a new IP protocol, IPv6, a number of short-term solutions have been implemented. Besides subnetting, the scheme that has saved the Internet for now is Network Address Translation (NAT), which is supported by ranges of private IP addresses. Although the idea of NAT was being discussed in 1994, it doesn't appear to have gotten implemented until about 1999. Now it's virtually impossible to find a gateway router that doesn't do NAT.

If you are interested, the pertinent Requests for Comments (RFCs) include RFC 1918, which defines the private address ranges.


Private Addresses

If you are connecting to the Internet or to a pre-existing network, then the network administrator or your ISP will tell you what to set your IP address and subnet mask to, as well as the IP address of your gateway router. However, if you are setting up a private network not connected to the Internet or that will connect to the Internet via Network Address Translation (NAT -- see below), then you should use a private IP address from one of the following private address ranges that have been defined in RFC 1918 mentioned directly above:

Private Address Range           Subnet Mask  CIDR
10.0.0.0 to 10.255.255.255      255.0.0.0    10.0.0.0/8
172.16.0.0 to 172.31.255.255    255.240.0.0  172.16.0.0/12
192.168.0.0 to 192.168.255.255  255.255.0.0  192.168.0.0/16

None of these addresses will be visible outside of your network. None of them can get out onto the Internet and routers are designed to not pass them. They are yours to use as you see fit. You are even free to change the subnet masks; just don't decrease the number of network bits (which could make it a different network that is public).
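
To test whether an address falls in one of the three ranges in the table, a sketch like this one works. The name is_rfc1918 is made up. I check the three ranges explicitly rather than relying on the ipaddress module's own is_private attribute, which also counts other reserved ranges such as 127.0.0.0/8.

```python
import ipaddress

# The three private ranges from the table above.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """True if the address lies in one of the three private ranges."""
    addr = ipaddress.ip_address(address)
    return any(addr in network for network in PRIVATE_NETWORKS)


print(is_rfc1918("192.168.42.34"))  # True
print(is_rfc1918("172.32.0.1"))     # False: just past the end of 172.16.0.0/12
```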

Network Address Translation (NAT)

This is the neat trick that probably saved the Internet from running out of IP addresses.

Basically, NAT works by changing the addresses inside the local network to the public address through which the network connects to the Internet (via a gateway router) and vice versa.

Most home networks are set up this way:

  1. The ISP gives you one IP address for connecting to the Internet.
  2. That connection (usually a DSL modem or cable modem) connects to a gateway router whose IP address is set to the one provided by the ISP. This is a public address so that the Internet can reach the router and that public address is known to the Internet.
  3. The router's other network interface is connected to a local network with a private network address. None of the addresses within this local network are known to the Internet.
  4. The router performs NAT and other firewall functions.
Here is basically how NAT is performed:
  1. A host on the local network sends a packet to the gateway router to be sent to the Internet.
  2. The router changes the packet's source IP address and port numbers to its own public address and a different port and keeps track of that change.
  3. The packet goes out onto the Internet with the router being given as its source.
  4. A response to that packet comes back from the Internet and is received by the router.
  5. The router remembers the translation it did on the way out and translates the packet's destination back to the private address of the host on the local network that had sent the original packet out. In addition, the router could perform "stateful inspection" fire-walling by remembering that it had sent a packet and to whom and verifying that the incoming packet is a response from that destination.
  6. The router sends the translated packet to the host on the local network.
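
The bookkeeping in those six steps can be modeled with a small translation table. This is a toy simulation, not how any particular router is implemented: the class name NatRouter, its method names, the starting port 40000 and the public address 203.0.113.5 (taken from a range set aside for documentation) are all assumptions for the sake of the example.

```python
class NatRouter:
    """Toy model of the translation table a NAT router keeps."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._out = {}   # (private IP, private port) -> public port
        self._back = {}  # public port -> (private IP, private port)

    def outbound(self, src_ip: str, src_port: int):
        """Steps 2 and 3: rewrite the source and remember the change."""
        key = (src_ip, src_port)
        if key not in self._out:
            self._out[key] = self._next_port
            self._back[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._out[key]

    def inbound(self, dst_port: int):
        """Step 5: translate back, or return None if nothing was sent out."""
        return self._back.get(dst_port)


router = NatRouter("203.0.113.5")
print(router.outbound("192.168.1.10", 5000))  # ('203.0.113.5', 40000)
print(router.inbound(40000))                  # ('192.168.1.10', 5000)
print(router.inbound(40999))                  # None: unsolicited, so dropped
```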
The advantages of NAT include:
  1. It allows multiple local hosts to access the Internet through a single public IP address, thus conserving the limited pool of available public addresses. This is given much of the credit for staving off the much-feared depletion of IP addresses.
  2. It hides details about your local network from the outside, thus improving security.
Search Google for more information on "Network Address Translation".




Share and enjoy!

First uploaded on 2003 July 26.
Updated on 2011 July 18.