DWise1's Sockets Programming Pages


HARD HAT AREA

WATCH YOUR STEP

Purpose of these Pages

The purpose of these pages is two-fold:
  1. To share with the community what I have been learning about sockets programming.
  2. In the process, to organize and clarify the information for myself.
I am in the process of learning sockets programming. I discovered long ago that one of the best ways to learn something is to teach it. So in that vein, by trying to explain it to you, I am also helping myself learn the material.

These pages do not even begin to intend to provide comprehensive coverage of the subject. There is a multitude of books and sites that explain the concepts better, a few of which I list on my resources page. All I really intend to do here is to cover the basics needed to get started, though over time I will undoubtedly expand upon some topics as needed.


MAJOR CAVEAT!

I first developed these pages in 2002 through 2008. Because IPv6 has only just recently started going on-line (as of mid-2011), I concentrated on IPv4 and had not yet studied IPv6. As a result, these pages only cover IPv4. Also, some of the functions I use have been deprecated in favor of newer functions that also support IPv6, but I have not covered them yet.

Despite this deficiency, you should still be able to learn a lot from these pages and be able to get started with network programming. After I get everything back up on-line, I will revisit those issues and update these pages accordingly.


Introduction

Network programming is the writing of programs that will communicate over a network with programs running on another computer. There are several different programming models for accomplishing this. Indeed, at first almost every different operating system had its own proprietary programming model, a condition which continues to exist to some extent. But with the growth in popularity of TCP/IP as a standard networking protocol, a native Application Programming Interface (API) for TCP/IP has also become more popular: Berkeley Sockets.

The Sockets programming model, AKA "Berkeley Sockets", was first introduced in 1983 in the 4.2BSD Unix system. That programming model consists of a set of data structures, predefined constants, and functions that perform system calls. Although it originated as a TCP/IP programming model, it appears that it can also be used for other protocol families, including UNIX domain, IPX/SPX, Xerox NS, X.25, SNA, DECnet, AppleTalk, and NetBIOS. The model is that flexible (we'll cover part of the reason for that in the section on addressing).

In addition, sockets programming is available in many different languages and development environments. While it is natively C (and hence also C++) and UNIX, I have also seen it in Perl, Visual Basic, Delphi, Java, Python, Windows, and LabVIEW. From what I've seen, the concepts and most of the core function names are the same -- if anything, the tendency is to expand on the core API rather than to change it appreciably. Therefore, what you learn in one development environment should be transferable to the others. By the way, my approach in these pages is in C.

Now, I have to admit that the idea of writing code that would network computers together scared me, so I approached it cautiously. This meant that I wasted a lot of time in "analysis paralysis" researching all that I could before I would try to write down the first line of code. But then I finally got started writing sockets applications by playing with the code from Donahoo and Calvert's The Pocket Guide to TCP/IP Sockets: C Version -- it is clear, concise, and a bargain at $15. I recommend it highly to beginning sockets programmers.

Sockets programming really is a lot easier to do than it seems at first.


A Few Basic Comments


Some Basic Networking Caveats

These are what I think are some of the more common "stupid mistakes" that you could make in network programming, especially when you're getting started. Remember that I drew up this list mainly from my personal experience:
Mistake #1. Trying to talk to a host on a LAN with the wrong network address.
Even if both computers are connected together and sitting side-by-side, if their IP addresses do not place them on the same network (i.e., if the network portion of their IP addresses is not the same), then they will not be able to talk directly with each other.

The reason for this is that TCP/IP has two entirely different ways of resolving the IP address to a physical MAC address and which one it uses depends on whether the hosts are on the same network or not.

Mistake #2. Mixing up TCP and UDP.
Not only are TCP and UDP two different protocols, but they also use two separate sets of ports; e.g., port 80/TCP is entirely different and separate from port 80/UDP. So a TCP client will not be able to talk with a UDP server, nor will a UDP client be able to talk with a TCP server.

Don't laugh. In preparing to answer a forum question, I compiled the poster's client code and tried to connect it to my server. Well, duh! She was using UDP and trying to connect to a TCP server, and I made the exact same mistake! The same thing happened when I wrote my first rtime server and couldn't get a known-good client to connect to it. So it's a lot easier to make this mistake than you may think.

Mistake #3. Not taking care of byte order.
This one you'll encounter as you start to write your programs. When a computer stores a multi-byte value, it can either start with the higher-order byte ("big-endian") or with the lower-order byte ("little-endian"). As long as the data stays within a given computer and is only shared with computers of the same type, there's no problem and the entire issue of byte order is completely transparent to the user. But as soon as you start sharing that data with any possible computer in the world, byte order becomes an important issue.

The standard byte order on the internet is big-endian, high-order byte first. Sockets provides functions for converting host byte order into network byte order, so the byte order of the host can remain transparent to the programmer. However, the programmer must still remember to use the functions.

The two most common places where not using the built-in functions will cause problems are:

When loading the port number into an address structure.
If the host byte order is different, then the port number you have entered will be entirely different from what you think it is -- e.g., port 23 will become port 5888 and port 80 will become port 20480. The client will try to connect to the wrong port and will never be able to connect, unless the same mistake was made in both the client and the server, in which case the situation turns into a future debugging nightmare when a third application that was written correctly tries to connect.

When loading or reading from the data packet.
In this case, you will end up reading the data in reversed order or cause the destination host to read it in reversed order. In either case, a non-zero value will be misinterpreted and data corruption will have occurred. Again, if both hosts make the same mistake then the error will be masked for a time and cause troubleshooting headaches down the road.
I'll provide links to more complete explanations as I expand this portion of my site.


Topics

The following is a table of contents providing links to the topics on this sockets programming site. This table of contents currently provides the only access to these topics.

More links will be added as the topics are written and uploaded.

Sockets Programming Home Page

Basic TCP/IP Theory

IP Addresses

Working with Sockets

Sockets Applications

Windows Sockets (WinSock)

Miscellaneous Topics

  • Address Resolution (DNS)
    Accessing the Domain Name System (DNS) to convert a domain name into an IP address and vice versa.
  • Data Representation within a Packet
    General guidelines on how to format a packet, insert data into it, and extract data back out. Also referred to as "wire format." Includes an example (SNTP message format, RFC 2030) with techniques in C for inserting and extracting data.
  • Survey of RFC Data Formatting Specifications
    A survey of the RFCs of well-known services to learn how they format their data. Provides us with ideas for when we create our own protocols -- why re-invent the wheel?
  • Dealing With and Getting Around Blocking Sockets
    How to get your program to remain responsive while handling multiple sockets.

    Currently includes a brief discussion of server strategies.

  • Graceful Shutdown and Crash Detection
  • Server Strategies
  • Resources and Links

    Sample Network Applications with Source Code




    Share and enjoy!

    First uploaded on 2002 November 08.
    Updated on 2011 September 10.