DWise1's Sockets Programming Page

Address Resolution (DNS)


Introduction

When you set up the address struct for a socket, it requires an IP address. But how often do you actually input an IP address? Or even know what the IP address of the server is? Probably almost never. Instead, you enter a domain or host name and somehow the network app (eg, your web browser) is happy with that.

That "somehow" is the Domain Name System (DNS). It's a hierarchical network of servers that collectively holds a list of all the hosts in all the domains and their IP addresses. When you need the IP address for a host name, your application sends a DNS request to its DNS servers (the ones in the TCP/IP properties page for your network connection -- if you're a DHCP client of your ISP, then DHCP has filled in that information for you, along with your IP address). If those servers don't have the information, then they'll ask their DNS servers up the hierarchy until one of them does know and responds with the IP address. Your DNS server will then respond to you, plus it will cache that information locally for a limited amount of time, so that if you ask for the same host name again it will be able to respond immediately. To see why caching is a good idea, consider visiting a web site: not only will you be requesting several pages from that server, but each page could also generate requests for several files (eg, each graphic on the page is a separate file and hence a separate file request).

The whole subject of how DNS works and how DNS servers are set up is a rather involved topic to which whole chapters and more are devoted in textbooks. If you want to know more, then refer to Wikipedia's article on DNS for a better and more complete explanation than I could provide.


But you don't need to know how DNS works in order to use it. To use DNS within our applications, we only need to know what functions to call and what to do with the data in the struct that they return.


The DNS Sockets Functions

It's fairly simple. There are just two functions, gethostbyname and gethostbyaddr, and both of them return a pointer to a struct hostent (which we'll look at more closely below).

You'll usually need to resolve a host or domain name to an IP address, so most of the time you'll use gethostbyname. However, there are times when you'll want to do a reverse look-up and obtain the domain name of a given IP address, in which case you would use gethostbyaddr. The nslookup utility does both kinds of look-up (the following command-line session has been scrubbed of local network information):

C:>nslookup www.yahoo.com
Non-authoritative answer:
Name:    www.yahoo-ht3.akadns.net
Address:  209.131.36.158
Aliases:  www.yahoo.com

C:>nslookup 209.131.36.158
Name:    f1.www.vip.sp1.yahoo.com
Address:  209.131.36.158

C:>
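The reverse look-up that nslookup did in the second command can be sketched with gethostbyaddr. This is only a minimal sketch that assumes UNIX/Linux headers. ReverseLookup is a name I made up for illustration, and the prototypes in the comment are the POSIX ones; Winsock declares gethostbyaddr with a const char * address and an int length instead.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* POSIX prototypes, as declared in <netdb.h>:
 *   struct hostent *gethostbyname(const char *name);
 *   struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type);
 */

/* Hypothetical helper: reverse-resolve a dotted-decimal IPv4 string.
 * Copies the host name into out (at most outlen bytes, always terminated).
 * Returns 0 on success, -1 if the string is not an IPv4 address or if no
 * name could be found for it. */
int ReverseLookup(const char *dotted, char *out, size_t outlen)
{
    struct in_addr ip;       /* binary address, network byte order */
    struct hostent *host;

    if (dotted == NULL || out == NULL || outlen == 0)
        return -1;
    out[0] = '\0';

    /* Convert the text to binary; anything but 1 means "not an address" */
    if (inet_pton(AF_INET, dotted, &ip) != 1)
        return -1;

    /* Ask for the name that belongs to that address */
    host = gethostbyaddr(&ip, sizeof(ip), AF_INET);
    if (host == NULL || host->h_name == NULL)
        return -1;

    /* Copy the name out of the static struct before it gets overwritten */
    strncpy(out, host->h_name, outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}
```

Calling ReverseLookup("209.131.36.158", buf, sizeof(buf)) would fill buf with whatever name your own DNS server reports for that address, which may well differ from the session above.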

A Simple Example

Here is a simple function, ResolveName, that you can include in your code and use to resolve a domain/host name into an IP address that you can copy directly into a sockaddr_in struct.

Review the section on setting up a socket address. Normally an IP address will be typed in as a dotted-decimal string which would be a string like "192.168.0.1". That's fine and good for input and output, but before it's used in a socket it must be converted to a 32-bit binary number in network byte order. Normally, we would feed the dotted-decimal string to the function, inet_addr, which would then return the 32-bit binary equivalent IP address. ResolveName returns the IP address as a 32-bit binary IP address in network byte order, so there's no need to convert it further.
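To make the inet_addr conversion concrete, here is a minimal sketch. DottedToBinary is a hypothetical helper of my own naming, and it assumes UNIX/Linux headers (Winsock declares inet_addr in winsock2.h). Note that inet_addr signals failure with INADDR_NONE, which is also the value of the perfectly legal address "255.255.255.255", so this sketch cannot tell the two apart.

```c
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: convert "a.b.c.d" to a 32-bit binary address in
 * network byte order, the form sin_addr.s_addr expects.
 * Returns 0 on success, -1 if inet_addr rejects the string. */
int DottedToBinary(const char *dotted, in_addr_t *out)
{
    in_addr_t value = inet_addr(dotted);

    if (value == INADDR_NONE)   /* failure, or "255.255.255.255" */
        return -1;

    *out = value;               /* already in network byte order */
    return 0;
}
```

Because the result is in network byte order, its bytes sit in memory in the same order as in the string: for "192.168.0.1" the first byte is 192 and the last is 1, whatever the byte order of your machine.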

Here's the sample code from that section you just reviewed, modified to use ResolveName:

int StuffSockAddr_inWithName(struct sockaddr_in *addr, const char *domain_name, unsigned short port)
{
    unsigned long ip;   /* binary IP address, in network byte order */
    
    if (ResolveName(domain_name, &ip) == -1)
        return -1;  /* for error handling in the calling function */
            
    memset(addr, 0, sizeof(*addr));         /* Zero out structure */
    addr->sin_family      = AF_INET;        /* Internet address family */
    addr->sin_addr.s_addr = ip;             /* IP address */
    addr->sin_port        = htons(port);    /* Port */

    return 0;   /* success */
}    

And here's the ResolveName function itself, with all the explanation in the commenting:

/***************************************************************************
 * Function name : ResolveName
 *    returns    : int -- error/success indication:
 *                          -1 indicates failure, 0 indicates success
 *    arg1       : const char name[] -- C-style character string containing 
 *                      the domain or host name to be resolved into an IP 
 *                      address.
 *    arg2       : unsigned long *addr -- pointer to an unsigned long 
 *                      variable into which to save the IP address, in binary 
 *                      form in network byte order.  This binary form is as
 *                      required by the sin_addr field in struct sockaddr_in.
 * Description   : This function resolves the domain name to an IP address by
 *                          calling gethostbyname and passing it the domain 
 *                          name.
 *                      If the return value is NULL, the function failed to
 *                          resolve the name and the function returns a -1
 *                          to indicate failure.
 *                      Else, the return value points to a hostent structure
 *                          that contains the information on that domain.
 *                          In this case, the IP address is copied to the 
 *                          variable pointed to by the addr parameter and
 *                          the function returns a 0 to indicate success.
 * Notes         : This function does not create a hostent struct, but rather
 *                      can only create a pointer to one.  The hostent struct
 *                      pointer value returned by gethostbyname points to a
 *                      static variable that will be overwritten by the next
 *                      socket function that could affect it.  
 *                      Therefore, before we exit this function we make sure
 *                      to copy from that struct any data that we need.
 *                 The address is copied with memcpy (declared in string.h)
 *                      rather than read through a cast to unsigned long *,
 *                      because unsigned long is 8 bytes on many 64-bit 
 *                      systems while an IPv4 address is only 4 bytes.
 *                 This sample was written with a minimum amount of error 
 *                      checking and reporting.  Since the details of error
 *                      handling of sockets functions is implementation 
 *                      dependent (ie, handled differently in UNIX/Linux than
 *                      in Winsock), I will leave it to you to elaborate the
 *                      code as you need to.
 */
int ResolveName(const char name[], unsigned long *addr)
{
    struct hostent *host;            /* Structure containing host information */
    struct in_addr ip;               /* Holds the 4-byte IPv4 address */

    /* Try to resolve the name, testing for failure */
    if ((host = gethostbyname(name)) == NULL)
    {
        /* failed, so output error message and return a -1 for failure */
        fprintf(stderr, "gethostbyname() failed\n");
        return -1;
    }

    /* copy exactly the 4 bytes of the first address in the list */
    memcpy(&ip, host->h_addr_list[0], sizeof(ip));

    /* return the binary, network byte ordered address 
        through the pointer parameter */
    *addr = ip.s_addr;
    
    /* return a 0 for success */
    return 0;
}

One more note that will be explained more fully below. We obtained the IP address from the address list array in the hostent struct. The declaration of struct hostent includes some macro definitions that simplify its use, much as was the case with the sockaddr_in declaration (look at it for yourself; go to your compiler's INCLUDE directory and find the header file that contains the declaration for struct sockaddr_in).

Specifically, this macro definition:

#define h_addr h_addr_list[0] /* for backward compatibility */
allows us to write host->h_addr wherever ResolveName would otherwise write host->h_addr_list[0].

I only bring it up here because you may see other authors using the macro instead of the field name. As I said, more on that in the next section where we take a closer look at the hostent struct.
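The header comment on ResolveName leaves error reporting up to you. As a starting point, here is a minimal sketch for UNIX/Linux, where a failed gethostbyname sets the global h_errno to one of a handful of codes defined in netdb.h. ResolveErrorText and its message strings are my own inventions for illustration. On Winsock you would call WSAGetLastError() instead, which this sketch does not cover.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* asks glibc to expose h_errno and its codes */
#endif
#include <stdio.h>
#include <string.h>
#include <netdb.h>

/* Hypothetical helper: turn an h_errno code into a short message.
 * The wording of the messages is mine, not the system's. */
const char *ResolveErrorText(int code)
{
    switch (code)
    {
        case HOST_NOT_FOUND: return "host not found";
        case TRY_AGAIN:      return "temporary failure, try again later";
        case NO_RECOVERY:    return "unrecoverable name server error";
        case NO_DATA:        return "name is valid but has no address";
        default:             return "unknown resolver error";
    }
}

/* Hypothetical helper: report why the most recent gethostbyname or 
 * gethostbyaddr call failed.  Call it right after the NULL return,
 * before any other resolver call can change h_errno. */
void ReportResolveFailure(const char *name)
{
    fprintf(stderr, "could not resolve %s: %s\n",
            name, ResolveErrorText(h_errno));
}
```

In ResolveName, the fprintf in the failure branch could then be replaced with a call to ReportResolveFailure(name).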


struct hostent

This is the heart of the sockets' support of DNS. The only purpose of the two DNS functions, gethostbyname and gethostbyaddr, is to stuff a hostent struct with all the DNS information about that host.

Here is the definition of the hostent struct:

#include <netdb.h>

    struct hostent 
    {
            char    *h_name;        /* official name of host */
            char    **h_aliases;    /* alias list */
            int     h_addrtype;     /* host address type */
            int     h_length;       /* length of address */
            char    **h_addr_list;  /* list of addresses */
    };
#define h_addr  h_addr_list[0]  /* for backward compatibility */

The members of the hostent structure are:

h_name
The official name of the host.
h_aliases
These are alternative names for the host, represented as a null-terminated vector of strings.
h_addrtype
This is the host address type; in practice, its value is always either AF_INET or AF_INET6, with the latter being used for IPv6 hosts. In principle other kinds of addresses could be represented in the database as well as Internet addresses; if this were done, you might find a value in this field other than AF_INET or AF_INET6.
h_length
This is the length, in bytes, of each address.
h_addr_list
This is the vector of addresses for the host. (Recall that the host might be connected to multiple networks and have different addresses on each one.) The vector is terminated by a null pointer.
h_addr
This is a synonym for h_addr_list[0]; in other words, it is the first host address.

As I noted earlier, gethostbyname and gethostbyaddr both return a pointer to a statically allocated struct. That means that one and only one such struct exists and the next call to any function that would modify the struct will overwrite it. That means that if there's something in that struct that you want to use later, then you need to copy it to a variable declared within your application.
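Copying the data out might be sketched like this. CopyHostAddresses is a hypothetical helper of my own naming; it assumes UNIX/Linux headers and only handles IPv4 (AF_INET) results, giving up on anything else.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper: copy up to max IPv4 addresses out of a hostent
 * into storage owned by the caller, so that they survive the next call
 * that overwrites the static struct.
 * Returns the number of addresses copied (0 if there is nothing usable). */
size_t CopyHostAddresses(const struct hostent *he, struct in_addr *out, size_t max)
{
    size_t n = 0;

    if (he == NULL || out == NULL || he->h_addr_list == NULL)
        return 0;

    /* Only IPv4 entries fit in a struct in_addr */
    if (he->h_addrtype != AF_INET ||
        (size_t) he->h_length != sizeof(struct in_addr))
        return 0;

    /* The list ends with a null pointer */
    while (n < max && he->h_addr_list[n] != NULL)
    {
        memcpy(&out[n], he->h_addr_list[n], sizeof(struct in_addr));
        ++n;
    }

    return n;
}
```

You would call it right after gethostbyname returns, for example with an array of eight struct in_addr; any addresses beyond the size of the array are simply left out.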

Winsock's hostent is a bit different, but still very close. Windows documentation notes:

The hostent structure is used by functions to store information about a given host, such as host name, IPv4 address, and so forth. An application should never attempt to modify this structure or to free any of its components. Furthermore, only one copy of the hostent structure is allocated per thread, and an application should therefore copy any information that it needs before issuing any other Windows Sockets API calls.
Winsock's declaration and field definitions are:

#include <winsock2.h>

typedef struct hostent 
{   
    char FAR* h_name;  
    char FAR  FAR** h_aliases;  
    short h_addrtype;  
    short h_length;  
    char FAR  FAR** h_addr_list;
} HOSTENT,  *PHOSTENT,  FAR *LPHOSTENT;

Members

h_name
Official name of the host (PC). If using the DNS or similar resolution system, it is the Fully Qualified Domain Name (FQDN) that caused the server to return a reply. If using a local "hosts" file, it is the first entry after the IP address.
h_aliases
A NULL-terminated array of alternate names.
h_addrtype
The type of address being returned.
h_length
This is the length, in bytes, of each address.
h_addr_list
A NULL-terminated list of addresses for the host. Addresses are returned in network byte order. The macro h_addr is defined to be h_addr_list[0] for compatibility with older software.

Note that Winsock typedefs struct hostent as HOSTENT, so that your C code will not need to use the struct keyword all the time. If you write Winsock code, you should get into the habit of taking advantage of the HOSTENT typedef.


I know that the first time I looked at the documentation on hostent, I couldn't quite understand what it was telling me, so I wrote a program to play with it. I recommend that you do the same.

Here is a function built from the code I had written:

int DisplayHostEnt(const char *name)
{
    struct hostent *he;
    int  i;
    struct in_addr addr;
    
    he = gethostbyname(name);

    if (he == NULL)
    {
        fprintf(stderr,"gethostbyname failed\n");
        return -1;  /* return -1 for error */
    }

    printf("h_name = %s\n",he->h_name);

    if (he->h_aliases[0] == NULL)
        printf("No aliases.\n");
    else
    {
        printf("Aliases:\n");
        for (i = 0; he->h_aliases[i] != 0; ++i) 
        {
            printf("  %d. %s\n",i+1,he->h_aliases[i]);
        }
    }
    
    /* original code had an array of address family strings 
     *   that was indexed by h_addrtype, so I printed out the name.
     *   Left that out here to keep from cluttering up the web page.
     *      AF_INET == 2
     *      AF_INET6 varies by platform (eg, 10 on Linux, 23 in Winsock)
     *          and it's not defined on all platforms.
     */
    printf("h_addrtype = %d\n",he->h_addrtype);
    
    printf("h_length = %d\n",he->h_length);

    if (he->h_addr_list == NULL)
        printf("No h_addr_list present.\n");
    else
    {
        printf("h_addr_list:\n");
        for (i = 0; he->h_addr_list[i] != 0; ++i) 
        {
            memcpy(&addr, he->h_addr_list[i], sizeof(struct in_addr));
            printf("  Addr #%d: %s\n",i,inet_ntoa(addr));
        }
    }
    
    return 0;  /* for success */
}

Running the program from which I pulled that code, I get this output (scrubbed for security reasons):

Hostname = myPC
h_name = myPC.myemployer.com
No aliases.
h_addrtype = AF_INET [2]
h_length = 4
h_addr_list:
  Addr #0: 192.168.8.180


Resolving Service Names

As I noted in my section on ports (scroll down a bit once you get there), ports 0 through 1023 are the "Well Known Ports" that are reserved for and associated with standard services like telnet, ftp, http, ntp. Where this is leading us is that we will want to be able to accept a service name and be able to resolve it to a port number.

Well, we do have the capability. And it is almost exactly like using hostent.


The SERVICES File

First, what are the service names and what ports do they belong to? On each computer with TCP/IP there should be a file named SERVICES. On UNIX and Linux systems it should be in the /etc directory. On Windows it tends to move around a bit, but it should be under the system directory in System32\Drivers\ETC. BTW, the same directory contains the HOSTS file, into which you can enter host names and their associated IP addresses as part of host name resolution; basically a local component to the DNS process.

SERVICES is a text file. Here is a short excerpt from it:

# Copyright (c) 1993-1999 Microsoft Corp.
#
# This file contains port numbers for well-known services defined by IANA
#
# Format:
#
# <service name>  <port number>/<protocol>  [aliases...]   [#<comment>]
#

echo                7/tcp
echo                7/udp
discard             9/tcp    sink null
discard             9/udp    sink null
systat             11/tcp    users                  #Active users
systat             11/tcp    users                  #Active users
daytime            13/tcp
daytime            13/udp
qotd               17/tcp    quote                  #Quote of the day
qotd               17/udp    quote                  #Quote of the day
chargen            19/tcp    ttytst source          #Character generator
chargen            19/udp    ttytst source          #Character generator
ftp-data           20/tcp                           #FTP, data
ftp                21/tcp                           #FTP. control
telnet             23/tcp
smtp               25/tcp    mail                   #Simple Mail Transfer Protocol
time               37/tcp    timserver
time               37/udp    timserver

One thing you'll notice is that a lot of the services run on both tcp and udp. Another thing you'll notice is that some only run on one protocol (eg, ftp, telnet, and smtp only run with the tcp protocol).
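The line format given in the file's own comments is simple enough to pick apart with sscanf. The following is just a sketch to make the format concrete, since getservbyname does this work for you; struct ServiceEntry and ParseServicesLine are names I made up, and the sketch ignores the aliases and the comment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of the SERVICES file */
struct ServiceEntry
{
    char name[32];      /* service name, eg "qotd" */
    int  port;          /* port number in host byte order */
    char proto[16];     /* protocol, eg "tcp" or "udp" */
};

/* Hypothetical helper: parse one line of the form
 *     <service name>  <port number>/<protocol>  [aliases...]  [#<comment>]
 * Returns 0 on success, -1 for blank, comment, or malformed lines. */
int ParseServicesLine(const char *line, struct ServiceEntry *out)
{
    struct ServiceEntry tmp;

    /* name up to whitespace or '#', then port, '/', and protocol */
    if (sscanf(line, " %31[^# \t\r\n] %d/%15[^# \t\r\n]",
               tmp.name, &tmp.port, tmp.proto) != 3)
        return -1;

    if (tmp.port < 0 || tmp.port > 65535)
        return -1;

    *out = tmp;
    return 0;
}
```

Feeding it the qotd line for udp from the excerpt gives the name "qotd", the port 17, and the protocol "udp"; the alias and the comment are skipped.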


struct servent -- The Service Resolution Structure

Service resolution is almost completely analogous to domain name resolution. In place of struct hostent, we have struct servent. In place of gethostbyname and gethostbyaddr, we have getservbyname and getservbyport.

In UNIX/Linux, include netdb.h. In Winsock, include winsock2.h.

struct  servent 
{
    char  *s_name;      /* official name of service */
    char  **s_aliases;  /* alias list */
    int   s_port;       /* port service resides at */
    char  *s_proto;     /* protocol to use */
};

Members:

s_name
The official name of the service.
s_aliases
A NULL-terminated array of alternate names for the service.
s_port
The port number at which the service resides. Port numbers are returned in network byte order.
s_proto
The name of the protocol to use when contacting the service. Eg, "tcp", "udp".

Note that, as before, Winsock's declaration is slightly different and that it also creates typedefs to simplify usage. Other than that, the struct fields all have the same definitions as above:

typedef struct servent 
{  
    char FAR* s_name;  
    char FAR  FAR** s_aliases;  
    short s_port;  
    char FAR* s_proto;
} SERVENT,  *PSERVENT,  FAR *LPSERVENT;


The Service Resolution Functions

Just as with hostent, we have two functions that will return a servent struct: getservbyname and getservbyport. Most of the time, you will use getservbyname in order to resolve a service name to its associated port so that you can set up the sockaddr_in struct to connect to the server. Then on occasion you may want to use the reverse lookup function, getservbyport, to see what service is on a particular well-known port. Either function will return a servent struct with the same information, just as with hostent and its two functions. You're in familiar territory here!

Both functions take a protocol name (eg, "tcp" or "udp") as their second argument. The first argument is the service name for getservbyname and the port number, in network byte order, for getservbyport.
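Here is a minimal sketch of the reverse look-up with getservbyport, assuming UNIX/Linux headers. PortToServiceName is a hypothetical helper of my own naming. The prototypes in the comment are the POSIX ones, and the result depends on what your system's services database contains.

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netdb.h>

/* POSIX prototypes, as declared in <netdb.h>:
 *   struct servent *getservbyname(const char *name, const char *proto);
 *   struct servent *getservbyport(int port, const char *proto);
 */

/* Hypothetical helper: look up the service name for a port number given
 * in host byte order.  Copies the name into out (always terminated).
 * Returns 0 on success, -1 if the port is not in the services database. */
int PortToServiceName(unsigned short port, const char *proto,
                      char *out, size_t outlen)
{
    struct servent *serv;

    if (out == NULL || outlen == 0)
        return -1;
    out[0] = '\0';

    /* getservbyport wants the port in network byte order */
    serv = getservbyport(htons(port), proto);
    if (serv == NULL || serv->s_name == NULL)
        return -1;

    /* Copy the name out of the static struct before it gets overwritten */
    strncpy(out, serv->s_name, outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}
```

With the SERVICES excerpt shown earlier, PortToServiceName(7, "tcp", buf, sizeof(buf)) would be expected to put "echo" in buf. On a system without a services database it simply returns -1.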


Using Service Resolution

Just as with using struct hostent and gethostbyname, the code for using servent and getservbyname is fairly simple. And while I haven't yet gone through the exercise of printing out all the information in a servent struct, you might want to give it a whirl.

Though I guess I should at some point describe why you would want to resolve a service name. I've written a udp time client, UdpTimeC, which runs from the command line and which expects the user to enter a time server (either by name or by IP address; my code figures out which it has) and the port, either by port number or by service name. It will accept either ntp (port 123) or time (port 37). Now, I could require the user to remember that the time service is on port 37 or that NTP is on port 123, but that wouldn't be very user-friendly, now would it? Instead, he can enter in either "time" or "ntp" and I use getservbyname to resolve that to a port number, which is what I need to complete the sockaddr_in struct.
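Telling a host name from an IP address can be done in more than one way. The following sketch shows one possibility and is not necessarily how UdpTimeC does it: try to parse the string as a dotted-decimal address and treat it as a name if that fails. IsDottedDecimal is a hypothetical name, and the sketch assumes UNIX/Linux headers and IPv4 only.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if text is a complete dotted-decimal
 * IPv4 address (four parts, each 0 through 255), 0 otherwise.
 * A 0 means the caller should treat text as a host name and resolve it. */
int IsDottedDecimal(const char *text)
{
    struct in_addr scratch;     /* parsed address, not used further */

    if (text == NULL)
        return 0;

    /* inet_pton returns 1 only when the whole string is a valid address */
    return inet_pton(AF_INET, text, &scratch) == 1;
}
```

A client could then call inet_addr directly when IsDottedDecimal returns 1 and fall back on gethostbyname when it returns 0.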

The following is a function I had written to resolve the service passed to it to a port number:

/***************************************************************************
 * Function name : ResolveService
 *    returns    : unsigned short -- 
 *                      on failure, returns 0xFFFF, an impossible value for 
 *                          a well-known port.
 *                      on success, returns the port number converted to
 *                          network byte order.
 *    arg1       : char service[] -- C-style character string containing the 
 *                      service name to be resolved into a port number.
 *    arg2       : char protocol[] -- C-style character string containing the
 *                      protocol name.  For most purposes, this will be either 
 *                      "tcp" or "udp".
 * Description   : This function resolves the service name to a port number by
 *                          calling getservbyname and passing it the service 
 *                          name and protocol name.
 *                      First, the service name is tested for containing a
 *                          numeric string, in which case the user had 
 *                          entered a port number so no look-up would be 
 *                          necessary.  That string is converted to a numeric
 *                          value which is converted to network byte order
 *                          and returned.
 *                      Else, getservbyname is called and its return value 
 *                          is tested.
 *                          If the return value is NULL, the function failed 
 *                          to resolve the name and the function returns 
 *                          0xFFFF to indicate failure.
 *                          Else, the return value points to a servent 
 *                          structure that contains the information on that 
 *                          service.  In this case, the port number is 
 *                          returned -- because it's from the servent struct,
 *                          it's already in network byte order.
 * Notes         : This function does not create a servent struct, but rather
 *                      can only create a pointer to one.  The servent struct
 *                      pointer value returned by getservbyname points to a
 *                      static variable that will be overwritten by the next
 *                      socket function that could affect it.  
 *                      Therefore, before we exit this function we make sure
 *                      to copy from that struct any data that we need.
 *                 This sample was written with a minimum amount of error 
 *                      checking and reporting.  Since the details of error
 *                      handling of sockets functions is implementation 
 *                      dependent (ie, handled differently in UNIX/Linux than
 *                      in Winsock), I will leave it to you to elaborate the
 *                      code as you need to.
 */
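unsigned short ResolveService(char service[], char protocol[])
{
    struct servent *serv;        /* Structure containing service information */
    unsigned short port;         /* Port to return */
    char *end;                   /* First character that strtol did not convert */
    long value;                  /* Numeric value of the service string */

    /* strtol is declared in stdlib.h.  Unlike atoi, it lets us tell whether
     *   the whole string was a number, so that "123abc" is not mistaken
     *   for port 123. */
    value = strtol(service, &end, 10);

    if (end == service || *end != '\0')  /* Is port numeric?  */
    {
        /* Not numeric.  Try to find as name */
        if ((serv = getservbyname(service, protocol)) == NULL)  
        {
            fprintf(stderr, "getservbyname() failed\n");
            return 0xFFFF;  /* to signal failure */
        }
        else 
            port = (unsigned short) serv->s_port;   /* Found port (network byte order) by name */
    }
    else if (value <= 0 || value >= 0xFFFF)
    {
        /* Numeric, but not usable as a port (0xFFFF is our failure value) */
        fprintf(stderr, "port number out of range\n");
        return 0xFFFF;  /* to signal failure */
    }
    else    /* it's already a port number */
        port = htons((unsigned short) value);  /* Convert port to network byte order */

    return port;
}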


Newer Ways

Now that you have learned how to do domain name and service name resolution with struct hostent and struct servent, you need to know that these are not the only functions available, nor necessarily the best. gethostbyname and gethostbyaddr, at least, are on their way out: POSIX.1-2001 marked them as obsolescent and POSIX.1-2008 removed them, so you'll eventually need to learn how to use their replacements. I haven't done anything with the new functions yet, so all I can do is to introduce you to them and then you can continue the research on your own.

The following is an abridged copy of the Linux man page for getaddrinfo; I've inserted "[SNIP]" wherever I've cut something out. The functions that it makes reference to, getipnodebyname and getipnodebyaddr, do the same jobs as gethostbyname and gethostbyaddr do, only they appear to work a bit more along the lines that getaddrinfo does. I'm not sure what their relationship is to our old friends, gethostbyname and gethostbyaddr, but the man pages are quite clear that getipnodebyname and getipnodebyaddr are on their way out and are being replaced by getaddrinfo.

getaddrinfo(3)             Linux Programmer's Manual            getaddrinfo(3)

NAME
       getaddrinfo,  freeaddrinfo,  gai_strerror - network address and service
       translation

SYNOPSIS
       #include <sys/types.h>
       #include <sys/socket.h>
       #include <netdb.h>

       int getaddrinfo(const char *node, const char *service,
                       const struct addrinfo *hints,
                       struct addrinfo **res);

       void freeaddrinfo(struct addrinfo *res);

       const char *gai_strerror(int errcode);

DESCRIPTION
       The getaddrinfo(3) function combines the functionality provided by  the
       getipnodebyname(3),   getipnodebyaddr(3),  getservbyname(3),  and  get-
       servbyport(3) functions  into  a  single  interface.   The  thread-safe
       getaddrinfo(3)  function  creates one or more socket address structures
       that can be used by the bind(2) and connect(2) system calls to create a
       client or a server socket.

       The  getaddrinfo(3)  function  is  not  limited to creating IPv4 socket
       address structures; IPv6 socket address structures can  be  created  if
       IPv6 support is available.  These socket address structures can be used
       directly by bind(2) or connect(2), to prepare  a  client  or  a  server
       socket.

       The  addrinfo  structure  used  by this function contains the following
       members:

       struct addrinfo {
           int     ai_flags;
           int     ai_family;
           int     ai_socktype;
           int     ai_protocol;
           size_t  ai_addrlen;
           struct sockaddr *ai_addr;
           char   *ai_canonname;
           struct addrinfo *ai_next;
       };

       getaddrinfo(3) sets res to point to a dynamically-allocated linked list
       of  addrinfo  structures, linked by the ai_next member.  There are sev-
       eral reasons why the linked list may have more than one addrinfo struc-
       ture,  including:  if  the  network host is multi-homed; or if the same
       service is available from multiple socket  protocols  (one  SOCK_STREAM
       address and another SOCK_DGRAM address, for example).

       The members ai_family, ai_socktype, and ai_protocol have the same mean-
       ing as the corresponding parameters in the socket(2) system call.   The
       getaddrinfo(3) function returns socket addresses in either IPv4 or IPv6
       address family, (ai_family will be set to either AF_INET or  AF_INET6).

       The  hints  parameter specifies the preferred socket type, or protocol.
       A NULL hints specifies that any network address or protocol is  accept-
       able.  If this parameter is not NULL it points to an addrinfo structure
       whose ai_family, ai_socktype, and ai_protocol members specify the  pre-
       ferred socket type.  AF_UNSPEC in ai_family specifies any protocol fam-
       ily (either IPv4 or IPv6, for example).  0 in ai_socktype or  ai_proto-
       col  specifies  that any socket type or protocol is acceptable as well.
       The ai_flags member specifies additional options, defined below.   Mul-
       tiple  flags  are specified by logically OR-ing them together.  All the
       other members in the hints parameter must contain either 0, or  a  null
       pointer.

       The  node or service parameter, but not both, may be NULL.  node speci-
       fies either a numerical  network  address  (dotted-decimal  format  for
       IPv4, hexadecimal format for IPv6) or a network hostname, whose network
       addresses are looked up and resolved.  If hints.ai_flags  contains  the
       AI_NUMERICHOST flag then the node parameter must be a numerical network
       address.  The AI_NUMERICHOST flag suppresses  any  potentially  lengthy
       network host address lookups.

       The  getaddrinfo(3)  function  creates a linked list of addrinfo struc-
       tures, one for each network address subject to any restrictions imposed
       by  the  hints parameter.  
       
[snip]

       service  sets  the  port  number  in the network address of each socket
       structure.  If service is NULL the port number will be left  uninitial-
       ized.   If AI_NUMERICSERV is specified in hints.ai_flags and service is
       not NULL, then service must point to a string containing a numeric port
       number.   This flag is used to inhibit the invocation of a name resolu-
       tion service in cases where it is known not to be required.

       The freeaddrinfo(3) function frees the memory that  was  allocated  for
       the dynamically allocated linked list res.

RETURN VALUE
       getaddrinfo(3)  returns  0 if it succeeds, or one of the following non-
       zero error codes:

[snip]

       The gai_strerror(3) function translates these error codes  to  a  human
       readable string, suitable for error reporting.
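To give you a starting point for that research, here is a minimal sketch of a getaddrinfo call that does the jobs of both ResolveName and ResolveService in one step. It is only a sketch under stated assumptions: ResolveWithGetaddrinfo is a name I made up, it asks only for IPv4 TCP addresses, it keeps just the first entry of the linked list, and it assumes UNIX/Linux headers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* asks glibc to declare getaddrinfo */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper: resolve a host (name or dotted-decimal string) and
 * a service (name or port number string) into a ready-to-use sockaddr_in.
 * Returns 0 on success, or the non-zero getaddrinfo error code. */
int ResolveWithGetaddrinfo(const char *node, const char *service,
                           struct sockaddr_in *out)
{
    struct addrinfo hints;          /* what kind of addresses we want */
    struct addrinfo *res = NULL;    /* head of the returned linked list */
    int rc;

    memset(&hints, 0, sizeof(hints));   /* unused members must be 0/NULL */
    hints.ai_family   = AF_INET;        /* IPv4 only, to fit a sockaddr_in */
    hints.ai_socktype = SOCK_STREAM;    /* TCP */

    rc = getaddrinfo(node, service, &hints, &res);
    if (rc != 0)
    {
        fprintf(stderr, "getaddrinfo() failed: %s\n", gai_strerror(rc));
        return rc;
    }

    /* We asked for AF_INET, so ai_addr points to a struct sockaddr_in.
     * Copy the first entry, then free the whole list. */
    memcpy(out, res->ai_addr, sizeof(*out));
    freeaddrinfo(res);

    return 0;
}
```

Notice that the port comes back already in network byte order inside the sockaddr_in, and that the list is dynamically allocated rather than static, which is why it must be released with freeaddrinfo. A more careful client would walk the ai_next links and try each address in turn.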


Share and enjoy!

First uploaded on 2007 October 04.
Updated 2011 July 18.