Introduction to DNS

When we need to look up a phone number for a person or the address of a business, we consult a directory or the yellow pages. Similarly, when a computer wants to find the address of another computer, it looks it up via the Domain Name System (DNS). The Domain Name System is like a directory that computers use to locate other computers on the network.

One might ask when a computer needs to locate another computer. The answer is whenever it needs to send a message to, or get a message from, another system. For example, when you enter the name of a web site such as www.computergenome.com in your web browser, the browser needs to find the address of the web server of the computergenome web site. Likewise, when you send an email to homer.simpson@gmail.com, the email client needs to find the address of the gmail email server. Names such as "www.computergenome.com" or "homer.simpson@gmail.com" might look like valid addresses, but computers need another form of address, such as IPv4 or IPv6, to locate each other.

Computers use numerical addresses to locate each other; for example, the web server for www.computergenome.com might have an IPv4 address ????. Computers prefer addresses in the form of numbers instead of long strings because numbers are more efficient to process. IPv4 and IPv6 addresses are fixed-length, which makes them easier and faster for computers to process. Computers can also derive hierarchy from numerical addresses, which would be more difficult, or at least more confusing, to implement otherwise.
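
The fixed-length and hierarchy points can be checked with Python's standard ipaddress module. This is only a small sketch: 192.0.2.1 and 2001:db8::1 are reserved documentation addresses chosen for illustration, not the address of any site mentioned here, and the helper name address_size_in_bytes is made up for this example.

```python
import ipaddress


def address_size_in_bytes(text: str) -> int:
    """Return how many bytes the binary form of an IP address occupies."""
    return len(ipaddress.ip_address(text).packed)


# Illustrative documentation addresses (assumptions, not real servers).
EXAMPLE_V4 = "192.0.2.1"
EXAMPLE_V6 = "2001:db8::1"

# Every IPv4 address is 4 bytes and every IPv6 address is 16 bytes,
# no matter how long the textual form looks.
v4_size = address_size_in_bytes(EXAMPLE_V4)
v6_size = address_size_in_bytes(EXAMPLE_V6)

# An address is just a number, so it can be compared and masked cheaply.
v4_as_integer = int(ipaddress.ip_address(EXAMPLE_V4))

# Hierarchy: the leading bits identify the network the address belongs to.
in_network = ipaddress.ip_address(EXAMPLE_V4) in ipaddress.ip_network("192.0.2.0/24")
```

A domain name, by contrast, can be anything from a few characters to a couple of hundred, which is one reason it is translated to an address before any traffic is sent.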

In computer lingo, a string such as "www.computergenome.com" is known as a domain name, and converting a domain name to an IP address is known as resolving the domain name. The domain name service is the process that aids in resolving a domain name to its address (which may be an IPv4 or IPv6 address).

Domain name space

A common domain name such as "www.computergenome.com" is made up of several sub domains and is resolved in parts, one step at a time. A client breaks the domain name into parts starting from the right end. It first resolves .com (the top level domain), which takes the client to a server. Using that server, the client tries to resolve the domain computergenome.com, which takes it to the server responsible for that domain. Using this server, the client then tries to resolve the domain www.computergenome.com.
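
The right-to-left breakdown described above can be sketched in a few lines of Python. The function name resolution_steps is hypothetical and only illustrates the order in which the parts are considered; it does not contact any server.

```python
def resolution_steps(domain: str) -> list[str]:
    """List the domains to resolve, starting from the top level domain."""
    labels = domain.strip(".").split(".")
    # Walk from the rightmost label to the left, growing the name each step.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


steps = resolution_steps("www.computergenome.com")
# steps is ["com", "computergenome.com", "www.computergenome.com"]
```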

Domain name servers

A domain name server is a server that can receive a DNS request and return information related to the domain. However, a single domain name server would not be able to handle requests from all the computers on the internet. So there are several domain name servers, arranged in a hierarchy, which together are used to resolve a domain name. A domain name server is configured with a small set of domains called its "local zone". Any request for a domain that does not belong to the server's local zone is referred to other servers in the hierarchy.

For example, there might be DNS servers that contain domain information only for domains ending in .com, and other servers that contain information only for domains ending in .org. Organisations might also have their own DNS servers that contain information for their own domains, e.g. the "ACME" organisation might have an internal DNS server that contains information for the traps.acme.com domain, the elevator.acme.com domain, etc. In this way the information is distributed among different DNS servers.

It is not unusual for an internet service provider to provide its own domain name server as well. For instance, if you have an internet connection from an ISP such as Comcast, then Comcast would provide your computer with the address of a DNS server. That DNS server would in most cases be maintained by Comcast itself.

Root servers

In the hierarchy of the domain name service, root servers sit at the top. Root servers contain the list of IP addresses of the servers that manage the top level domains such as .com, .org, .net etc. The root servers cannot resolve a web site name such as "www.computergenome.com", but they can tell which servers contain information for names ending in .com, and these servers can then be approached for resolving www.computergenome.com. Currently there are 13 named root servers (each name is served by many machines around the world) and their addresses are well known. So other DNS servers already know about them without needing to ask anybody for their location.

Each root server is assigned a letter from A to M. The root servers have domain names of the form <letter>.root-servers.net, such as A.root-servers.net, B.root-servers.net, etc.
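
Because the naming scheme is regular, the full list of root server names can be generated rather than typed out. A small sketch (the constant name ROOT_SERVER_NAMES is chosen for this example):

```python
import string

# The letters a to m, one per root server name.
ROOT_LETTERS = string.ascii_lowercase[:13]

# Domain names are case-insensitive, so lower case is used here.
ROOT_SERVER_NAMES = [f"{letter}.root-servers.net" for letter in ROOT_LETTERS]
```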

Authoritative name servers

A domain name server might be assigned a couple of domains and be configured with the information for these domains. This server is said to be an authoritative server for these domains. If this server gets a request for a domain that is configured on it, its reply is said to be authoritative. Otherwise it can query another server and pass on the response, which is called a non-authoritative response.

A server can be configured to contain the IP address of a domain, or it can contain the IP address of the authoritative server for a domain. For example, a server can be configured with the IP address of www.computergenome.com, so when it receives a query for resolving www.computergenome.com it can reply with the IP address. In this case the server is the authoritative server for www.computergenome.com. Alternatively, the server can be configured with the IP address of the authoritative server for www.computergenome.com. Then, when it receives a query for www.computergenome.com, it replies with the IP address of the server that contains the IP address for www.computergenome.com.
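
The two configurations can be illustrated with a toy model. This is a sketch under stated assumptions, not how a real DNS server is implemented: the class ToyServer, its fields and the 192.0.2.x addresses are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    # name -> IP address this server can answer for directly
    addresses: dict[str, str] = field(default_factory=dict)
    # domain -> IP address of the authoritative server for that domain
    delegations: dict[str, str] = field(default_factory=dict)

    def query(self, name: str) -> tuple[str, str]:
        """Return ("answer", ip), ("referral", server_ip) or ("unknown", "")."""
        if name in self.addresses:
            return ("answer", self.addresses[name])
        # Try the longest matching suffix first: the full name, then its parents.
        labels = name.split(".")
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in self.delegations:
                return ("referral", self.delegations[suffix])
        return ("unknown", "")


# First configuration: the server holds the address itself.
holds_address = ToyServer(addresses={"www.computergenome.com": "192.0.2.10"})

# Second configuration: the server only knows who to ask.
holds_referral = ToyServer(delegations={"computergenome.com": "192.0.2.53"})
```

Querying holds_address for www.computergenome.com gives an answer, while the same query against holds_referral gives the address of the next server to contact.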

DNS Hierarchy

A Simple example

Let us take an example of how the address "en.wikipedia.org" gets resolved.

    1. The user types the address "http://en.wikipedia.org" in a web browser such as Chrome or Firefox.
    2. The web browser requests the operating system to get the IP address for the domain "en.wikipedia.org".
    3. The operating system looks into its database and sees that it does not have the IP address of "en.wikipedia.org".
    4. The operating system looks up the address of the configured domain name server (typically the ISP's) and sends a request to that server for resolving "en.wikipedia.org".
    5. The ISP domain name server looks at its database and finds that it does not know the IP address for the domain "en.wikipedia.org".
    6. The ISP domain name server contacts a root name server with the query and gets the IP addresses of the servers that maintain domains ending in .org.
    7. The query is sent to the DNS servers that maintain domains ending in .org. One of these servers contains the IP address of the DNS server for the wikipedia.org domain and returns this IP address in response to the query.
    8. The query is sent to the DNS server of the wikipedia.org domain, which knows the IP address for en.wikipedia.org and sends back the IP address for that domain.
    9. The client connects to the IP address for en.wikipedia.org and requests the web page contents.
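
From the application's point of view, the steps above collapse into a single call to the operating system. A minimal sketch using Python's standard socket module follows; the wrapper name resolve is an assumption for this example, while socket.getaddrinfo is the real standard library call.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Ask the operating system's resolver for the addresses of a host name."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        # sockaddr[0] is the IP address for both IPv4 and IPv6 results.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


# On a machine with internet access one might call:
#     resolve("en.wikipedia.org")
# The addresses returned depend on where and when the call is made.
```

The application never sees the root or .org servers; it only sees the final list of addresses, or an error if the name cannot be resolved.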

DNS caching

When a DNS server queries another server to resolve a domain name, it can cache the response. Each response carries a TTL (time to live) value, which states how long that particular response remains valid. The TTL is specified in seconds, and the DNS server can serve the cached response for that long without needing to ask another server again.

It is possible to set up a DNS server as a caching-only server, which means that it is not an authoritative server for any domain. This can be done to optimize web traffic. Consider the DNS server of an ISP such as Comcast. When Comcast's customers connect to the internet, they are given the IP address of the DNS server maintained by Comcast. This server can be a caching-only server, which stores the responses to every user's requests in its cache. When a user makes a request to resolve "www.computergenome.com" for the first time, Comcast's DNS server has to ask other DNS servers to resolve the domain, but it can cache this response. When any other user then requests the same domain, Comcast's server does not need to ask another server; it can simply look up the IP address in its cache and respond.
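
A cache that honours the TTL might be sketched as follows. The class DnsCache and its methods are hypothetical names for this example; the clock is passed in so that expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Store resolved addresses until their TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, time at which the entry expires)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            # The response is no longer valid, so forget it.
            del self._entries[name]
            return None
        return address
```

A caching-only server would call get first and only contact other DNS servers, followed by put, when get returns None.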

Local resolvers

When a DNS client needs to resolve a domain name, it generally asks the operating system to make the request. The operating system generally contains local resolver logic, which tries to resolve the domain name locally before contacting any DNS server.

The local resolver logic looks in a local database that can be used to resolve the domain name to an IP address. On most systems there is a hosts file that contains commonly used mappings. The best known example of such a mapping is localhost, which is mapped to the address 127.0.0.1 in this file. A typical hosts file looks like this

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost 
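
A hosts file in this format is simple enough to parse by hand. The sketch below uses a hypothetical function name, parse_hosts, and works on the file contents as a string so that it does not depend on where the file lives on a given system.

```python
def parse_hosts(text: str) -> dict[str, list[str]]:
    """Map each host name in hosts-file text to the addresses listed for it."""
    mapping: dict[str, list[str]] = {}
    for line in text.splitlines():
        # Everything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the address, the remaining fields are names for it.
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name, []).append(address)
    return mapping


SAMPLE_HOSTS = """\
##
# Host Database
##
127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost
"""

hosts = parse_hosts(SAMPLE_HOSTS)
# hosts["localhost"] holds both the IPv4 and the IPv6 loopback address.
```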

If the requested domain name is not found in the hosts file, then the local resolver looks into its cached DNS responses, since it may also keep a DNS cache of previous DNS requests. If the mapping is not available in the cache either, then it initiates a DNS request to the configured DNS servers.
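
The lookup order (hosts file, then cache, then DNS server) can be sketched as one function. Everything here is illustrative: local_resolve is a made-up name, the hosts data and cache are plain dictionaries, TTL handling is left out for brevity, and ask_dns_server stands in for whatever actually sends the DNS request.

```python
def local_resolve(name, hosts, cache, ask_dns_server):
    """Return (address, source) following the local resolver's lookup order."""
    # 1. The hosts file wins if it has an entry.
    if name in hosts:
        return hosts[name], "hosts file"
    # 2. Otherwise reuse a previously cached response.
    if name in cache:
        return cache[name], "cache"
    # 3. Only now is a request sent to a configured DNS server.
    address = ask_dns_server(name)
    cache[name] = address
    return address, "dns server"
```

Calling it twice for the same name shows the effect: the first call reports "dns server" as the source and the second reports "cache".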

DNS server configuration at client end

A DNS client needs to be configured with a set of DNS servers so that it can initiate a DNS request when needed. This information is generally configured on the system manually (either by an administrator or by the user) or by using DHCP. On most Linux systems the file /etc/resolv.conf can be viewed to see the configured DNS servers. A client can be configured with multiple DNS servers, but only one is contacted at a time. If the first DNS server does not respond, then the next configured DNS server is tried.

A sample resolv.conf file would look like this

nameserver 121.242.190.210
nameserver 4.2.2.3

The above file shows that the client is configured with two DNS servers. If the first server does not respond to a DNS resolution request then the second DNS server would be contacted.
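
Reading the nameserver lines and falling back from one server to the next might look like the sketch below. The function names parse_nameservers and first_responding are invented for this example, and query stands in for the real network request; it is assumed to raise TimeoutError when a server does not respond.

```python
def parse_nameservers(text: str) -> list[str]:
    """Return the nameserver addresses from resolv.conf text, in file order."""
    servers = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments ('#' or ';' at the start of a line).
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def first_responding(servers, query):
    """Try each server in turn; return (server, answer) from the first to reply."""
    for server in servers:
        try:
            return server, query(server)
        except TimeoutError:
            continue  # no response, move on to the next configured server
    raise TimeoutError("no configured DNS server responded")


SAMPLE_RESOLV_CONF = "nameserver 121.242.190.210\nnameserver 4.2.2.3\n"
configured = parse_nameservers(SAMPLE_RESOLV_CONF)
```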

Iterative vs Recursive address resolution

Recursive address resolution means resolving the complete domain name in one go. When the DNS client sends a request to a DNS server for resolving www.computergenome.com, the server returns the response to the client if it is the authoritative server for that domain or if it has cached the DNS response from a previous request. However, if the DNS server finds that it needs to make a DNS request to another server, then it has two options:

    1. Tell the client that it needs to contact some other server and return the address of that server e.g. address of the root server, or
    2. Perform the DNS queries for the sub-domain on its own and return the resolved address to the client

The first method of resolving the address is known as iterative address resolution, while the second method is known as recursive address resolution.

Iterative address resolution

In iterative address resolution, the client or the local resolver is given the address of the authoritative servers for a subdomain, which it then needs to query itself. For example, a client/local resolver trying to resolve "www.computergenome.com" would get the address of the authoritative server for the .com domain. When the authoritative server for the .com domain is queried, the address of the authoritative server for computergenome.com is obtained. When the authoritative server for computergenome.com is queried, the address for www.computergenome.com is received.

With the iterative method, address resolution might go like this:

    1. The application sends a request to the operating system to resolve the domain name "www.computergenome.com".
    2. The local resolver in the operating system first checks the hosts file to see whether the domain name is already configured. If the domain name is present, then that value is returned.
    3. If the hosts file does not contain the address, then the local resolver checks whether the information is present in its DNS cache. If it is present (and not expired), then the value from the local resolver's DNS cache is returned.
    4. If the local resolver does not find the entry in its cache either, then it looks up the configured DNS servers that it can contact. It then sends a request to the DNS server DNS1 for resolving the domain "www.computergenome.com".
    5. If the DNS server DNS1 contains the domain www.computergenome.com in its zone file or in its cache, then it returns the address of the domain to the client; otherwise it returns the address of a root server to the client.
    6. The local resolver sends the query to the root server for resolving the domain "www.computergenome.com".
    7. The root server does not know the address for "www.computergenome.com", so it returns the addresses of the authoritative servers for the .com domain to the local resolver.
    8. The local resolver contacts the authoritative server for the .com domain with the query "www.computergenome.com".
    9. The authoritative server for the .com domain does not know about the domain www.computergenome.com, but it contains an entry for computergenome.com, so it returns the authoritative server for computergenome.com to the local resolver.
    10. The local resolver sends the query for resolving www.computergenome.com to the authoritative server for computergenome.com.
    11. The authoritative server for computergenome.com contains the entry for the domain www.computergenome.com and returns the address of the domain to the local resolver.
    12. The local resolver returns the address of the domain "www.computergenome.com" to the application.
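
The referral-following part of the steps above (from the root server onwards) can be simulated with a toy network. This is only a sketch: the SERVERS table, the server addresses and the answer 192.0.2.10 are all made up, no real DNS traffic is sent, and the hosts file, cache and DNS1 steps are left out to keep it short.

```python
# Toy network: server IP -> what that server knows. All values are hypothetical.
SERVERS = {
    # root server: only knows who handles .com
    "198.51.100.1": {"answers": {}, "referrals": {"com": "198.51.100.2"}},
    # .com server: only knows who handles computergenome.com
    "198.51.100.2": {"answers": {}, "referrals": {"computergenome.com": "198.51.100.3"}},
    # authoritative server for computergenome.com
    "198.51.100.3": {"answers": {"www.computergenome.com": "192.0.2.10"}, "referrals": {}},
}
ROOT = "198.51.100.1"


def query(server_ip, name):
    """One request to one server: ("answer", ip) or ("referral", next_server_ip)."""
    server = SERVERS[server_ip]
    if name in server["answers"]:
        return "answer", server["answers"][name]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in server["referrals"]:
            return "referral", server["referrals"][suffix]
    raise LookupError(f"{server_ip} knows nothing about {name}")


def resolve_iteratively(name, start=ROOT, max_hops=10):
    """The local resolver itself follows each referral until it gets an answer."""
    server_ip, contacted = start, []
    for _ in range(max_hops):
        contacted.append(server_ip)
        kind, value = query(server_ip, name)
        if kind == "answer":
            return value, contacted
        server_ip = value  # a referral: ask the server we were pointed to
    raise LookupError("too many referrals")


address, contacted = resolve_iteratively("www.computergenome.com")
```

The list of contacted servers makes the defining property visible: every one of the three requests is sent by the local resolver itself.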

Recursive address resolution

In recursive address resolution the DNS server takes upon itself the responsibility of resolving the address of a domain name. However, this behaviour needs to be enabled at the server, otherwise it will default to iterative behaviour.

In recursive address resolution, a domain name such as "www.computergenome.com" might be resolved like this:

    1. The application (such as a web browser) sends a request to the operating system to resolve the domain name "www.computergenome.com".
    2. The local resolver in the operating system first checks the hosts file to see whether the domain name is already configured. If the domain name is present, then that value is returned.
    3. If the hosts file does not contain the address, then the local resolver checks whether the information is present in its DNS cache. If it is present (and not expired), then the value from the local resolver's DNS cache is returned.
    4. If the local resolver does not find the entry in its cache either, then it looks up the configured DNS servers that it can contact. It then sends a request to the DNS server DNS1 for resolving the domain "www.computergenome.com".
    5. If the DNS server DNS1 contains the domain www.computergenome.com in its zone file or in its cache, then it returns the address of the domain to the client; otherwise it queries the root server with the domain "www.computergenome.com".
    6. The root server does not know the address for "www.computergenome.com", so it returns the addresses of the authoritative servers for the .com domain.
    7. The server DNS1 contacts the authoritative server for the .com domain with the query "www.computergenome.com".
    8. The authoritative server for the .com domain does not know about the domain www.computergenome.com, but it contains an entry for computergenome.com, so it returns the authoritative server for computergenome.com.
    9. The server DNS1 sends the query for resolving www.computergenome.com to the authoritative server for computergenome.com.
    10. The authoritative server for computergenome.com contains the entry for the domain www.computergenome.com and returns the address of the domain to DNS1.
    11. The server DNS1 returns the address of the domain "www.computergenome.com" to the client.
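
In the recursive case the same chain of queries happens, but inside DNS1, so the client makes only one request. The sketch below is a toy model under stated assumptions: the server labels, the REFERRALS and ANSWERS tables and the class Dns1 are invented for illustration, and the cache ignores TTL for brevity.

```python
# Hypothetical upstream servers, identified by label instead of IP address.
REFERRALS = {
    "root": {"com": "com-server"},
    "com-server": {"computergenome.com": "computergenome-server"},
}
ANSWERS = {
    "computergenome-server": {"www.computergenome.com": "192.0.2.10"},
}


def ask(server, name):
    """One upstream request: ("answer", ip) or ("referral", next_server)."""
    if name in ANSWERS.get(server, {}):
        return "answer", ANSWERS[server][name]
    for suffix, next_server in REFERRALS.get(server, {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} knows nothing about {name}")


class Dns1:
    """A server with recursion enabled: it chases referrals for its clients."""

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name, max_hops=10):
        if name in self.cache:
            return self.cache[name]
        server = "root"
        for _ in range(max_hops):
            self.upstream_queries += 1
            kind, value = ask(server, name)
            if kind == "answer":
                self.cache[name] = value  # remember it for the next client
                return value
            server = value
        raise LookupError("too many referrals")


dns1 = Dns1()
first_answer = dns1.resolve("www.computergenome.com")   # three upstream queries
second_answer = dns1.resolve("www.computergenome.com")  # served from the cache
```

The client calls resolve once and gets the final address; the counter upstream_queries shows that the work has moved to the server and that a repeated request costs no further queries.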