Introduction

My colleagues and I run a popular website and are constantly concerned with scaling and performance. Until last week, we had been running our own DNS servers (BIND) on Amazon EC2 instances.
We use Pingdom to monitor many functions of our servers, including DNS. What we saw was a reasonable average resolution time of about 130ms, but frequent outliers higher than 500ms! The thought of a half-second penalty to the load time for first-time visitors is not appealing. So we started to dig into the problem.
Background
DNS, or Domain Name System, is one of the core pieces of infrastructure on the Internet. It's essentially a distributed database that allows us to convert human-readable hostnames to IP addresses quickly. For instance, if you type 'sat.learnhub.com' into your web browser, it will contact your DNS server, which will return "75.101.155.247". Then your web browser will make an HTTP request to that IP address and actually retrieve the webpage. It's all transparent to you, the end user, apart from the time it takes.
If the DNS server your browser contacted does not have the IP address cached, then it will attempt to locate and contact an "authoritative name server". Most sites have 2 or more.
Site operators can either run these authoritative name servers themselves, or outsource them to another company. Often these services are provided for free by domain name registrars. The most popular software for running your own DNS servers is BIND.
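To get a rough feel for the lookup time described above, here is a minimal Python sketch that times a single resolution through whatever resolver your machine is configured to use. The function name `time_lookup` and its injectable `resolver` parameter are illustrative choices, not part of any monitoring tool. Keep in mind that a repeat lookup may be answered from a cache and look much faster than the first one.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.gethostbyname):
    """Resolve hostname and return (ip_address, elapsed_milliseconds)."""
    start = time.perf_counter()
    ip_address = resolver(hostname)  # blocks until the resolver answers
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ip_address, elapsed_ms


# Example (needs network access):
#   ip, ms = time_lookup("sat.learnhub.com")
#   print(f"{ip} in {ms:.0f}ms")
```

A single measurement like this says little on its own; the variation across many samples is what matters.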
Hosted DNS Providers
Popular websites get hundreds of DNS queries per second (QPS), often from all over the world. To maintain consistently fast DNS resolution times, say under 100ms, you need a geographically distributed network of DNS servers. There are many companies that provide hosted DNS services. After some research, we narrowed our search to these four.
| Service | Notes |
| DNSMadeEasy | From Tiggee. Very low cost ($30/year). The others are significantly more expensive. |
| Dynect | From Dyn Inc. A younger company, better known for its consumer service (DynDNS) than for enterprise DNS solutions. |
| Netriplex | Very friendly salesperson and the longest list of global DNS servers. |
| UltraDNS | From NeuStar, which is a publicly traded company. They seem to be the largest and most established in this market. |
I exchanged phone calls and emails with representatives from all these companies to understand their pricing and other details. All but one charge by a fixed number of queries (lookups) per month. Dynect charges by a more complicated measure, 95th percentile QPS (Queries Per Second). The benefit of Dynect's model is that it is forgiving of short bursts of traffic, since the busiest 5% of measurements fall outside the 95th percentile.
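To illustrate why a 95th percentile measure forgives bursts, here is a small Python sketch using the nearest-rank percentile method. This is an illustrative assumption, not Dynect's documented billing formula, and the QPS samples are made up for the example.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical QPS readings: steady traffic plus a short burst.
qps_samples = [10] * 95 + [500] * 5

billable_qps = percentile_nearest_rank(qps_samples, 95)  # the burst is ignored
peak_qps = max(qps_samples)  # what a peak-based measure would see
```

With these made-up numbers, the burst to 500 QPS occupies only the top 5% of readings, so the 95th percentile stays at the steady level.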
Pingdom Website Monitoring

Results
Pingdom works by setting up "Checks". They can be of different kinds… so far I've only used HTTP and now DNS. Normally you would set up Checks for your own sites and services, but there is nothing stopping you from setting them up for others. So that's just what I did.
I set up 4 new checks, one for each of the hosted DNS providers I'd shortlisted. I set the checks to the shortest interval, 1 minute. I picked sample sites that I knew used each one, and picked the first of each provider's DNS servers.
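How Pingdom implements its DNS checks internally is not something this lesson covers, so the following is only a sketch of what a check along these lines might do: send one A-record query straight to a named DNS server over UDP and time the reply. It uses only the Python standard library. `build_query` and `time_query` are hypothetical helper names, and the code does no retries, no TCP fallback, and no parsing of the answer.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (0 = standard query, no recursion), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def time_query(nameserver, hostname, timeout=2.0):
    """Return the round-trip time in milliseconds for one query to nameserver."""
    # Resolve the name server's own address first so it is not part of the timing.
    server_ip = socket.gethostbyname(nameserver)
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(packet, (server_ip, 53))
        reply, _ = sock.recvfrom(512)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
    if reply[:2] != packet[:2]:
        raise ValueError("reply ID does not match query ID")
    return elapsed_ms


# Example (needs network access):
#   time_query("ns1.p19.dynect.net", "shopify.com")
```

Running something like this once a minute from a single location would only approximate a hosted monitoring service, which typically probes from several locations.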
The first observation I had was that DNSMadeEasy was clearly not as good as our current setup. It was averaging over 150ms (compared to our current 130ms) and had many outliers even higher than what we were already experiencing. After a day of watching that I removed the check and gave up on them. I no longer have the data, but I expect you'd find much the same if you tested it again yourself with Pingdom.
Here's the data for the remaining three:
| DNS Server | Test Site | Average Response Time | Standard Deviation |
| ns1.p19.dynect.net | Shopify.com | 42ms | 60ms |
| ns1.netriplex.com | LuiLui.com | 78ms | 109ms |
| udns1.ultradns.net | Digg.com | 80ms | 448ms |
It looked like UltraDNS, in addition to having the highest average, also had the highest standard deviation by a considerable margin. This was due to many extremely slow samples. Both Dynect and Netriplex were much more consistent.
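The effect of a few slow samples on the standard deviation is easy to reproduce. The sketch below uses made-up response times (not the Pingdom data from this test) and Python's `statistics` module; `summarize` is a hypothetical helper name.

```python
import statistics


def summarize(samples_ms):
    """Return (mean, population standard deviation) in milliseconds."""
    return statistics.mean(samples_ms), statistics.pstdev(samples_ms)


# Hypothetical samples: one server is steady, the other is usually just as
# fast but has two very slow responses out of 100.
steady = [40] * 100
spiky = [40] * 98 + [2000] * 2

steady_mean, steady_sd = summarize(steady)
spiky_mean, spiky_sd = summarize(spiky)
```

In this made-up case, two slow responses out of 100 roughly double the mean and push the standard deviation to several times the typical response time, which is why a large standard deviation next to a moderate average points to outliers.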
Conclusion
In my 3-day test, Dynect demolished the others. It was more consistent and had a much lower average response time.
I also happen to like Dynect's pricing model and user interface better, so the decision to go with them was pretty easy.
We did the cutover last week, and it's all gone very smoothly. Here is our response time graph since the start of the year:

You can see that our average response time dropped significantly this last week. We are averaging around 45ms now, and we no longer expect spots of poor performance. (You can see two points in the graph where the average response time is around 350ms.)
The point of this lesson, however, is not to convince you to use Dynect or any other service. Instead, I'm trying to show how to use Pingdom's website monitoring service creatively to help you make more empirical decisions about hosted DNS providers.
