Sam de Freyssinet

An archive of thoughts and discoveries in the world of software development

DNS Woes

Just a quick note to say that I have been experiencing some DNS woes with this domain. As a result, many of you have probably been seeing a far from beautiful 404 Not Found served by my former hosting company. Initially I thought this was an issue with my Heroku web dynos. However, after some further DNS investigation, I discovered that I had what can only be described as DNS EpicFAIL!

When this site was on the previous host it was hosting its own DNS server using Bind. Because there was only one piece of bare metal, there needed to be a backup DNS server hosted elsewhere. At the time I decided to use a free service offered by a company based out of California. I will not advertise them here, not through lack of service but to ensure their offering is not abused by the masses. I set up the secondary DNS host to be a slave to the primary Bind DNS server hosted on my bare metal. This all worked beautifully for several years.

Fast forward to 2012 and the decision is made to move to another host. Heroku is a wonderful service, but they don’t do DNS within their core offering. There are addons that can provide DNS services, but I wanted to have more control than those services offered. In the end I opted to move to Zonomi as they offered the service required at a very reasonable price.

After exporting the original Bind zone configuration and importing it correctly into the new DNS server, I then updated the secondary slave DNS servers to pull from the new primary servers hosted by Zonomi. This is where the trouble started. I did not hang around to check whether the secondary DNS host was able to pull the zone information from Zonomi using the AXFR protocol. As it turns out, Zonomi disallow this presently. I’m not sure why, but will certainly investigate. As the new DNS zone information was never updated at my secondary DNS host, the original data remained, pointing at my old servers.
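One way to catch a secondary that has silently stopped syncing is to compare the SOA serial that each nameserver reports for the zone. The sketch below is a minimal illustration in Python rather than the check that was used at the time. It assumes one line of `dig +short SOA <zone> @<server>` output has already been collected per nameserver. The server names and serial values are made up, and the helper names `soa_serial` and `out_of_sync` are hypothetical.

```python
def soa_serial(dig_short_soa: str) -> int:
    """Extract the serial from one line of `dig +short SOA` output.

    The seven fields are: mname rname serial refresh retry expire minimum.
    """
    fields = dig_short_soa.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(fields)}")
    return int(fields[2])


def out_of_sync(primary: str, soa_by_server: dict) -> list:
    """Return the servers whose serial differs from the primary's."""
    serials = {server: soa_serial(line) for server, line in soa_by_server.items()}
    expected = serials[primary]
    return sorted(s for s, serial in serials.items() if serial != expected)


# Made-up sample data: the secondary still holds an older copy of the zone.
SAMPLE = {
    "primary.example": "ns.primary.example. hostmaster.primary.example. 2012032601 10800 3600 604800 3600",
    "secondary.example": "ns.primary.example. hostmaster.primary.example. 2011110401 10800 3600 604800 3600",
}

print(out_of_sync("primary.example", SAMPLE))
```

Any server listed in the result is serving a different version of the zone from the primary, which is the situation a failed zone transfer would leave behind.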

Over the last few months I have noticed that this site has been unavailable intermittently: some of the time, all of the time or not at all. Initially I thought this was due to Heroku spinning down the web dynos on the free account I started with. However, after spinning up a second dyno and still experiencing the issue, it became obvious the issue lay elsewhere. I noticed that Pingdom would report this site as being down even when I could navigate to it without issue. Currently I do not serve Cache-Control headers (coming soon), so it was not a stale cache persisting in a proxy. It was time to dig a little into this issue.

Dig Report at Home
dig def.reyssi.net

; <<>> DiG 9.7.3-P3 <<>> def.reyssi.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51223
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;def.reyssi.net.          IN  A

;; ANSWER SECTION:
def.reyssi.net.       599169  IN  A   46.105.102.142

;; Query time: 4 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Mon Mar 26 22:36:25 2012
;; MSG SIZE  rcvd: 48
Dig Report at Work
dig def.reyssi.net   

; <<>> DiG 9.7.3-P3 <<>> def.reyssi.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 65009
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;def.reyssi.net.          IN  A

;; ANSWER SECTION:
def.reyssi.net.       1851    IN  A   75.101.163.44
def.reyssi.net.       1851    IN  A   174.129.212.2
def.reyssi.net.       1851    IN  A   75.101.145.87

;; AUTHORITY SECTION:
reyssi.net.       171051  IN  NS  ns1.zonomi.com.
reyssi.net.       171051  IN  NS  ns2.zonomi.com.
reyssi.net.       171051  IN  NS  ns3.zonomi.com.
reyssi.net.       171051  IN  NS  ns4.zonomi.com.

;; Query time: 102 msec
;; SERVER: 193.26.222.2#53(193.26.222.2)
;; WHEN: Mon Mar 26 22:39:12 2012
;; MSG SIZE  rcvd: 162

As is very clear, the two answers are quite different. The DNS servers that my work uses have the correct zone information for this domain. However, my home ISP seems to be using the record hosted by the secondary DNS server. Because the old secondary server is failing to update correctly, its zone information is out of date and still points at my old servers. It appears that different DNS resolvers were choosing between the zone records hosted at Zonomi and those at the secondary host. To fix the issue I have retired the secondary DNS service, as Zonomi offer four separate nameservers.
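The difference between the two answers can also be checked mechanically. The following is a small sketch in Python that parses the two ANSWER sections copied from the dig reports above and compares the addresses. The function name `parse_a_records` is purely illustrative.

```python
# ANSWER sections copied from the two dig reports above.
HOME_ANSWER = """\
def.reyssi.net.       599169  IN  A   46.105.102.142
"""

WORK_ANSWER = """\
def.reyssi.net.       1851    IN  A   75.101.163.44
def.reyssi.net.       1851    IN  A   174.129.212.2
def.reyssi.net.       1851    IN  A   75.101.145.87
"""


def parse_a_records(answer_section: str) -> dict:
    """Map each A-record address to its TTL in seconds."""
    records = {}
    for line in answer_section.splitlines():
        fields = line.split()  # name, ttl, class, type, address
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "A":
            records[fields[4]] = int(fields[1])
    return records


home = parse_a_records(HOME_ANSWER)
work = parse_a_records(WORK_ANSWER)

# Addresses that both resolvers agree on (none, in this case).
shared = set(home) & set(work)
print("shared addresses:", sorted(shared))

# The TTL on a cached answer is how long the resolver may keep serving it.
print("home TTL in days:", round(max(home.values()) / 86400, 1))
```

The two resolvers share no addresses at all. The TTL of 599169 seconds on the home answer works out to roughly 6.9 days, so in principle that resolver could keep serving the old address until its cached entry expires.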

Hopefully all these DNS woes will be behind me now. It will take around 72 hours for all the DNS changes to propagate across the network, so stability is a little way off. But hopefully everyone should be able to read this by the weekend.

Kill All The Servers image courtesy of DIY LOL – inspired by Ian Barber’s Skype avatar