Skip to main content

DNS woes (#1194)

Topics/tags: Technology

Two days ago, my laptop started acting up. By my laptop, I mean the College-owned loaner MacBook Pro I’ve been using while my regular College-owned MacBook Pro is in the shop [1]. And by acting up, I mean that it seemed reluctant to connect to many Web sites. I couldn’t go to https://math.berkeley.edu [2]. More importantly, I couldn’t open https://remote.cs.grinnell.edu, and I couldn’t ssh to ssh.cs.grinnell.edu. I need access to the MathLAN [4] to do much of my work.

It looked like a DNS issue [5]: When I tried to ping sites that I could not open in my Web browser, I got errors like the following.

ping: cannot resolve remote.cs.grinnell.edu: Unknown host

I also tried dig-ing those sites and got no info.

I tried the usual slew of possible fixes for situations like this.

I turned off my Wi-Fi and turned it back on again.

I reset the DNS cache [6] on my machine [7] with the pair of commands appropriate for macOS Monterey.

sudo dscacheutil -flushcache
sudo killall -HUP mDNSResponder

I rebooted the machine. I hate rebooting encrypted laptops; I’ve gotten used to the quick reboot from an SSD, and, now that the machine has to decrypt, it feels like a decade ago when we needed to wait for the spinning platters. I also have to restart the damn DisplayLink manager each time I reboot [8]. And I hate typing my 29-character password [9].

The fixes worked for a bit, or a few sites, but not universally. After I’d spent a bit of time on the computer, I’d find that it wouldn’t reach sites that I hadn’t tried recently.

What do normal people [10] do when they reach this point? Or even earlier points, such as when they find that their Web browser won’t connect to expected sites?

My next steps were to diagnose the issue a bit more.

Could I connect to the sites from my iPad, which is on the same network? Yes.

Were the iPad and the MacBook using the same DNS server? Yes. It was even the expected one: 192.168.0.1.

If I opened a page on the iPad and then used the fancy mechanism for transferring to the MacBook [11], would I then see the page? Nope. It appears to transfer the URL, but not the underlying DNS info.

Did things work if I used the IP address (obtained from the iPad) rather than the domain name? Usually.

My conclusion? There was something wrong with how my MacBook was processing internal DNS requests. Internal DNS processing is s beyond my knowledge. But I looked a bit more under the Network Control Panel [12] to see what I could figure out.

I discovered that my MacBook was running a variety of extra network services, including two copies of Cisco AnyConnect Socket Filter and one Cortex XDR Network Filter. There was also a third Cisco AnyConnect Socket Filter that was not running. Upon further investigation, I found that one Cisco AnyConnect Socket Filter was a content filter, and the second was a DNS proxy.

And I was reminded that ITS [14] puts lots of fun software on our machines [15]. Perhaps the problem was with some of that software. That seemed like a direction to pursue.

So I called the ITS help desk. The delightful help desk attendant, who we will call (TSD) [16], because that’s how they show up in Teams Chat, did their best to help me. They mentioned that others were having similar problems and hypothesized that it was a server-side issue. But others should be using a different router—and therefore a different DNS—than I am.

Since it was beyond their knowledge base, (TSD) decided that the best approach to put in a ticket and route it to Networking. That’s better than what I can do; while I can put in tickets, I can’t route them immediately [17].

A few minutes later, (TSD) messaged me on Microsoft Teams to ask for the asset tag on my machine. And a few minutes after that, DNS started working fine on the MacBook.

So I thanked (TSD) on Teams. And they said I didn’t do anything. That’s not quite true. They put in a ticket. But Networking hadn’t seen that ticket. At least not yet.

And I was reminded of what I tell my students: Computers are sentient and malicious. More precisely,

Computers are sentient and malicious. They don’t like us. But they show it in subtle ways. An obvious one is that they tend to crash at the worst possible time. Another one is that they refuse to do something until we ask for help from an authority figure, and then work perfectly when they do exactly what we did.

You’ll find that happens regularly in this class. You’ll try something that should work. It won’t work. You’ll try again. It still won’t work. You’ll ask me or a class mentor for help. We’ll do the same thing. And it will work fine.

So remember: Computers are sentient and malicious. When things go wrong, it’s probably not your fault.

At some point, I usually follow up with some more accurate statements. For example,

If you find the user interface confusing, it’s not your fault. A human designed it. Unfortunately, too many interface designers design for themselves, rather than for others.

or

Modern computers have so many interlocking parts that it’s nearly impossible to check every situation. Unchecked situations sometimes lead to problems. That’s one of the reasons I ask you to think so much about testing. Good testing reduces the space of unchecked situations.

I also tell my students that asking for help often requires you to explain what’s going on in some detail, and that when you explain things in detail, you figure out what’s wrong. That wasn’t the situation here.

In any case, this was one of those times when the simple act of asking for help got the job done, even though I learned nothing by explaining my problem, and the people helping did nothing. As I’ve said, computers are sentient and malicious.

I just wish I had determined a way to resolve this problem the next time it crops up. Oh well; at least I have a working DNS.


Followup

I drafted this musing immediately after all of the DNS issues seemed to be solved. I thought things were good. Then I discovered that the DNS was still not fixed. For example, I could not get an IP address for cs61a.org [18].

Here’s what a typical dig failure looks like.

$ dig cs161a.org

; <<>> DiG 9.10.6 <<>> cs161a.org
;; global options: +cmd
;; connection timed out; no servers could be reached

Here’s what the response should look like.

$ dig cs161a.org

; <<>> DiG 9.10.6 <<>> cs161a.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 291
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: d809ab2e2bedc84cabeed85962b1edabed3b62d4ed1078af (good)
;; QUESTION SECTION:
;cs161a.org.            IN  A

;; AUTHORITY SECTION:
org.            3600    IN  SOA a0.org.afilias-nst.info. hostmaster.donuts.email. 1655827429 7200 900 1209600 3600

;; Query time: 34 msec
;; SERVER: 132.161.10.60#53(132.161.10.60)
;; WHEN: Tue Jun 21 11:11:23 CDT 2022
;; MSG SIZE  rcvd: 149

In the followup conversations with ITS, I also learned about one possible issue. Cloudflare, a major DNS server, had some problems the prior evening. But that shouldn’t explain why things worked okay on my iPad and not my MacBook.

It seems like clearing the DNS cache and rebooting once again helped. We’ll see how long that lasts.

Not long enough.

Have I mentioned that I hate computers?


Postscript: Are things working now? It’s hard to tell. I tend to identify that the problem still exists only when things go wrong. It does seem that when I switch routers, the problem recurs.

Oh well. Maybe it will resolve on its own.


Followup: While finishing the musing, I tried a Web search on DNS issues. And I found an interesting discussion that suggests that Cisco AnyConnect Socket Filter may be at fault. Searching for Monterey and AnyConnect reveals that others are also having similar problems.

Perhaps different issues were raised by the combination of AnyConnect, Monterey, and the Cloudflare issues.

In any case, if the problems recur, I now have a solution: Nuke Cisco [19].


[1] That’s a musing for another day.

[2] You’ll learn why in another musing [3].

[3] Perhaps even tomorrow.

[4] The Linux network for Grinnell’s CS department.

[5] DNS, or Domain Name Service, is the service that converts a domain name, such as remote.cs.grinnell.edu, to its IP address, such as 132.161.157.185. To do this resolution, your computer first checks its internal knowledge and then asks a nearby DNS server. That server, in turn, checks its stored knowledge, failing that, asks another DNS server.

[6] The DNS Cache is where the computer stores what it learns about the relationships between names and IP addresses. Sometimes that information gets mangled. Or sometimes domains change their IP addresses. Clearing the cache sometimes helps with DNS issues.

[7] Well, the College’s computer.

[8] After writing that, I decided to add it to my list of startup applications.

[9] Hmmm … Was I told to have a thirty-character password? ’Eh. Twenty-nine is close enough.

[10] Grammarly suggests that I should use ordinary people. But I’ve used normal people intentionally. It suggests that I’m abnormal.

[11] Command-Tab on the MacBook.

[12] Do we still call them Control Panels? It’s one of the System Preferences.

[14] Information Technology Services.

[15] Fortunately, I am still free from the strangely named BeyondTrust.

[16] Parens and all.

[17] I was going to say that I can’t increase their priority, but then I looked back at our ticketing system and it appears that I can. I just try to leave most of my tickets at low priority.

[18] See endnote 2.

[19] Figuratively, not literally. In particular, I should remove the Cisco AnyConnect Socket Filter.


Version 1.0 of 2022-06-22.