Apple Network Failure Destroys an Afternoon of Worldwide Mac Productivity

Originally published at: Apple Network Failure Destroys an Afternoon of Worldwide Mac Productivity - TidBITS

If you had problems launching apps on your Mac—or if it was just behaving weirdly—around 4 PM Eastern Standard Time on 12 November 2020, here’s why.

Well, I think in addition to the massive PITA that this caused when everything started spinning at 3:25 PM EST…my RAID actually sustained damage to one of its four drives.

I purchased a Thunderbay Mini RAID from OWC in April to handle my suddenly substantial video storage needs. It is configured as RAID-5, so the data is striped across all four drives with parity that lets the array survive the loss of any one drive, at the cost of 2 TB of space but with the benefit of hot-swappable drives.
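For anyone unfamiliar with how RAID-5 survives a single drive failure, here is a minimal sketch of the parity idea in Python. It is only an illustration under simplifying assumptions (one stripe, fixed parity position, made-up block contents) and is not how SoftRAID or any real controller is implemented; `xor_blocks` is a hypothetical helper name.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks plus one parity block, as on a four-drive array.
# (Real RAID-5 rotates the parity position from stripe to stripe.)
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive and rebuilding it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because one block per stripe holds parity, a set of four 2 TB drives gives up 2 TB of capacity, which matches the figure above.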

The four 2 TB Toshiba drives, preconfigured and sold by OWC with the enclosure, each have about 5,100 hours of use on them. When my iMac started acting like molasses, I took almost the same actions that Adam and Josh did. I also took the RAID offline, along with another backup drive that is still running Time Machine.

When I brought it back up at 4:34 p.m. after everything started working again, the SoftRAID utility began flashing warnings that one of the drives failed a SMART test and is “20 to 60 times more likely to fail in the next 2 to 6 months”. According to the expanded info window, there are now “16 unreliable sectors” on the drive.

So this cost me time in the middle of getting ready for a 65-attendee Zoom meeting for which I was the technician. And it will cost somebody money (maybe me, maybe OWC, maybe Toshiba) to fix the drive. I have the logs that pinpoint normal operation right before the server outage, and the disk issues when it came back up.

Well, it affected Safari and Mail on my MBP16 running Catalina. They would eventually start but it took several minutes and I didn’t help by canceling and rebooting in an attempt to fix the problem. I kind of suspected it was Apple (I checked system status) but I didn’t see any problems with Safari or Mail. Everything was back in about 20 minutes.

Yes some official acknowledgment from Apple would be nice!

David

This was just totally unacceptable. What is Apple doing? I’ve seen so many blunders over these past 6 months. At first I thought it was a malware attack of some sort. No problems with Safari or Apple Mail, but Firefox was bouncing in the Dock like a rubber ball. Numerous restarts. Two hours of no productivity, all because of “trustd” (OCSP), apparently. Our Macs have to ‘turn off’ (for lack of a better description) because we are that wired to Apple?

8am Friday 13 November 2020 in Sydney Australia. a.k.a. International Verify Your Backups Day - TidBITS

Aha! So, in other words, Apple DID release Big Sur on Friday the 13th, at least in Australia.

And look what it got them. :japanese_goblin:

I am so happy I had an early night! Live in Germany, so all this HooHa seems to have gone down while I slept; by the time I got up it was fixed & didn’t even know it happened until right now! :grinning:

I honestly believe this was a rare error on the part of Apple’s network operations staff, such that we’re extremely unlikely to ever suffer from it again.

Unfortunately this isn’t the first time — it’s been happening sporadically for years. For example:

Hm. iPadOS too, I think. Same day, around 9:30 PM, my iPad spontaneously crashed. When it came back up, I couldn’t log into my Wi-Fi router, Twitter, Slack, Messages, or FaceTime. I thought my iCloud account had been hacked or something. I found a recent note at Apple Support about “error activating Message or Facetime” which seemed close to the issue I was having. (No explanation for Twitter or Slack; I don’t use Apple iCloud logins for either of those.) It took about 2 hours of fiddling and toggling Messages and FaceTime, and re-logging into Slack and Twitter, and eventually I got everything back to normal. I don’t believe in coincidences, and am pretty certain my issue is related to the Mac issues. I think Apple either had a serious failure as described in the article, or suffered some kind of attack.

Thanks so much for this story!

I wasted a ton of time on this and also suspected drive failures.

I got messages suggesting I wasn’t connected to the internet.

Software update wouldn’t work, suggesting I might be managed by an MDM, which made me think I was hacked.

Yes they absolutely should fail more promptly and gracefully. Poor exception handling in their software. Classic lazy developer code, never planning for the outage condition. I constantly remind my guys to code for this.
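As a rough illustration of what "coding for the outage condition" can look like, here is a minimal Python sketch that puts a hard deadline on a remote check and falls back to a default answer instead of hanging. The names `run_with_deadline` and `slow_server_check` are hypothetical, and this is not a description of how macOS or `trustd` actually works.

```python
import concurrent.futures
import time

def run_with_deadline(check, timeout_s, fallback):
    """Run check() but stop waiting after timeout_s seconds and return fallback."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The server is too slow: answer promptly rather than block the caller.
        return fallback
    except Exception:
        # The check itself failed (for example, a connection error).
        return fallback
    finally:
        # Do not wait for the worker; it finishes on its own in the background.
        pool.shutdown(wait=False)

def slow_server_check():
    """Stand-in for a request to an overloaded server (hypothetical)."""
    time.sleep(0.3)
    return "verified"

result = run_with_deadline(slow_server_check, timeout_s=0.05, fallback="unknown")
print(result)
```

The design choice that matters is that the caller decides how long it is willing to wait, rather than inheriting whatever delay the server imposes.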

I don’t understand how an application failing to launch on your Mac could cause damage to your external raid drives.

I don’t love this source, and the commentary feels like some “both-sidesism” covering for not fully understanding the issues, but it does a good job collecting a lot of links to primary and secondary sources covering this outage and the privacy angle of Gatekeeper. I would add this one too, though.

I’ve updated the article to include Apple’s response:

Privacy protections

macOS has been designed to keep users and their data safe while respecting their privacy.

Gatekeeper performs online checks to verify if an app contains known malware and whether the developer’s signing certificate is revoked. We have never combined data from these checks with information about Apple users or their devices. We do not use data from these checks to learn what individual users are launching or running on their devices.

Notarization checks if the app contains known malware using an encrypted connection that is resilient to server failures.

These security checks have never included the user’s Apple ID or the identity of their device. To further protect privacy, we have stopped logging IP addresses associated with Developer ID certificate checks, and we will ensure that any collected IP addresses are removed from logs.

In addition, over the next year we will introduce several changes to our security checks:

  • A new encrypted protocol for Developer ID certificate revocation checks
  • Strong protections against server failure
  • A new preference for users to opt out of these security protections

Careful of what we ask for. Redirecting OCSP means that if someone does sign malicious code, it gets detected, and Apple does the right thing by revoking their certificate, you will not know about it and will happily execute that signed code. OCSP is “slow” because it adds a network round-trip and is vulnerable to a DoS attack on the OCSP server. Note that from OCSP’s perspective, the known bug is that an OCSP request will time out and allow a revoked certificate to look good, not that good code will not run.

I would say it’s OK to redirect the OCSP server as outlined in the article as a temporary hack to route around a failed server. However, put it back the moment the server is back.
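The trade-off described here is often called soft-fail versus hard-fail. A minimal Python sketch of the decision follows, with hypothetical names (`OcspStatus`, `may_launch`) and no claim that this matches Apple's actual logic. It shows why an unreachable or redirected responder lets a revoked certificate pass under soft-fail.

```python
from enum import Enum

class OcspStatus(Enum):
    GOOD = "good"                # responder says the certificate is valid
    REVOKED = "revoked"          # responder says the certificate was revoked
    UNREACHABLE = "unreachable"  # timeout, outage, or a redirected hostname

def may_launch(status, hard_fail=False):
    """Decide whether signed code may run, given the revocation check result."""
    if status is OcspStatus.GOOD:
        return True
    if status is OcspStatus.REVOKED:
        return False
    # No answer: soft-fail allows the launch, hard-fail blocks it.
    return not hard_fail

# With the responder redirected, every check looks UNREACHABLE, so under
# soft-fail even code signed with a revoked certificate would be allowed.
print(may_launch(OcspStatus.UNREACHABLE))
print(may_launch(OcspStatus.UNREACHABLE, hard_fail=True))
```

Soft-fail keeps good apps launching during an outage at the cost of missing revocations; hard-fail does the reverse.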

I’m sorry, I must be missing something. Did this problem only affect Apple developers, or did it affect regular old Mac users? AFAIK my home Mac (running 10.13.6) does not communicate with any Apple servers for anything unless I go to the website or launch the App Store. I don’t need to be connected for any cloud services and purposely stay signed out of anything I’m not using. I’m pretty sure I can run any local app I want regardless of my internet connection (and that’s how I like it).

Was this article about me, or only about certain kinds of users, or about cloud services?

– Eric

So, Safari on my iMac stopped working and is still not even launching after several days. I’ve tried unplugging the internet and rebooting my computer, and I have no idea how to proceed. I use my Mac in my business daily, and although I am using a different browser to get by, a lot of information is in Safari that makes my work easier and faster. I know that you all say that it’s working again, but not on my computer. I am “dead in the water”. Suggestions welcome.
~ Lance

It applied to all users of macOS Mojave and later who needed to launch an application during the six-hour period when the servers were overloaded.

Then you are quite misinformed. macOS has needed to contact Apple servers for time of day as long as I can remember. Depending on whether you have disabled some settings, it will check for regular and background-critical software updates periodically during the day. Game Center contacts Apple servers periodically, even if you never play any of its games. iTunes uses Apple servers for multiple purposes. I would have to guess your Mac contacts Apple dozens of times an hour or more for a variety of other reasons.

The article explains how it contacts an Apple server the first time you launch an application signed with an Apple Developer ID, which means all Apple apps and all App Store apps, along with most other third-party apps these days. It will repeat doing so every 12 hours now (it was five minutes).
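To make the "every 12 hours (was five minutes)" change concrete, here is a minimal Python sketch of a check cache with a time-to-live. The class name `CheckCache`, its methods, and the sample developer ID are all hypothetical; this only illustrates the general idea of remembering a recent check and is not Apple's implementation.

```python
import time

class CheckCache:
    """Remember when each developer ID was last checked (hypothetical sketch)."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock          # injectable so the example can fake time
        self._checked_at = {}

    def needs_check(self, developer_id):
        """True if this ID was never checked or the last check has expired."""
        last = self._checked_at.get(developer_id)
        return last is None or self.clock() - last >= self.ttl_s

    def record_check(self, developer_id):
        """Note that a check for this ID just completed."""
        self._checked_at[developer_id] = self.clock()

# Fake clock so the example does not have to wait 12 hours.
now = [0.0]
cache = CheckCache(ttl_s=12 * 3600, clock=lambda: now[0])

first_launch = cache.needs_check("EXAMPLE-DEV-ID")   # never checked before
cache.record_check("EXAMPLE-DEV-ID")
now[0] = 5 * 60                                       # five minutes later
after_five_minutes = cache.needs_check("EXAMPLE-DEV-ID")
now[0] = 12 * 3600                                    # twelve hours later
after_twelve_hours = cache.needs_check("EXAMPLE-DEV-ID")
```

A longer time-to-live means fewer requests to the server and less exposure to an outage, but a revoked certificate can go unnoticed for longer.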

If you are able to launch all other apps, then your issue with Safari is completely unrelated to this issue. Contacting AppleCare would be my recommendation.

Thank you!

Ahh, good point about NTP. I do have my computer set for the Apple time servers (though there are other options). But I really wonder about it contacting other Apple servers all the time. I’m a veteran macOS user but dislike the rest of the Apple “universe,” so I don’t use any of the standard OS apps (except Preview and sometimes iTunes). No Safari, Mail, Calendar, FaceTime, Siri, etc. Guess if I really want to know I can do some monitoring of my network activity. Not that it particularly matters. But I was surprised to see so much anxiety about that server outage!