On July 4, Apple iCloud (Apple Music, Apple TV and the App Store) went down for three hours! “Oh no,” non-millennials might say, “how will I ever survive?” Case in point: Andrew Ross Sorkin, co-anchor of CNBC’s Squawk Box, discussed the problem the morning after. If it had been a business day, he would have lost his calendar, and that would have been catastrophic.
This was the third big cloud incident in just a few days, and it was caused by an outage at Verizon. Facebook, WhatsApp and Instagram had availability problems but, more significantly, critical enterprise providers like Google Cloud were affected. These now-ubiquitous services are a proxy for our almost universal dependence on both cloud-based services and large-scale technology providers.
Talking about Apple this morning, Andrew missed an even more critical incident. In late May, Salesforce had a major outage that lasted a full three days for the affected companies. Affected users in Canada indicate that only the Victoria Day holiday saved them from a major miss on their May sales numbers. Globally, Salesforce is the undisputed #1 Customer Relationship Management (CRM) tool. Salesforce customers are major multinational enterprise users. Manual and paper-based sales support systems are gone – dependence on Salesforce tools is, in most companies, mission-critical.
As technology leaders, we all assume the resiliency and operational integrity of our cloud-based services. When a provider has an incident, it is just as impactful as an in-house failure, and we have no ability to solve the problem ourselves. Incident preparedness, including decision and communications protocols, becomes even more critical. Incident preparedness and response need to include “what happens when your provider goes down?”
According to PwC, 84 per cent of directors say they have discussed incident response plans, but only 47 per cent report their company has created a written escalation policy or agreement.
We need to do better.