Dive Brief:
- United Airlines fully recovered from three days of operational disruptions caused by a faulty CrowdStrike software update, CEO Scott Kirby said in an open letter to employees and customers Monday.
- “Today, our operation is back to normal and for the last 24 hours our systems, tools and schedules have been stable,” Kirby said. “Our recovery was quick given the circumstances but not immediate.”
- Teams of technicians manually fixed and rebooted more than 26,000 computers and devices one at a time at United contact centers located in 365 airports around the world, according to Kirby. The Friday update impacted an estimated 8.5 million Windows devices globally, Microsoft said in a Saturday statement.
Dive Insight:
The CrowdStrike bug hit United’s systems hard, leading the airline to cancel 694 flights Friday. IT outages grounded an additional 713 United planes during the weekend, which Kirby characterized as one of the busiest travel times of the year.
As systems came back online, the company reduced flight cancellations to 69 on Monday and 47 by Tuesday afternoon, which respectively represented 2% and 1% of United’s scheduled flights on those days, according to tracking service FlightAware.
Not all carriers were so fortunate — or as successful in their remediation efforts.
Delta Air Lines, by far the hardest-hit domestic carrier, had more than 3,500 cancellations over the weekend. The company was still struggling to recover operations Tuesday afternoon, by which point FlightAware had tallied nearly 500 cancellations for the day.
“Because of the nature of the outage, the ability to respond depends heavily on available resources to do direct intervention with the endpoint affected,” Forrester Senior Analyst Brent Ellis told CIO Dive in an email. Staffing, security and device capabilities in critical areas created a perfect storm for the carriers experiencing the most pain, Ellis said.
On Monday, Delta CEO Ed Bastian blamed the airline’s crew reassignment software for the ongoing service disruptions. Southwest Airlines traced its December 2022 operational shutdown, which led to the cancellation of nearly 17,000 flights during the holiday travel period, to a similar software failure.
As United rebooted Windows systems, flight cancellations abated
The day before the CrowdStrike outage began disrupting global IT estates, Kirby, alongside CFO and EVP Michael Leskinen, had lauded United’s operations and technology teams for their work to reduce the recovery time and cost of previous operational disruptions.
“Over the year, our operations team has invested in technology and improved their processes to better recover from irregular operations,” Leskinen said Thursday morning, during the company's Q2 2024 earnings call.
“That team, the three of them combined with a lot of support from Jason Birnbaum, our chief information officer, are really identifying the places where there’s opportunity to pull permanent cost out,” Kirby added.
United saw total operating expenditures grow just 3% in Q2, despite 11% year-over-year increases in salaries and fuel costs – the company’s two largest expense categories.
As the airline neared completion of a massive cloud migration earlier this year, Leskinen touted the cloud’s long-term benefits, including efficiency gains that would drive cost savings over time. The immediate benefits were less certain.
“You don’t save the cost of moving to the cloud until you shut the mainframe down,” Leskinen said in April during the company’s Q1 earnings call.
Operational improvements triggered more immediate returns, as United posted net income of $1.3 billion in Q2, a 23% year-over-year gain. The company had a net loss of $124 million during the first three months of the year, largely due to the grounding of part of its Boeing fleet.
While the executives couldn’t have anticipated the CrowdStrike event, the IT outage made United’s investments in process and technology enhancements seem prescient.
“For industries that heavily rely on technology to support complex processes like crew tracking, bookings and re-bookings and scheduling, it’s important to understand how and when vendors update their software products and what that could mean for your operations,” Christina Powers, partner in West Monroe’s cybersecurity practice, said in an email.
“On the flip side, it’s crucial for software providers to have meticulous release processes, which include robust testing around functionality, compatibility and security,” Powers said.