Dive Brief:
- A mismatched software update in CrowdStrike's Falcon sensor led to the crash that caused a global IT outage of millions of Microsoft Windows systems on July 19, the company said Tuesday.
- CrowdStrike, in a root cause analysis report, said the Falcon sensor expected 20 input fields in a rapid response content update, but the software update actually provided 21 input fields. The mismatch resulted in an out-of-bounds memory read, leading to the system crash.
- “We are using lessons learned from this incident to better serve our customers,” CrowdStrike CEO George Kurtz said in a statement Tuesday. “To this end, we have already taken decisive steps to prevent this situation from repeating, and to help ensure that we – and you – become even more resilient.”
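The field-count mismatch described above can be illustrated with a minimal sketch. It is not CrowdStrike's code: the names `EXPECTED_FIELDS`, `read_field` and `evaluate_update` are hypothetical, and the only details taken from the report are the counts of 20 and 21. In C or C++, reading a 21st value from a 20-element array is an out-of-bounds memory read. Python would raise an `IndexError` instead, so the sketch shows only the kind of count check that rejects the update before any read happens.

```python
# Illustrative only: models the 20-versus-21 field mismatch, not the real sensor.
EXPECTED_FIELDS = 20  # assumption: number of input values the sensor side holds


def read_field(inputs, index):
    """Bounds-checked read: return None rather than reading past the end."""
    if 0 <= index < len(inputs):
        return inputs[index]
    return None


def evaluate_update(inputs, field_count):
    """Hypothetical validator: compare the field counts before any read."""
    if field_count > len(inputs):
        reason = f"update uses {field_count} fields, sensor holds {len(inputs)}"
        return ("rejected", reason)
    return ("ok", [read_field(inputs, i) for i in range(field_count)])


sensor_inputs = [f"value{i}" for i in range(EXPECTED_FIELDS)]

matching = evaluate_update(sensor_inputs, 20)    # counts agree, all fields read
mismatched = evaluate_update(sensor_inputs, 21)  # one field too many, rejected

print(matching[0])
print(mismatched[0], "-", mismatched[1])
```

Validating the count once, before the loop, keeps the failure a rejected update rather than a crash partway through processing.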
Dive Insight:
CrowdStrike said it has already begun to make significant process changes to make sure such a catastrophic update will not take place in the future.
Early estimates released Friday by reinsurance specialist Guy Carpenter show the outage of at least 8.5 million Windows devices could lead to insured losses of up to $1 billion. Prior research from Parametrix showed Fortune 500 companies, excluding Microsoft, could see a direct impact of $5.4 billion.
Despite the crash, CrowdStrike said the bug cannot be exploited by a hacker. While the same scenario is now "incapable of recurring," the incident informs process improvements and other mitigation steps to create a more resilient ecosystem, the company said.
The incident will likely lead CrowdStrike to enact significant changes in its design and development processes, and additional changes that could impact corporate governance.
“CrowdStrike must go beyond upgrading, testing and software development changes moving forward,” Allie Mellen, principal analyst at Forrester, said via email. “It is likely CrowdStrike is evaluating potential architectural changes to the sensor, future direction and overhaul for [quality assurance] and how to approach the market to rebuild trust.”
Forrester analysts warn CrowdStrike will have to slow down its innovation pipeline and will likely appoint a quality assurance czar reporting directly to Kurtz.
Federal officials said the CrowdStrike incident will make them redouble efforts to crack down on memory-unsafe code, a longstanding problem connected to recent security vulnerabilities.
Asked about whether the incident could impact its coding practices, CrowdStrike noted limitations to what it could change in order to remain compatible with specific operating systems.
“As Microsoft itself outlined on its security blog, security products in the Windows ecosystem, including the Falcon sensor, commonly leverage kernel drivers as core components of a robust security offering,” a CrowdStrike spokesperson said via email. “The kernel-mode parts of CrowdStrike Falcon’s sensor, like other software that runs in the kernel of the Windows operating system, must be written in C/C++ language, which does not allow for memory-safe coding.”
The security firm is dealing with the fallout from high-profile customers, though 99% of Windows sensors are back online.
CrowdStrike and Microsoft fired back, questioning Delta Air Lines' slow recovery, after the carrier claimed the outage cost it $500 million because of thousands of canceled flights and blamed the two technology providers.
The carrier last week hired famed attorney David Boies and said it would seek damages for the incident, but CrowdStrike and Microsoft pushed back, questioning why Delta had so many more problems than other airlines.