Dive Brief:
- Salesforce’s May 10 database failure was related to a power failure in the company’s Washington, D.C. data center, the company reported on Monday.
- The outage left some clients unable to access Salesforce services for approximately 18 hours.
- Though Salesforce was able to restore the database from an earlier backup, a glitch of unknown origin caused file discrepancies, meaning five hours of data have been permanently lost.
Dive Insight:
A complex domino effect of failures ultimately resulted in the loss of five hours of customer data, according to Salesforce.
Salesforce’s official statement about the outage states that a circuit breaker responsible for controlling power into the Washington, D.C. data center failed. Multiple redundant power systems subsequently failed to engage, ultimately leading to power failures at the computer system level.
During the service disruption, customers on the affected instance, NA14, were unable to access Salesforce. Increased volume on NA14, built up by the outage, then exposed a firmware bug on the storage array, which “increased the time for the database to write to the array and led to a database failure,” the company said.
Salesforce said the cause of the initial power failure remains unknown, though the vendor has subsequently replaced the power circuits at the D.C. data center. Salesforce is also unsure why the series of failures ultimately led to file discrepancies in its database, though it assured customers that it is continuing to work with the vendor to determine the cause.