Dive Brief:
- About 194 years' worth of downtime was logged by Atlassian's Statuspage this year, according to the incident communication tool's 2018 report.
- Last year's downtime amounted to about 104 years, so the 2018 total so far represents an uptick of about 87%. Statuspage recorded about 190,000 incidents and scheduled maintenance events this year.
- The communication company believes the uptick in downtime reflects the "cloud-first mentality" companies are adopting in favor of software-as-a-service products. Companies are resolving issues with an average of 4.4 updates per incident.
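The 87% figure in the brief follows from the two downtime totals. As a quick check, here is a minimal sketch of the arithmetic; the function and variable names are illustrative and do not come from the Statuspage report.

```python
def percent_increase(previous: float, current: float) -> float:
    """Return the percentage change from `previous` to `current`."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100.0


# Approximate downtime totals, in years, as quoted in the brief.
downtime_last_year = 104
downtime_this_year = 194

uptick = percent_increase(downtime_last_year, downtime_this_year)
print(f"{uptick:.1f}%")  # 86.5%, which rounds to the reported 87%
```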
Dive Insight:
Companies want to be transparent, even with incidents that could hurt business, according to Atlassian's Statuspage.
Downtime leaves a bad taste in the mouths of consumers, as evidenced by the angry tweets of Black Friday shoppers.
Keeping a transparent line of communication between companies and customers mitigates confusion and frustration. But the longer an outage lasts, the less patience customers are willing to spare. This is where chaos engineering comes into play.
Anticipating failure is the first step in remediation. Even well-equipped companies, like Amazon, have faced technical issues. Overburdened APIs, slow third-party functions, sites heavy with graphics and servers unprepared for high traffic can all contribute to downtime.
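To make the idea of anticipating failure concrete, here is a minimal sketch of fault injection, one common chaos engineering technique. Everything in it is hypothetical: `FlakyDependency`, `fetch_price` and the fallback behavior are illustrative names and choices, not part of any Statuspage or Atlassian tooling.

```python
import random


class FlakyDependency:
    """Hypothetical wrapper that makes a fraction of calls fail on purpose."""

    def __init__(self, func, failure_rate, seed=0):
        self.func = func
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so a run can be repeated

    def __call__(self, *args, **kwargs):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected failure")
        return self.func(*args, **kwargs)


def fetch_price(item):
    # Stand-in for a third-party call that could be slow or unavailable.
    return {"item": item, "price": 10.0}


def fetch_price_with_fallback(dependency, item, retries=2):
    """Retry a few times, then return a degraded response instead of crashing."""
    for _ in range(retries + 1):
        try:
            return dependency(item)
        except ConnectionError:
            continue
    return {"item": item, "price": None}


flaky = FlakyDependency(fetch_price, failure_rate=0.5, seed=42)
results = [fetch_price_with_fallback(flaky, "widget") for _ in range(100)]
degraded = sum(1 for r in results if r["price"] is None)
print(f"{degraded} of 100 requests fell back to a degraded response")
```

Running an experiment like this before a high-traffic event shows whether the calling code degrades gracefully when a dependency misbehaves, rather than finding out during an outage.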
Having a layer of software management in the development pipeline gives DevOps teams more oversight of their systems. Teams can then follow seven critical steps: build, test, deploy, run, monitor, manage and notify.
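The seven steps can be pictured as an ordered sequence of stages. The sketch below is an assumption-laden illustration, not a description of any particular tool: the handler functions are placeholders, and the choice to always run the notify step, even after an earlier stage fails, is a design assumption made here to reflect the emphasis on communication.

```python
STAGES = ["build", "test", "deploy", "run", "monitor", "manage", "notify"]


def run_stages(handlers):
    """Run the stages before "notify" in order, stopping at the first failure.

    The "notify" handler always runs last and receives the list of
    (stage, ok) results for the stages that were attempted.
    """
    results = []
    for stage in STAGES[:-1]:
        ok = bool(handlers[stage]())
        results.append((stage, ok))
        if not ok:
            break
    handlers["notify"](results)
    return results


# Placeholder handlers: every stage succeeds except "deploy".
messages = []
handlers = {stage: (lambda: True) for stage in STAGES[:-1]}
handlers["deploy"] = lambda: False
handlers["notify"] = lambda results: messages.append(results[-1])

outcome = run_stages(handlers)
print(outcome)   # [('build', True), ('test', True), ('deploy', False)]
print(messages)  # [('deploy', False)]
```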