Editor's note: The following is a guest article from Samuel Bocetta, a former DoD security analyst and technical writer focused on network security and open source applications.
Job burnout. It can happen to the best of us. One critical staffing issue is the rising problem of alert fatigue in IT security teams.
Much as ordinary citizens can become anxious and stressed — even complacent — being constantly bombarded by news of natural disasters or rising crime rates, so can cybersecurity specialists tasked with protecting computer networks and systems.
What is 'alert fatigue'?
Often, what non-professionals perceive as a fairly clinical environment feels more like a live-fire combat scenario to security professionals. They receive constant alerts about everything from possible breaches to attacks in progress.
When systems are monitored 24/7, there are also a lot of false alarms. Much like with news of impending natural disasters that blow over without causing harm, these false alarms can lead to apathy, resulting in a potentially dangerous condition known as "alert fatigue."
The danger of alert fatigue
The rise of cloud computing means a broader range of attack surfaces and more customers to protect. Naturally, businesses want to keep all their bases covered, which strains both human and technical resources.
Alert fatigue results when one is constantly exposed to alarms warning of imminent danger. You'll notice it in neighbourhoods where car alarms are being set off every few minutes, and it's a big concern in the healthcare industry.
With e-commerce and cybercrime growing exponentially, the problem has seeped into the digital realm. It's the electronic equivalent of the boy who cried wolf, and it's taking a toll.
According to one recent report, 72% of CIOs say that alert fatigue is a big problem affecting their team. In a survey conducted by Threat Stack in 2016, 82% of respondents said alert fatigue was having a negative impact on their organization.
On top of the false alarms that leave teams constantly operating in survival mode comes the adrenaline surge of coping with actual threats and active attacks. The combined toll on organizations and their people can lead to a level of apathy that normalizes fatigue and leaves companies less secure.
Organizations don't have to learn these lessons the hard way. With awareness and planning, businesses can turn the situation around and devise a proactive threat management system.
Reducing the clutter and keeping teams sharp
Tech professionals are always looking for ways to build better platforms and incorporate new technologies at scale. The process is made easier by paying just as much attention to the people on the team: becoming aware of the issue of fatigue, knowing the signs and determining the possible consequences of inaction.
Sweeping the problem under the rug won't make it go away.
One first step is to choose the right IT architecture, starting with the fundamentals: a web hosting provider. Not long ago, ultra-secure hosting and cloud storage were the domain of only a few high-end services such as Amazon Web Services or Microsoft Azure.
Today, secure web hosting is available from a range of affordable alternatives, such as WP Engine, Bluehost, Kinsta, and many others.
These providers, which now cater to small businesses, offer basic protections like firewalls and malware detection, relieving human monitors of some of the stress of being constantly on alert.
Once hosting is sorted, here are seven things businesses can do to create a balance that will spare teams undue stress without leaving the network and data centers vulnerable:
1. Reduce the number of redundant alerts
Being paged for the same issue repeatedly reduces productivity and the effectiveness of your response. The first step to cutting out the noise of false alarms is to reduce their numbers.
Businesses can accomplish this manually, by setting a separate, finely tuned protocol for each tool and platform if the company and network are small, or by combining all of the protocols into a unified alerting system with a single platform, alert configuration and origination point.
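As a rough illustration of what cutting redundant alerts can look like in a unified system, the sketch below suppresses repeats of the same rule on the same host inside a five-minute window. The Alert fields, the deduplicate function and the window length are assumptions made for this example, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str       # tool that raised the alert (hypothetical field)
    rule: str         # name of the detection rule that fired
    host: str         # affected host
    timestamp: float  # seconds since epoch


def deduplicate(alerts, window_seconds=300.0):
    """Keep one alert per (rule, host) within each suppression window."""
    last_kept = {}  # (rule, host) -> timestamp of the last alert let through
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.rule, alert.host)
        previous = last_kept.get(key)
        if previous is None or alert.timestamp - previous >= window_seconds:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept
```

Because the key ignores which tool raised the alert, the same problem reported by two different tools pages the team only once per window.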
2. Make each alert actionable and contextual
For alerts to be effective security aids, they have to be contextualized and actionable. Context comes from correlating data points across the entire system to build a full picture of whether a threat exists and what kind it is.
Based on this analysis, businesses can then define an action to neutralize the problem. This includes knowing where the threat originated and which portions of the network are affected.
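One way to picture this is a small enrichment step that attaches ownership, network location and a suggested action to each raw alert. The sketch below is only an illustration: the add_context function, the asset inventory and the playbook table are hypothetical names, and a real system would draw this data from its own sources.

```python
def add_context(alert, asset_inventory, playbooks):
    """Return a copy of the alert with owner, segment and a suggested action."""
    asset = asset_inventory.get(alert["host"], {})
    return {
        **alert,
        "owner": asset.get("owner", "unknown"),
        "network_segment": asset.get("segment", "unknown"),
        # Fall back to a human when no playbook covers the rule.
        "action": playbooks.get(alert["rule"], "escalate for manual review"),
    }


# Hypothetical lookup tables, for illustration only.
inventory = {"web1": {"owner": "storefront team", "segment": "dmz"}}
playbooks = {"ssh-bruteforce": "block source IP and rotate credentials"}

enriched = add_context(
    {"host": "web1", "rule": "ssh-bruteforce", "origin_ip": "203.0.113.7"},
    inventory,
    playbooks,
)
```

The enriched alert now says where the threat came from, which part of the network is affected and what to do next, so the responder does not have to look any of that up.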
3. Relegate alerts to a single-source timeline
Rather than dozens of emails being sent to several different inboxes each time an alarm goes off, set up a unified notification system, such as a Slack channel, that goes to a centralized location accessible and visible to the whole team.
This information stack should include threat intelligence, vulnerability management, CloudTrail events, and other security functions.
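A minimal sketch of feeding such a timeline, assuming the team uses a Slack incoming webhook, might look like the following. The alert fields and function names are invented for the example, and the webhook URL has to be supplied by whoever configures the channel.

```python
import json
import urllib.request


def format_timeline_entry(alert):
    """Render one alert as a single line for the shared channel."""
    return "[{severity}] {source}: {summary} (host: {host})".format(**alert)


def build_payload(alert):
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return json.dumps({"text": format_timeline_entry(alert)}).encode("utf-8")


def post_to_channel(webhook_url, alert):
    """Send the alert to the channel behind webhook_url (caller-supplied)."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(alert),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Sending every tool's alerts through one formatter gives the channel a consistent, chronological record rather than a mix of differently formatted emails.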
4. Adjust detection thresholds
Adjusting detection thresholds helps limit the number of systematic anomalies being flagged as active threats.
Baselines should be fine-tuned regularly to match growth and system changes. Machine learning tools can use these baselines as a starting point to weed out the most common clutter and automatically scale response up or down as needed. This frees teams to respond to real issues rather than false alarms.
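To make the idea of a self-adjusting baseline concrete, here is a deliberately simple statistical stand-in for the machine learning tools mentioned above: a rolling window of recent readings, with anything more than three standard deviations above the mean flagged. The class name, window size and multiplier are assumptions chosen for illustration.

```python
import statistics
from collections import deque


class RollingThreshold:
    """Flag values far above a baseline built from recent readings."""

    def __init__(self, window=288, k=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # oldest readings fall off
        self.k = k                           # std deviations above the mean
        self.min_samples = min_samples       # do not flag until warmed up

    def observe(self, value):
        """Return True if value exceeds the current threshold."""
        flagged = False
        if len(self.samples) >= self.min_samples:
            data = list(self.samples)
            threshold = statistics.fmean(data) + self.k * statistics.pstdev(data)
            flagged = value > threshold
        if not flagged:
            # Only normal readings update the baseline, so a spike
            # does not raise the bar for the next spike.
            self.samples.append(value)
        return flagged
```

Because the window slides, the threshold follows gradual growth in traffic without anyone retuning it by hand. Whether flagged readings should ever enter the baseline is a design choice that depends on the environment.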
5. Customize notifications and pages
Triaging security alerts so that only the highest threats wake teams in the middle of the night will go a long way toward eliminating resentment and ensuring a happy, alert team in the morning.
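A triage rule of this kind can be very small. The sketch below assumes four severity levels and a 9-to-6 working day. Both are illustrative choices, as are the channel names it returns.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def delivery_channel(severity, hour):
    """Decide how an alert reaches the team; hour is local time, 0-23."""
    if severity >= Severity.CRITICAL:
        return "page"  # always wake someone up
    if severity == Severity.HIGH:
        # Page during working hours, otherwise hold until morning.
        return "page" if 9 <= hour < 18 else "morning-queue"
    return "timeline"  # everything else goes to the shared channel
```

With this rule, a high-severity alert at 3 a.m. waits in the morning queue, while a critical one still pages the on-call engineer.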
6. Make sure that the appropriate teams/individuals are alerted
Response time and fatigue can be cut if there is a team-wide decision regarding how to respond to threats, which threats require a response, how often and by which team members.
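Once the team has agreed on who handles what, the decision can be written down as a routing table with a simple rotation. The categories, team names and members below are placeholders, and the weekly rotation is just one possible scheme.

```python
# Hypothetical routing table: alert category -> responsible team.
ROUTING = {
    "network": "netops",
    "malware": "secops",
    "cloud": "cloud-platform",
}

# Hypothetical on-call rotations, one member per week.
ROTATION = {
    "netops": ["ana", "ben"],
    "secops": ["chen", "dara", "eli"],
    "cloud-platform": ["fay"],
}

DEFAULT_TEAM = "secops"  # unknown categories go to the security team


def route(category, week_number):
    """Return (team, on-call member) for an alert category in a given week."""
    team = ROUTING.get(category, DEFAULT_TEAM)
    members = ROTATION[team]
    return team, members[week_number % len(members)]
```

Keeping the table in one place makes it clear who is responsible for what, and rotating by week spreads the load so the same person is not paged every time.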
7. Reevaluate and alter response periodically
Like most protocols and procedures, threat detection and response should be tested, reevaluated, and retooled regularly. This should include weekly team meetings and updates.