Dive Brief:
- Shoppers on J. Crew's website were met with a "Hang on a Sec" notification as the site ran into technical difficulties on Black Friday. The clothing company is struggling as is, and it's "strange" it "wouldn't have stress tested the site or have plans in place to throttle traffic," Lauren Bitar, head of retail consulting at RetailNext, told Retail Dive.
- Walmart also had some downtime, with online shoppers running into "technical difficulties" as early as Wednesday. Lowe's may have lost some online shoppers to rival The Home Depot after its site went "down for maintenance" during prime shopping time, reports Retail Dive.
- Shoppers took to Twitter to vent frustrations over sale glitches and site crashes at other retailers, including lululemon and Ulta.
Dive Insight:
Black Friday hit retailers as hard as it hit consumer wallets last week. But when there's an expected influx of website traffic, is there an excuse for outages?
Retailers that want to thrive in the digital age embrace e-commerce. But when a website gets too much traffic too quickly, resulting in downtime, chaos engineers have to step up.
To prepare for the inundation of site visits, shopping carts and payment input that happen Black Friday through Cyber Monday, retailers need to anticipate failure and put a backup plan in place.
E-commerce outages usually result from overburdened APIs, slow third-party functions, sites heavy with graphics, servers unprepared for high traffic, or a failure to look at regional performance levels.
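To illustrate one of those causes, a slow third-party function, here is a minimal Python sketch of a timeout with a fallback. It is an assumption about how a team might guard such a call, not a description of any retailer's actual setup; `call_with_fallback`, `slow_recommendations` and the bestsellers fallback are hypothetical names invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Shared worker pool for outbound calls (the size is an arbitrary assumption).
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(func, fallback, timeout_s):
    """Run func in a worker thread; return fallback if it is too slow or fails."""
    future = _pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The dependency took too long; serve the degraded answer instead.
        # Note: the worker thread still runs to completion in the background.
        return fallback
    except Exception:
        # The dependency raised an error; degrade the same way.
        return fallback


def slow_recommendations():
    """Hypothetical third-party call that responds too slowly under load."""
    time.sleep(0.5)
    return ["personalized item"]


def fast_recommendations():
    """The same hypothetical call on a normal day."""
    return ["personalized item"]


if __name__ == "__main__":
    print(call_with_fallback(slow_recommendations, ["bestsellers"], 0.05))
    print(call_with_fallback(fast_recommendations, ["bestsellers"], 1.0))
```

The page still renders with a generic bestsellers list when the personalized service lags, so one slow vendor does not hold up the whole request.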
One way to reduce site issues is to add a layer of management software to the development pipeline, giving DevOps teams more control over and visibility into their systems. DevOps processes can be broken down into seven steps: build, test, deploy, run, monitor, manage and notify.
Each step has its own ecosystem of vendors, and a slip-up by any participant can cause an issue. So can a lack of communication between IT teams about expected traffic upticks.
But limits are almost always unknown until they're reached, and sometimes that happens on the day it matters most. This is where automated intervention can minimize the time it takes to get to a resolution.
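One form such automated intervention can take is traffic throttling, the kind of plan Bitar said was missing. The Python sketch below uses a token bucket, a common rate-limiting technique; it is an illustration under assumed numbers, not something the retailers named here are known to use. The `TokenBucket` class, its rate and its capacity are all hypothetical.

```python
class TokenBucket:
    """Admit requests at a steady rate while allowing short bursts."""

    def __init__(self, rate_per_s, capacity, now=0.0):
        self.rate = rate_per_s    # tokens added back per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.last = now           # time of the previous check, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Out of tokens: shed this request (for example, show a waiting page).
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_s=2, capacity=3)
    # Five requests arrive at the same instant; only the burst of three gets in.
    print([bucket.allow(0.0) for _ in range(5)])  # [True, True, True, False, False]
    # One second later the bucket has refilled by two tokens.
    print(bucket.allow(1.0))  # True
```

Requests that are turned away can be sent to a holding page rather than left to overload the servers, which keeps the site up for the shoppers already inside.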