Dec 10, 2021

AWS Outage: From Incident Response Tools to Package Delivery

An Amazon Web Services outage on Tuesday, December 7, is impacting everyone from cybersecurity defenders to people waiting on their next Amazon package to arrive.
Amazon is calling this a “technical outage” happening in its main US-East-1 region hosted in Northern Virginia. But the impacts appear to be both spotty and global in nature. 
Users attempting to log in through the affected console get a blank screen for about a minute, and then a gateway error message.
Let’s take a quick look at what Amazon is saying about the outage and the impacts on the supply chain.
On its Service Health Dashboard, Amazon Web Services explains the impact on network monitoring and cybersecurity tools. Here is a summary of key updates:
[9:37 AM PST] “We are seeing impact to multiple AWS APIs in the US-EAST-1 Region. This issue is also affecting some of our monitoring and incident response tooling, which is delaying our ability to provide updates. We have identified the root cause and are actively working towards recovery.”
[10:12 AM PST] “…are starting to see some signs of recovery. We do not have an ETA for full recovery at this time.”
About an hour later, AWS listed the affected services and noted that some customers could still log in by using an IAM role:
“Services impacted include: EC2, Connect, DynamoDB, Glue, Athena, Timestream, and Chime and other AWS Services in US-EAST-1. The root cause of this issue is an impairment of several network devices in the US-EAST-1 Region.
We are pursuing multiple mitigation paths in parallel, and have seen some signs of recovery, but we do not have an ETA for full recovery at this time. Root logins for consoles in all AWS regions are affected by this issue, however customers can login to consoles other than US-EAST-1 by using an IAM role for authentication.”

As the outage dragged into Tuesday afternoon, it became clear that AWS was still unsure how long normal service restoration might take:
[12:34 PM PST] “We continue to experience increased API error rates for multiple AWS Services in the US-EAST-1 Region. The root cause of this issue is an impairment of several network devices. We continue to work toward mitigation, and are actively working on a number of different mitigation and resolution actions. While we have observed some early signs of recovery, we do not have an ETA for full recovery.”
You can imagine the headaches this is causing IT and cybersecurity teams all over the place.
What are the real-world impacts of an AWS cloud outage? Well, let’s just say it’s not helping our already fragile supply chain. And in some areas, packages may not get delivered today.
From CNBC:
“Samuel Caceres, an Amazon driver in Washington state, told CNBC his delivery facility has been ‘at a standstill’ since 8 a.m. PST. Drivers and warehouse workers have been on stand by since then, he added.
Unable to go about their workday, many warehouse and delivery workers were instructed to wait in break rooms until the issues were resolved. Some Flex drivers, which are contracted workers who make deliveries from their own vehicles, weren’t able to sign up for shifts and were sent home for the day.”
Other impacts are hitting apps, chats, and streams:
“Among the services that reported issues as a result of the outage were Disney’s streaming subscription service, Disney+, Netflix, Slack, stock trading app Robinhood and Coinbase, the largest cryptocurrency exchange in the U.S.”
There are many more examples we could list.
The outage is also impacting both physical security and online games, according to DataCenterDynamics:
“Amazon subsidiaries like IMDb and Ring went down, as did games like Player Unknown’s Battlegrounds, Valorant, Clash of Clans, Destiny 2 and Dead by Daylight, amongst others.”
Regardless of the cause, this is the kind of incident that hopefully is spelled out somewhere in an AWS business continuity plan. 
Is the plan working? And does that plan cover how to recover less than three weeks before Christmas, when Amazon’s delivery supply chain is severely crippled for a time?
We don’t know right now.
But Amazon customers are certainly going to find out.
And they won’t be the only ones watching this incident.
The U.S. Department of Defense may have questions as well. The outage is occurring one day after AWS announced the launch of its second “Top Secret” cloud in the U.S. 
The company said having two clouds for the DOD at that security level would result in “the highest levels of resiliency and availability.” Some may be wondering about that right now.
Business continuity planning is a frequent topic of discussion among cybersecurity leaders at SecureWorld Conferences. If you are working on updating your BCP, listen to this fireside chat with CISO Milinda Rambel Stone, as she explains her lead role in helping with business continuity:
