UptownBudget

AWS Outage—New Analysis Explains What Went Wrong And Why

By admin, October 23, 2025

Amazon Web Services has explained what caused the major outage that crippled many businesses this week. In a post-event summary, AWS outlined how an initial issue in its DynamoDB database service had a cascading impact, prolonging the outage.

Between 11:48 p.m. on Oct. 19 and 2:40 a.m. on Oct. 20 (all times PDT), Amazon DynamoDB experienced “increased API error rates” in its Northern Virginia (US-East-1) Region, one of the most widely used regions for deploying applications.

This led to various apps and services being rendered useless, including Snapchat, Fortnite, Ring, Roblox, Coinbase and messaging app Signal.

AWS describes how during this period, “customers and other AWS services with dependencies on DynamoDB were unable to establish new connections to the service.”

It says the incident was triggered by “a latent defect” — in other words, a hidden fault — within the service’s automated DNS management system. This caused endpoint resolution failures for DynamoDB, AWS noted.

DNS — also known as the internet’s phone book — is the system that translates domain names such as Forbes.com to IP addresses so browsers can load internet resources.
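The lookup described above can be sketched with a toy in-memory table. This is only an illustration: the hostnames, the addresses (taken from the reserved documentation range) and the `resolve` helper are all made up for this example, and real DNS resolution involves network queries to name servers.

```python
# Toy in-memory "phone book": hostnames mapped to lists of IP addresses.
# Every name and address below is hypothetical, not a real record.
RECORDS = {
    "example.test": ["192.0.2.10", "192.0.2.11"],
    "empty.test": [],  # the name exists, but the record lists no addresses
}


def resolve(hostname: str) -> list[str]:
    """Return the addresses for hostname, or raise if there are none."""
    addresses = RECORDS.get(hostname, [])
    if not addresses:
        # A client cannot open a connection without an address to dial.
        raise LookupError(f"no addresses found for {hostname}")
    return addresses
```

Calling `resolve("example.test")` returns both addresses, while `resolve("empty.test")` raises `LookupError`. An empty record is a simplified picture of why clients could not establish new connections to an endpoint whose DNS record contained no addresses.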

Services such as DynamoDB maintain “hundreds of thousands of DNS records to operate a very large heterogeneous fleet of load balancers in each Region,” AWS said. “Automation is crucial to ensuring that these DNS records are updated frequently to add additional capacity as it becomes available, to correctly handle hardware failures, and to efficiently distribute traffic to optimize customers’ experience,” according to AWS.

But a “latent race condition” in the DynamoDB DNS management system, a flaw in which the outcome depends on the timing or ordering of concurrent operations, resulted in an incorrect empty DNS record for the service’s regional endpoint (dynamodb.us-east-1.amazonaws.com) that automation failed to repair.
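How a race condition can leave a record empty is easier to see in a small simulation. The sketch below is a generic check-then-act race, not a reconstruction of AWS's system: the two workers, the plan versions, the cleanup step and every function name are assumptions made for illustration. The interleaving is written out step by step so the result is deterministic.

```python
def check_is_newer(record: dict, plan_version: int) -> bool:
    """Step 1 of check-then-act: is this plan newer than the live one?"""
    return plan_version > record["version"]


def apply_plan(record: dict, plan_version: int, ips: list[str]) -> None:
    """Step 2: write the plan. It does NOT re-check the version."""
    record["version"] = plan_version
    record["ips"] = list(ips)


def cleanup_older_than(record: dict, plan_version: int) -> None:
    """Delete addresses that were written by plans older than plan_version."""
    if record["version"] < plan_version:
        record["ips"] = []


def run_unlucky_interleaving() -> dict:
    # Live record before either worker runs (hypothetical values).
    record = {"version": 0, "ips": ["192.0.2.1"]}

    # Worker A checks plan 1, then stalls before applying it.
    a_may_apply = check_is_newer(record, 1)

    # Meanwhile, worker B checks and applies the newer plan 2.
    if check_is_newer(record, 2):
        apply_plan(record, 2, ["192.0.2.2", "192.0.2.3"])

    # Worker A resumes and acts on its stale check, overwriting plan 2.
    if a_may_apply:
        apply_plan(record, 1, ["192.0.2.1"])

    # Worker B cleans up everything older than the plan it just applied.
    # The live record now belongs to plan 1, so its addresses are wiped.
    cleanup_older_than(record, 2)
    return record
```

Running `run_unlucky_interleaving()` leaves the record at version 1 with an empty address list. Each step is reasonable on its own; the fault only appears with this particular timing, which is why such defects can stay hidden for a long time. One common remedy is to make the check and the write a single atomic step, for example by re-checking the version under a lock at the moment of writing.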

Issues In The Network Load Balancer

Then, as systems started to recover, the Network Load Balancer (NLB) experienced increased connection errors for some load balancers in the same region between 5:30 a.m. and 2:09 p.m. on Oct. 20. “This was caused by health check failures in the NLB fleet, which resulted in increased connection errors on some NLBs,” AWS explained.

In tandem, between 2:25 a.m. and 10:36 a.m. on Oct. 20, new EC2 instance launches failed. While instance launches began to succeed from 10:37 a.m., some newly launched instances experienced connectivity issues, which were resolved by 1:50 p.m., according to AWS.

“The delays in network state propagations for newly launched EC2 instances also caused impact to the network load balancer service and AWS services that use NLB,” AWS said.

Amazon Apologizes For Outage, Explains Next Steps

AWS has now issued an apology for the incident. “We apologize for the impact this event caused our customers,” AWS wrote. “While we have a strong track record of operating our services with the highest levels of availability, we know how critical our services are to our customers, their applications and end users, and their businesses. We know this event impacted many customers in significant ways. We will do everything we can to learn from this event and use it to improve our availability even further.”

AWS said it is “making several changes as a result of this operational event.”

For example, it has already disabled the DynamoDB DNS Planner and the DNS Enactor automation worldwide. “In advance of re-enabling this automation, we will fix the race condition scenario and add additional protections to prevent the application of incorrect DNS plans.”

For NLB, AWS is adding a velocity control mechanism to limit the capacity a single NLB can remove when health check failures cause Availability Zone (AZ) failover.

For EC2, AWS is building an additional test suite to augment its existing scale testing. The new suite will exercise the recovery workflow of the DropletWorkflow Manager (DWFM) to “identify any future regressions.”
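AWS has not said how this velocity control will work. As one possible reading of the idea, the sketch below caps how many targets may be taken out of service within a sliding time window. The class name, its parameters and the sliding-window approach are all assumptions for illustration, not AWS's design.

```python
class RemovalLimiter:
    """Caps how many targets may be removed per time window (hypothetical)."""

    def __init__(self, max_removals: int, window_seconds: float):
        self.max_removals = max_removals
        self.window_seconds = window_seconds
        self.removal_times: list[float] = []

    def try_remove(self, now: float) -> bool:
        """Return True if one more removal is allowed at time `now`."""
        # Forget removals that have aged out of the window.
        self.removal_times = [
            t for t in self.removal_times if now - t < self.window_seconds
        ]
        if len(self.removal_times) >= self.max_removals:
            return False  # limit reached: keep the target in service
        self.removal_times.append(now)
        return True


# Usage: allow at most 2 removals per 60 seconds.
limiter = RemovalLimiter(max_removals=2, window_seconds=60.0)
decisions = [limiter.try_remove(now=t) for t in (0.0, 1.0, 2.0)]
```

Here `decisions` is `[True, True, False]`: the third failing target stays in service even though its health check failed. The trade-off is deliberate. If health checks themselves are misfiring, a cap of this kind stops a burst of false failures from draining most of the capacity at once.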

The AWS outage had a huge impact, leaving some firms unable to operate for hours because the apps they depend on were unavailable. AWS has delivered its post-event analysis very quickly, which is to its credit. However, the damage to its reputation has already been done.


