Monday, October 20th, 2025, started like any other morning for millions of people around the world. But around 3 AM Eastern Time, something went terribly wrong. Within minutes, a cascade of digital dominoes began falling. Snapchat stopped loading. Fortnite players couldn’t log in. Bank apps froze mid-transaction. Smart home devices went silent. Even Amazon’s own warehouse workers found themselves staring at blank screens, unable to do their jobs.
What happened? Amazon Web Services, the invisible backbone of the modern internet, went down. And when AWS stumbles, the entire digital world feels the tremors.
The Scope of the Disaster
By 8 AM UK time, the outage had snowballed into one of the most widespread internet disruptions in recent memory. Downdetector, the website that tracks service outages, was flooded with over 50,000 reports. The heat maps looked like someone had set the internet on fire—bright red zones across New York, Chicago, Dallas, Los Angeles, London, and cities across Europe and Asia.
The list of affected services read like a who’s-who of the digital age. Snapchat users couldn’t send messages. Roblox and Fortnite gamers were locked out of their virtual worlds. Duolingo learners saw their precious streaks threatened. Ring doorbell cameras went blind. Venmo and PayPal transactions hung in limbo. Even dating apps like Hinge crashed, leaving countless would-be connections in digital purgatory.
But it wasn’t just consumer apps. Banks like Lloyds reported service disruptions. United Airlines and Delta experienced delays as their systems went haywire. Universities found their teaching platforms—including Canvas and Zoom—completely inaccessible, throwing classes into chaos. Cryptocurrency traders on Coinbase were locked out of their portfolios while prices kept moving.
Perhaps most telling? Amazon’s own operations weren’t immune. Warehouse workers stood idle in break rooms. Delivery drivers couldn’t access their route information. Even the internal payroll app went offline, preventing employees from checking their earnings.
What Actually Went Wrong?
Here’s where things get technical—but stay with me, because understanding this matters for everyone who uses the internet.
Amazon Web Services isn’t just one thing. It’s a massive collection of data centers, servers, and software that companies rent to run their websites and apps. Think of it as the engine room of the internet. When you open Snapchat or play Fortnite, you’re not directly connecting to those companies’ computers. You’re connecting to AWS servers that those companies are renting.
The problem Monday morning centered on something called DynamoDB in Amazon’s US-EAST-1 region, located in Northern Virginia. DynamoDB is a database service—essentially a massive digital filing cabinet where companies store all their data. It’s one of the core services that thousands of businesses rely on to keep their apps running.
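To make the filing-cabinet idea concrete, here is roughly what an app’s conversation with DynamoDB looks like, sketched with AWS’s Python SDK (boto3). The table name and fields are invented for illustration; real applications make millions of calls like these every minute.

```python
import boto3

# Connect to DynamoDB in US-EAST-1, the region at the center of the outage.
# The table and item below are hypothetical.
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("user_sessions")

# Store a record (put a file in the filing cabinet)...
table.put_item(Item={"user_id": "alice", "last_login": "2025-10-20T03:00:00Z"})

# ...and read it back. On Monday morning, calls like these failed or timed
# out for thousands of applications at once.
response = table.get_item(Key={"user_id": "alice"})
print(response.get("Item"))
```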
According to experts who analyzed the incident, AWS pushed out a software update to DynamoDB that contained an error. That single mistake triggered a chain reaction involving another critical system: DNS (the Domain Name System), which acts like the internet’s phone book, converting service names into the numerical addresses computers use to find each other. The DNS entries that point applications to DynamoDB stopped resolving correctly.
When the phone book loses the database’s number, nothing works. It’s as if every entry were suddenly written in invisible ink.
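You can see this dependency from any laptop. Before an app can send a single query to DynamoDB, it has to resolve the service’s hostname through DNS. A rough sketch of that step in Python, including what failure looks like:

```python
import socket

# Every request to DynamoDB in US-EAST-1 starts by resolving this hostname.
endpoint = "dynamodb.us-east-1.amazonaws.com"

try:
    # Ask DNS (the phone book) for the numerical addresses.
    addresses = socket.getaddrinfo(endpoint, 443, proto=socket.IPPROTO_TCP)
    print([addr[4][0] for addr in addresses])
except socket.gaierror as error:
    # If the entry is broken, the name simply does not resolve, and no
    # connection to the database can even be attempted.
    print(f"DNS lookup failed: {error}")
```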
Mike Chapple, an IT professor at the University of Notre Dame who formerly worked for the National Security Agency, put it bluntly: “DynamoDB isn’t a term that most consumers know, but it is one of the record-keepers of the modern internet.”
Why Northern Virginia?
You might wonder: why does one region in Virginia have so much power over the global internet? The answer reveals just how centralized our digital infrastructure has become.
Northern Virginia hosts the largest concentration of data centers in the United States. Amazon alone has invested over $50 billion in data centers there. The US-EAST-1 region is essentially the nervous system of AWS—and by extension, a huge chunk of the internet.
This isn’t the first time this particular region has caused problems. AWS experienced major outages in 2020, 2021, and 2023, many of them traced back to issues in US-EAST-1. Each time, the same question emerges: why don’t these companies have better backup systems?
Rob Jardin, chief digital officer at cybersecurity company NymVPN, explained what likely happened: “These issues can happen when systems become overloaded or a key part of the network goes down, and because so many websites and apps rely on AWS, the impact spreads quickly.”
Tech analyst Lance Ulanoff raised an even more pointed question: “I continue to be confused about why there is not instant redundancy when you can have something that is seemingly so small, or so localized, have such a cascading and global impact.”
It’s a fair criticism. In theory, AWS should be designed so that if one region fails, traffic automatically shifts to other regions. But the reality is more complicated. Many companies configure their services to run primarily in US-EAST-1 because it’s cheaper, faster, and where AWS originally launched. Moving to a multi-region setup requires more planning, more money, and more complexity—expenses that many businesses skip until disaster strikes.
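For a sense of what a multi-region setup means on the client side, here is a minimal sketch in Python using boto3. It assumes the data has already been replicated to the fallback regions (for example, with DynamoDB global tables); the table name and the region order are hypothetical.

```python
import boto3
from botocore.exceptions import ClientError, EndpointConnectionError

# Hypothetical fallback order: primary region first, then backups.
REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]

def read_profile(user_id):
    """Try each region in turn until one answers."""
    for region in REGIONS:
        try:
            table = boto3.resource("dynamodb", region_name=region).Table("profiles")
            return table.get_item(Key={"user_id": user_id}).get("Item")
        except (ClientError, EndpointConnectionError) as error:
            # Log and move on. This only helps if the data already lives
            # in the next region too.
            print(f"{region} unavailable ({error}), trying the next region")
    return None
```

In practice, failover usually happens higher up the stack, through DNS routing, load balancers, or the database’s own replication, rather than in application code. The point is the same: no single region should be a hard dependency.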
The Bigger Picture: Our Fragile Digital World
Here’s the uncomfortable truth that Monday’s outage exposed: the internet is far more fragile than most people realize.
AWS controls roughly 30% of the worldwide cloud computing market. Microsoft and Google account for much of the rest. That means three companies effectively run the infrastructure that powers the modern economy. When one of them has a bad morning, hundreds of millions of people feel it.
Corinne Cath-Speth from Article 19, an organization that promotes freedom of expression, highlighted the democratic implications: “These disruptions are not just technical issues, they’re democratic failures. When a single provider goes dark, critical services go offline with it—media outlets become inaccessible, secure communication apps like Signal stop functioning, and the infrastructure that serves our digital society crumbles.”
Even Elon Musk weighed in, bragging that his X platform (formerly Twitter) wasn’t affected. He took shots at Amazon founder Jeff Bezos with memes while promoting X’s new chat features as an alternative to the disrupted Signal messaging app. But his observation about Signal was actually quite revealing: “AWS is in the loop and can take out Signal at any time.” It’s an uncomfortable reminder of how much control these infrastructure companies have over our digital lives.
Flashbacks to Recent Tech Disasters
Monday’s outage inevitably drew comparisons to the CrowdStrike incident from July 2024. In that case, a faulty software update from the cybersecurity firm crashed Microsoft Windows systems worldwide, grounding thousands of flights and affecting hospitals and banks. The cost ran into hundreds of millions of dollars.
The pattern is becoming disturbingly familiar: a single point of failure, a software update gone wrong, a cascade of consequences that ripple across the globe. Each incident reveals just how interconnected—and vulnerable—our digital infrastructure has become.
What’s particularly concerning is that neither incident was caused by hackers or malicious actors. These were simple human errors in software development and deployment. As our world becomes more dependent on cloud services, the stakes for these mistakes grow exponentially higher.
The Human Cost
Beyond the technical details and corporate statements, real people dealt with real consequences on Monday.
Students couldn’t access their classes or submit assignments. Teachers scrambled to figure out backup plans with no notice. Small business owners watched helplessly as their e-commerce sites went dark during business hours. Freelancers missed deadlines because their work tools disappeared. People couldn’t access their money through banking apps. Others couldn’t communicate with loved ones through their preferred messaging platforms.
One Reddit user shared their frustration about a smart plug that stopped working: “Before I realized why the plug wasn’t working, I tried unsuccessfully to reset one of them. Now I can’t get it to work at all.” It’s a small inconvenience in the grand scheme, but it illustrates a larger problem: we’ve built our homes, our work, and our daily routines around services that can vanish without warning.
For Amazon’s own employees, the irony was bitter. Warehouse workers reported standing around for hours because the internal systems that tell them what to pick and pack were completely offline. Flex drivers couldn’t get their delivery routes. Some employees couldn’t even access the app that lets them get paid early—adding financial stress to an already frustrating day.
The Long Road to Recovery
AWS began reporting improvements around midday Eastern Time, but the recovery was anything but quick or smooth. Services came back sporadically throughout the day. Some apps would work for a few minutes, then crash again. It wasn’t until around 6 PM ET—roughly 15 hours after the initial problems—that AWS finally declared “all AWS services returned to normal operations.”
But even that wasn’t the end. Some services had massive backlogs of data to process. Messages that had been sent hours earlier were just starting to arrive. Transactions were slowly working their way through the queue. For businesses, the cleanup would continue for days.
AWS’s official statements during the outage were frustratingly vague. The status page showed error rates and acknowledged problems, but provided little detail about what was actually happening or when things would be fixed. This lack of transparency is typical during major outages—companies are often more focused on fixing the problem than explaining it—but it leaves millions of users in the dark, literally and figuratively.
What Needs to Change
This outage should be a wake-up call, but history suggests the lessons won’t stick. After every major incident, there’s a brief period of hand-wringing and promises to do better. Then things return to normal until the next disaster.
But real solutions are possible. Here’s what needs to happen:
Better Redundancy: Companies need to design their systems to automatically fail over to other regions when problems occur. Yes, it’s more expensive. Yes, it’s more complex. But the alternative is what we saw Monday.
Diversification: Businesses should consider using multiple cloud providers instead of putting all their eggs in one basket. If AWS goes down, having backup systems running on Google Cloud or Microsoft Azure could keep operations running; a rough sketch of the idea follows this list.
Improved Testing: Software updates need more rigorous testing before they’re deployed to production systems. The error that caused Monday’s outage should have been caught before it affected millions of users.
Transparency: Cloud providers need to be more forthcoming about what went wrong and how they’re preventing it from happening again. Vague status updates don’t cut it when entire businesses are offline.
Regulatory Oversight: As these cloud providers become more critical to the functioning of society, there may need to be government oversight to ensure minimum standards of reliability and disaster recovery.
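To illustrate the diversification point, here is a minimal, provider-agnostic sketch in Python. Every name in it is invented, and the in-memory stores stand in for real SDK calls to, say, DynamoDB on AWS and Firestore on Google Cloud. What matters is the shape: the application talks to a narrow interface, so the provider behind it can be swapped or used as a backup.

```python
from abc import ABC, abstractmethod

class KeyValueStore(ABC):
    """The narrow interface the application codes against."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def put(self, key, value): ...

class InMemoryStore(KeyValueStore):
    """Stand-in for a real backend such as DynamoDB or Firestore."""

    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.data = name, healthy, {}

    def get(self, key):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)

    def put(self, key, value):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

class FallbackStore(KeyValueStore):
    """Use the primary provider; fall back to the secondary when it fails."""

    def __init__(self, primary, secondary):
        self.primary, self.secondary = primary, secondary

    def get(self, key):
        try:
            return self.primary.get(key)
        except ConnectionError:
            return self.secondary.get(key)

    def put(self, key, value):
        try:
            self.primary.put(key, value)
        except ConnectionError:
            self.secondary.put(key, value)

# Simulate Monday morning: the primary provider is unreachable.
store = FallbackStore(InMemoryStore("aws", healthy=False), InMemoryStore("gcp"))
store.put("greeting", "still online")
print(store.get("greeting"))  # served by the fallback provider
```

A real setup would also have to keep the two backends in sync, through dual writes or replication, which is exactly the extra cost and complexity most businesses skip until a morning like this one.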
Living in a Cloud-Dependent World
The reality is that we’re not going back. The cloud isn’t a trend—it’s the foundation of modern digital life. Small businesses can’t afford to run their own server farms. Apps can’t function without massive computing power behind them. The efficiency and scale that cloud services provide are genuinely revolutionary.
But with that revolution comes risk. We’ve traded local control for global convenience. We’ve swapped owned infrastructure for rented services. And when the landlord has problems, we all get evicted.
The AWS outage of October 20, 2025, will join the long list of digital disasters that periodically remind us how fragile our interconnected world has become. Companies will patch their systems, write incident reports, and implement new safeguards. Life will go on, and most people will forget this happened within a few weeks.
But the fundamental vulnerability remains: the modern internet, for all its wonder and sophistication, is built on a surprisingly small number of critical chokepoints. And when one of them fails, we all remember, at least for a day, that the cloud is just someone else’s computer. And sometimes, those computers break.
The question isn’t whether this will happen again. History tells us it will. The question is whether we’ll be better prepared next time, and whether the companies that control our digital infrastructure will take the steps necessary to prevent these cascading failures.
Monday’s outage was resolved. The next one is just a matter of time.
