- Delta Air Lines is the latest major carrier to suffer an operational meltdown.
- The airline is still struggling to recover from a CrowdStrike outage that affected many businesses last week.
- Experts say the airline industry’s drive toward ruthless efficiency is partly to blame for these meltdowns.
Airline meltdowns are seemingly becoming an annual tradition.
Something weird happens, the entire industry melts down, most carriers recover quickly, but for some reason one major airline goes through days of pain and chaos as it struggles to get its operations back in order.
We saw it with Southwest Airlines in December 2022. Then it was United Airlines in June 2023 and now it’s Delta Air Lines’ turn.
Delta is the current poster child for travel headaches as it continues to flounder following Friday’s CrowdStrike outage.
But why does this keep happening? Airlines seem to be uniquely vulnerable to long-tailed disruptions when things go wrong, and technology often lies at the heart of their inability to recover quickly.
Southwest, United, Delta: Anatomy of a breakdown
In every case, it starts with something seemingly annoying but not unheard of: winter weather passing over Denver in December 2022, a series of summer storms in the Northeast in June 2023, an IT glitch that affected the backend systems of many airlines earlier this month.
What all three seem to have in common is the struggle airlines had getting crews back into place after the initial disruption.
Certainly, with the Southwest meltdown, crew scheduling software was identified as a particular pain point in getting the airline back to normal. United’s scheduling capabilities were also implicated in its headaches last summer, though its issues were compounded by air traffic control staffing shortages in the Northeast – which have also yet to be totally resolved.
For Delta, crew scheduling seems to be the crux of the current disruptions.
“The CrowdStrike error required Delta’s IT teams to manually repair and reboot each of the affected systems, with additional time then needed for applications to synchronize and start communicating with each other,” a statement the airline released on Monday said. “Delta’s crews are fully staffed and ready to serve our customers, but one of Delta’s most critical systems – which ensures all flights have a full crew in the right place at the right time – is deeply complex and is requiring the most time and manual support to synchronize.”
Experts say that’s not entirely surprising. Airline crews are regulated by strict rules about work hours to protect their health and everyone’s safety, but it takes an enormous amount of computing power to coordinate the unique schedules of tens of thousands of employees and keep airplanes staffed legally, even when everything is normal. It only becomes more complicated when things go wrong.
“They need to run the optimization to figure out what’s the best way to get the passengers and crews to the right place. That’s a process that can take hours. This is not a solution, a math problem, that airlines solve every day,” Cole Wrightson, chief product officer at FLYR, which specializes in airline backend technology, told me. “It can take a day or two days to go through that.”
The computers can get overwhelmed, especially if a system like crew scheduling software is slow to come back online after an outage or gets overloaded with an unusually high volume of changes, as Delta’s statements suggest is happening to its system.
It’s like when your phone overheats if you try to use too many apps at once.
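To get a feel for why the crew puzzle described above is so hard to re-solve in a hurry, here is a minimal sketch in Python. It is a toy, not how any airline's software actually works: the 8-hour duty cap, the function names and the sample numbers are all assumptions made up for illustration. It simply tries every way of assigning crews to flights and keeps the ones that stay within the cap.

```python
from itertools import product

# Hypothetical rule for illustration only; real duty limits are more detailed.
MAX_DUTY_HOURS = 8.0


def legal_assignments(flight_hours, crew_hours_worked, max_duty=MAX_DUTY_HOURS):
    """Yield every crew-to-flight assignment that keeps each crew within max_duty.

    flight_hours: duration of each flight, in hours.
    crew_hours_worked: hours each crew has already worked today.
    An assignment is a tuple where assignment[i] is the crew index for flight i.
    """
    n_crews = len(crew_hours_worked)
    for assignment in product(range(n_crews), repeat=len(flight_hours)):
        totals = list(crew_hours_worked)
        for flight, crew in enumerate(assignment):
            totals[crew] += flight_hours[flight]
        if all(hours <= max_duty for hours in totals):
            yield assignment


def search_space_size(n_crews, n_flights):
    """Number of candidate assignments a brute-force search must examine."""
    return n_crews ** n_flights


# Three flights, two crews; the second crew has already worked 5 hours.
options = list(legal_assignments([2.0, 3.0, 4.0], [0.0, 5.0]))
print(options)                     # [(0, 1, 0), (1, 0, 0)]
print(search_space_size(2, 3))     # 8
print(search_space_size(100, 50))  # a 1 followed by 100 zeros
```

With two crews and three flights there are only eight combinations to check, and two of them are legal. With 100 crews and 50 flights the brute-force count is a 1 followed by 100 zeros, which is why real systems lean on specialized optimization rather than trying everything, and why a full re-solve after a disruption can take a long time.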
How did the airlines get here?
Following Southwest’s 2022 meltdown, the company promised change, and to its credit, or possibly due to the dumb luck of not running the same systems as other major airlines, it performed admirably last week.
United, similarly, said it would work on improving its response to irregular operations (and chartering fewer private jets) in their wake. United was probably the airline second-most affected by Friday’s outage, but it still recovered much more quickly than Delta.
Delta will doubtless also commit to improving its IT infrastructure and rehabilitating its premium image. But will it be too little, too late?
“You more or less had the most reliable airline in the country … For them to just go to hell basically, the schedule, that’s not what people have been expecting with Delta in recent years,” William J. McGee, senior fellow for aviation and travel at the American Economic Liberties Project, told me. “It just reinforces the point that we’ve been making: the systems are not resilient, and they don’t plan for contingencies.”
It’s a bigger structural problem, according to McGee and Wrightson.
In some ways, the current problems at Delta are caused by the airline industry’s drive toward ruthless efficiency.
“They have cost-cut and cost-cut and cost-cut in recent decades, and this is what we get. We have a system that is designed, basically, that every hour is rush hour now, from 6 a.m. until 11 p.m. on the good days, and the system is designed to operate as if nothing is ever going to go wrong,” McGee said. “How many days of the week are there where there’s not going to be an IT glitch, serious weather, aircraft maintenance issues, sick crewmembers, problems with air traffic control? There’s just countless things that can go wrong, and for airline executives to, day in and day out, put out a schedule that is based on the most rosy, optimistic outlook you can have, after a while it gets old.”
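McGee's point about schedules built with no slack can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not data from any airline: it follows one hypothetical aircraft through five back-to-back legs and shows how a single 90-minute delay carries forward, depending on how many minutes of buffer are scheduled between legs. The function name and the numbers are invented for the example.

```python
def propagate_delays(initial_delays, buffer_minutes):
    """Return the departure delay, in minutes, of each leg flown by one aircraft.

    initial_delays[i] is the new delay introduced at leg i.
    buffer_minutes is the slack scheduled between consecutive legs;
    it absorbs that much inherited delay before the next departure.
    """
    delays = []
    inherited = 0
    for new_delay in initial_delays:
        total = inherited + new_delay
        delays.append(total)
        # Whatever the buffer cannot absorb is passed on to the next leg.
        inherited = max(0, total - buffer_minutes)
    return delays


# One 90-minute disruption on the first leg, nothing else goes wrong.
disruption = [90, 0, 0, 0, 0]
print(propagate_delays(disruption, buffer_minutes=0))   # [90, 90, 90, 90, 90]
print(propagate_delays(disruption, buffer_minutes=30))  # [90, 60, 30, 0, 0]
```

In this simplified model, a schedule with zero buffer never recovers during the day, while 30 minutes of slack between legs clears the delay by the fourth flight. The trade-off is that slack costs money on the days when nothing goes wrong.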
Additionally, Wrightson said, airlines often rely on extremely outdated technology at the core of all their systems.
“They’re changing the skin of their website, the function of the skin you might have, something to upgrade it to be much healthier, much more resilient, a higher customer experience, but you’re not changing the internals of the health, and your skin as an organ is going to be dependent on the health of the rest of your organs,” he said. “You might be able to use duct tape and cardboard to stick something on and have a new flexibility and capability, but you’re always going to have to map that back to the limitations built into the system in the 70s, 80s and 90s.”
(Yes, airlines still use computer systems that date back to a time before the internet in some cases. That’s why you can still find dot matrix printers in airports.)
How does this get resolved?
It’s going to take money to solve these problems, and in a tight-margin business that’s also an on-again, off-again Wall Street darling, these kinds of long-term investments in unglamorous backend technology and fallback plans can be a hard case for executives to make.
But McGee and Wrightson both said it has to happen if airlines want to avoid these kinds of meltdowns in the future, rather than factoring them in as just the cost of doing business.
“It’s bigger than crew scheduling, it’s bigger than IT itself,” McGee said. “The system just can’t snap back like it used to.”
He previously worked in airline operations and noted that for some high-intensity routes, airlines would keep extra crews and aircraft on standby to help smooth things over when something went awry.
He said he’s sure today’s airline executives would roll their eyes at that suggestion. Still, the American Economic Liberties Project, where he now works, has some more incremental suggestions on what airlines can do to make themselves more resilient, even if it’s not the first choice of their shareholders. Mainly, he said, airlines should set up resilience plans and set aside funding to address unexpected disruptions.
“They all have crisis management plans, they will tell you, no doubt,” McGee said. “Have they worked? Are they realistic? Are they reasonable?”
Wrightson also said that airlines, despite their complexity, could follow the example of some tech companies and work to improve their systems more incrementally. That might also save money over the long term, even if it potentially leads to more minor bugs and disruptions in the short term.
“If I’m CTO, CIO of an airline, we would probably want to put some upper bounds on the age of our software,” he said. “The amount of money that airlines are going to have to spend to get off of these aging systems is much greater than if they had addressed it earlier.”
He acknowledged it may take years to fully move away from the multi-layered computer systems airlines currently rely on, but he said taking intentional small steps in that direction will help the industry become more resilient overall.
“The tech industry has shown that faster, smaller steps is a better way to innovate,” he said. “When you’re talking about a United or a Delta or any of the other global top 30 airlines, they need to maintain all of their customers and all of those billions of dollars, and they need to replace all of those car parts while the car continues to move.”
Zach Wichter is a travel reporter for USA TODAY based in New York. You can reach him at [email protected].