Lessons from the Grounding of the Boeing 737 Max: Viewpoint

This week President Trump grounded the Boeing 737 Max after a second plane crashed, this time in Africa. The plane had already been grounded in Canada, China, the EU, the UK, and elsewhere. There is a lot riding on this plane. An expected 40 percent of Boeing’s future worldwide revenues will come from the Max series (a number of models are already in service or will be soon). Major U.S. carriers such as Southwest Airlines and American Airlines had put a large number of these planes into service in the last few months.

What is the back story, and what can IT professionals in insurance learn from it? The 737 Max has an automated system called MCAS. The new plane has slightly larger engines (to get more people on the plane), and they are mounted slightly further forward on the wing. As a result, when the plane is flown manually, it tends to pitch nose up. If the nose goes too far up, the plane can stall. MCAS is meant to push the nose down automatically. However, it appears that the interaction between MCAS and other systems may not have been calibrated correctly, so that MCAS took over the controls, pointed the nose down, and crashed the plane. The pilots can’t override MCAS easily, or at all, once it kicks in.

The 737 Max is not like earlier models. However, Boeing convinced regulators that it was “not new enough” to require real retraining and specific certifications on these new software features, which allowed the plane to be deployed much faster than would have been the case with a normal “new” plane. I saw someone on TV say that the plane was similar to the basic model certified in the 1960s. But it isn’t and never was!

Now Boeing will face substantial legal claims arising from the accidents. If there had been a third accident before the planes were taken out of service, the results for Boeing would have been catastrophic.

This is as much an IT problem as it is an engineering problem.

First, the plane introduced feature creep. It added new software to compensate for “minor” changes to the plane. Then, the introduction of those new features was clearly not managed correctly. Integration testing between the new features and the old features (measuring altitude, climb, wind speed, etc.) was not fully carried out or documented. To top it off, the users (in this case the pilots) were initially not properly trained on the features. There was no practical override if something went wrong, such as the nose pointing down when it shouldn’t.

Does this sound like a lot of core systems projects you have been around over the years? New features added on top of other features without understanding the impact? Defects introduced and not caught until the system is in production? Untrained users who can’t use the system correctly? IT governance that fails to measure overall project status correctly and rushes something into production when it is not ready for prime time? And in some cases, putting the reputation and future viability of the company at risk?

Software is software. The basic lessons remain the same no matter how advanced it gets or which industry we are talking about.