Tesla’s Phantom Brake Problem Exposes Need for Human Audits of Machine-Read Data

“Slamming on the brakes” is often a blaming moment: the assumption is that someone was not paying attention to conditions. But is that always the case? Under usage-based insurance models, which set an individual driver’s rates based on patterns of behavior and perceived risk, is it fair to blame drivers for every instance of harsh braking or engagement of the antilock braking system?

In fact, an incident of harsh braking could mean many things. The driver could have been distracted by texting. (Ugh.) The driver might not have been paying attention to traffic. Perhaps the driver reacted quickly and avoided hitting a child who ran out in front of the vehicle. Or the vehicle’s sensors might have been tricked into braking for no apparent reason, which some 350 Tesla owners have said happened to their vehicles, according to a recent report from the National Highway Traffic Safety Administration.

Data Without Context Creates New Challenges

The problem is that without the corresponding context, the data point alone (harsh braking) does not tell the full story. To reliably and consistently evaluate driver risk, telematics data needs to be contextualized to its use case and generally agreed to be fit for that purpose. Most importantly, there needs to be a mechanism in place for identifying and excluding confusing or aberrant data that stems from a computer error or sensor malfunction.
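As a minimal sketch of what such a mechanism might look like, the snippet below screens harsh-braking records against basic physical plausibility before they reach a risk model. The field names (decel_g, speed_before_kph, and so on) and the thresholds are assumptions made for illustration, not any insurer’s or automaker’s actual schema.

```python
from dataclasses import dataclass

# Hypothetical telemetry record; field names are illustrative, not a vendor schema.
@dataclass
class BrakeEvent:
    decel_g: float           # peak deceleration reported by the sensor, in g
    speed_before_kph: float  # vehicle speed just before the event
    speed_after_kph: float   # vehicle speed just after the event
    abs_engaged: bool        # whether the antilock braking system fired

def is_plausible(event: BrakeEvent) -> bool:
    """Flag records that are physically inconsistent and likely sensor or computer error."""
    speed_drop = event.speed_before_kph - event.speed_after_kph
    # A genuine harsh-brake event should come with a real drop in speed.
    if event.decel_g > 0.4 and speed_drop < 5:
        return False
    # Decelerations far beyond normal tire-road limits are suspect on their face.
    if event.decel_g > 1.2:
        return False
    return True

def usable_events(events: list[BrakeEvent]) -> list[BrakeEvent]:
    """Keep only events fit for risk scoring; the rest go to review, not the model."""
    return [e for e in events if is_plausible(e)]
```

The point of the sketch is not the particular thresholds but the separation of duties: implausible records are routed out of the scoring pipeline rather than silently counted against the driver.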

The conventional wisdom around AI and the data that go into modern behavior-based driver scoring is that behind-the-wheel actions can be measured, coached, improved, and rewarded (or penalized). However, there is an important caveat to that conventional wisdom: data that do not belong to the driver do not belong in the equation.

Whether it is phantom braking, self-acceleration, zombie mode or any other situation where a vehicle behaves unexpectedly, the driver has a legitimate defense: “It wasn’t me!” In years gone by, drivers would file complaints and be unable to replicate what happened; in today’s connected world, data footprints of all kinds are being used to identify and explain unexpected outcomes.

How do we exploit data for good while limiting data that is unfit for purpose? How do we create a safety zone between the real world, synthetic training data, and the many one-in-a-million edge cases revealed daily in the complex, yet mundane, task of commuting and trip-making by car?

Separating the Right Data from the Wrong Data

Right now, auto rates are being determined based on personas that range from the tailgater who makes lots of jerky stops and starts on a perfectly straight road to the speed racer who makes quick lane changes at excessive speed. Other personas, such as the road rager, the red-light runner or the texting-impaired driver, can all be described digitally with dots on a map or location traces from a phone, where GPS, time, sensors, and route can tell a story about someone driving their car.
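To make that concrete, here is one way such persona labels could be derived from aggregated trip features. The feature names and cutoffs are hypothetical and exist only to show the mapping from raw GPS, time and sensor data to a descriptive tag; they are not industry standards.

```python
# Illustrative only: persona tags derived from assumed trip-level features.
def assign_personas(trip: dict) -> list[str]:
    """Map aggregated GPS/time/sensor features for one trip to persona labels."""
    personas = []
    if trip["harsh_brakes_per_100km"] > 8 and trip["mean_following_gap_s"] < 1.0:
        personas.append("tailgater")
    if trip["max_speed_over_limit_kph"] > 25 and trip["lane_changes_per_10km"] > 6:
        personas.append("speed_racer")
    if trip["phone_motion_while_moving_s"] > 60:
        personas.append("texting_impaired")
    return personas or ["baseline"]
```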

But we need to be careful about how we treat the data used to assign these personas. While we don’t always have clever ways of separating good data from bad, insurers can let humans opt out of specific data points and trips so there is a level of verifiable truth concerning operational behavior. Insurers can also get smarter about layering in important contextual information, such as weather data, time of day and traffic patterns/data from other connected vehicles, to start adding nuance to the individual data points being collected.
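A rough sketch of those two safeguards, under assumed field names and placeholder context sources, might look like this: the driver’s opt-outs are honored first, and then each remaining data point is enriched with weather, traffic and time-of-day context before it ever reaches a rating model.

```python
def contextualize(events: list[dict],
                  weather_api,
                  traffic_api,
                  opted_out_trip_ids: set) -> list[dict]:
    """Drop opted-out trips, then attach context to each remaining telematics event."""
    enriched = []
    for e in events:
        # Respect the human override: opted-out trips never enter the rating model.
        if e["trip_id"] in opted_out_trip_ids:
            continue
        # weather_api and traffic_api stand in for whatever context sources an
        # insurer actually licenses; the method names here are placeholders.
        e["weather"] = weather_api.conditions_at(e["lat"], e["lon"], e["timestamp"])
        e["traffic_density"] = traffic_api.density_near(e["lat"], e["lon"], e["timestamp"])
        # Assumes timestamp is a datetime object.
        e["is_night"] = e["timestamp"].hour >= 22 or e["timestamp"].hour < 5
        enriched.append(e)
    return enriched
```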

Companies with closed-loop systems cannot offer the human opt-out for exogenous events, nor can they contextualize that data. Therefore, if they listen only to their own data when setting insurance prices, erroneous observations can be counted against drivers, and that is not right.

Further, if only the loss event is considered in modeling, the safety and prevention events remain uncounted. Newer ways of enhancing driver personas by rewarding lower mileage, smooth driving during off-peak traffic times and travel on safer routes away from main roads are just starting to emerge.

Pricing auto insurance using telematics and other observational data is not just about phantom violent deceleration or harsh braking; rather, it’s about keeping the need for the right data, for the right purpose, at the right time front and center as the lines blur between robots and people being at the controls. The key right now, as we manage the transition to more data-driven pricing models, is to allow for data flexibility, both in the ability for humans to override or mediate the data being collected and in the continued augmentation of risk models with additional contextual data that provides a more complete picture.