Can You Have Too Much Data?

by Taylor Milner and Molly Tracy

In McKinsey’s recent article “Pushing manufacturing productivity to the max,” Robert Feldman et al. suggest that the manufacturer’s dream should be a continuous, real-time data stream. They paint a picture of operators referencing data dashboards that flash alarms when a metric is out of spec, with “profit per hour” at the forefront of these metrics. If you asked McKinsey’s Feldman whether there is ever a limit to how much data is helpful, it seems he would say no. But is there such a limit? We would argue that yes, it is indeed possible to have “too much data.”

Below are three signs that data is hindering or preventing progress in your organization. Then, we discuss four ways you can prepare for the inevitable increase of data that every organization faces.

1. Data is not the same as information

Have you ever been in a meeting where you are presented with graph after graph but aren’t quite sure why? Or maybe you’ve tried analyzing data to find a pattern, only to find yourself going in circles.

Data is good, but it is not very useful until it is turned into information. Information is what we make decisions on. It can be quite hard to turn data into good information. How do we understand if a pattern is causal or only correlated? How do we filter out bad data points, outliers, and other non-representative data?  

One sign that you have too much data is when you confuse data for information. You try to find more in the data than is actually there.

We once worked with a team that was focused on reducing raw material waste in their facility. A meeting was called to begin the problem-solving process. A member of the CI team presented the data by SKU, by day, by shift, by line, by raw material type, and on and on. As the meeting went on, it became clear that the problem was expected to be solved right there in the meeting. The hope was that by looking at the data from as many angles as possible, a pattern would emerge and a root cause solution would be found.

In this case, there was no pattern in the data that could be recognized and translated into a solution. In fact, even once the top waste causes were understood by physically standing on the manufacturing floor and studying the loss points, nuances were found in how the waste was created that would be very difficult to spot in any data set. It was these nuances that, once understood, led the team to a root cause and solution. More data would not have helped here; it would only have created the false appearance of having more and more information.

This leads us to our next point:

2. Data only tells you that there’s a problem, not what to do with it

One of the first manufacturing lines we worked on had an early version of an automated line data collection system. This was a huge step up from the line’s only data being collected by a case counter at the very end, as it fed real-time data about the problems that were occurring on the line.

The reports from the line’s throughput bottleneck showed that the most frequent problem was a “door open fault.” In attempting to solve this problem, the operations manager had directed his maintenance team to change door handles, hinges, switches and anything else that might lead to the fault. Unfortunately, all of this work had no impact on the occurrence of the problem; it stubbornly remained at the top of the list.

That was, until one of the operators commented, “There is nothing wrong with the doors. We get that fault when I open the door to prevent a jam from occurring inside the machine. If it jams, it’s a hassle to clean out.”

Real-time data and automated fault reporting were a great advancement, but they could not be followed blindly. Alarms, automated reporting, and adherence to metric notifications are simply not enough to solve problems. A logical and rigorous problem-solving method is still required. As one of our colleagues once wrote, “Data tells you where to stand, not what to fix.”

If a small amount of data is enough to get the problem-solving process started properly, then piling on more will likely lead back to trying to find more in the data than is really there.

3. A data overload will desensitize you

Making fact-based decisions is far better than jumping to conclusions, and you need data to do this. That being said, you know you have too much data when simply handling the incoming data keeps your people from taking action.

McKinsey’s piece on increasing real-time data streams shows a visual of an operator’s dashboard, with at least six different metrics shown. Three of the six are marked as not adhering to the metric. In that instance, how do you choose which problem to address first?

We once worked on an engagement to reduce the number of alarms (sensors detecting out-of-spec situations) at an energy facility. The sheer number of alarms was greater than the operations and maintenance teams at the facility could handle. As a result, these teams spent most of their time resetting the alarms rather than fixing the problems that were causing them.

Most of the time everything was okay, but occasionally a critical alarm was missed, which led to even greater issues for the facility. To remedy this situation, we did two things: we solved some of the problems that were causing the most frequent alarms, and we increased the tolerance for alarms that posed little or no risk to the facility. This allowed the critical alarms to shine through and gave the operations and maintenance teams the ability to tackle the issues before they escalated.
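For readers who keep their alarm history in a log, the two-part triage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used at the facility: the alarm names, the "low"/"high" risk labels, and the `triage` function are all hypothetical, and in practice the risk rating would come from the people who know the equipment.

```python
from collections import Counter

# Hypothetical alarm log: (alarm name, risk label assigned by the site team).
ALARM_LOG = [
    ("pump_vibration", "high"),
    ("door_open", "low"),
    ("door_open", "low"),
    ("door_open", "low"),
    ("tank_level", "low"),
    ("pump_vibration", "high"),
]


def triage(events, top_n=2):
    """Return (alarms to fix first, alarms whose tolerance could be reviewed)."""
    counts = Counter(name for name, _ in events)
    risk = {name: label for name, label in events}

    # Step 1: the most frequent alarms are candidates for a root-cause fix.
    fix_first = [name for name, _ in counts.most_common(top_n)]

    # Step 2: remaining low-risk alarms are candidates for a wider tolerance.
    review_tolerance = sorted(
        name for name, label in risk.items()
        if label == "low" and name not in fix_first
    )
    return fix_first, review_tolerance


fix_first, review_tolerance = triage(ALARM_LOG)
print("Fix first:", fix_first)
print("Review tolerance:", review_tolerance)
```

A ranking like this only tells you where to stand. Deciding what to fix, and whether a tolerance can safely be widened, still has to happen on the floor.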

Despite all of this, the amount of data in industry is not going to decrease. So how do you set your organization up to be successful with ever-increasing data?

The real-time data stream that McKinsey’s article describes lies in the future, and it may be a long way off for many facilities and organizations. What good data habits can your organization set in the interim so that increasing amounts of data don’t get in your way?

  1. Think before you start the analysis - Decide what information you actually need from your data before diving into the analysis. This will help you to stop the analysis when you have what you need, rather than diving into more rabbit holes.

  2. Don’t take alarms at face value when working to get to the root of the problem. Hiding behind data dashboards can disconnect you from the physical machines, so get out on the floor! Smell the problem to really understand what’s going on.

  3. Distinguish data from information - data is not “all-knowing.” Be on the lookout for common data blunders such as misinterpretation, treating correlation as causation, seeing patterns where they don’t exist, or assuming there is no solution because the data doesn’t show you one.

  4. Prioritization is about what you don’t do - choose which alarms to fix first. Pick one thing to tackle at a time. Alarms may be going off everywhere, but what is the most urgent or most frequent one? What’s the most problematic?

