Data Collection: Is Too Much Data a Bad Thing?

A few decades ago, food and beverage manufacturers had one strategy for data collection: paper and pencil. Plant floor operators spent too much of their day jotting down measurements from weight scales, gauges, HMIs, and other equipment. Despite operators’ data collection efforts, paper and pencil systems were too rudimentary to provide the information that organizations needed to improve production processes and overall quality.

But now, with breakthroughs in industrial automation, servers, databases, and other IT, manufacturers no longer have to rely on paper and pencil systems. They can fully automate data collection and capture extensive amounts of data from every production line—every few milliseconds, around the clock.

However, there is a downside to all of this automation and ease. Since food and beverage manufacturers were starved for data before, many now want to gather as much as they can. Unfortunately, too much data can be difficult to digest. Manufacturers are, in effect, suffering from “data gluttony,” where they gather massive amounts of data, but still lack the quality and manufacturing information they seek.

The Data Gluttony Problem

Data gluttony often results in huge expenses for organizations. After all, manufacturers have to store all of their data somewhere. If they collect both process-specific and product-specific data across each line, in multiple plants, every few seconds or milliseconds, they can fill up tons of hard drives in no time. With the added costs of databases, servers, security, and required IT support, it can all get very expensive, very fast—thereby defeating the cost-reduction focus of modern quality control strategies.
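To make the storage arithmetic concrete, here is a rough back-of-the-envelope sketch. The sensor counts, line counts, and the 16 bytes per reading are hypothetical assumptions chosen only for illustration, not figures from any real plant.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def readings_per_year(interval_ms, sensors, lines, plants, days=365):
    """Count readings logged per year at a fixed collection interval."""
    per_sensor_per_day = MS_PER_DAY // interval_ms
    return per_sensor_per_day * sensors * lines * plants * days


def storage_terabytes(readings, bytes_per_reading=16):
    """Rough raw storage; ignores indexes, backups, and compression.

    The 16-byte default is an assumed size for a timestamp plus a value.
    """
    return readings * bytes_per_reading / 1e12


# Hypothetical footprint: 20 sensors per line, 10 lines per plant, 3 plants.
fast = readings_per_year(10, sensors=20, lines=10, plants=3)      # every 10 ms
slow = readings_per_year(60_000, sensors=20, lines=10, plants=3)  # every minute

print(f"every 10 ms:  {fast:,} readings, ~{storage_terabytes(fast):.2f} TB")
print(f"every minute: {slow:,} readings, ~{storage_terabytes(slow):.4f} TB")
```

Under these assumptions, the collection interval alone changes the yearly volume by a factor of several thousand, before any database or support costs are counted.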

Data gluttony also hinders process improvement efforts, as food and beverage manufacturers can easily feel overwhelmed by the sheer volume of data at their fingertips. Querying millions of data values from a database can prove challenging, if not impossible. Even if massive amounts of data could be retrieved, what tools could conveniently analyze millions of values? Just imagine copying a few million data values into a spreadsheet. How would you analyze it all? By themselves, massive datasets make it difficult to figure out what is truly driving quality and where to make improvements, such as how to cut waste and giveaway. It’s like trying to find a needle in a haystack.

Breaking Through the Noise

To overcome data gluttony and find clarity in the noise, food and beverage manufacturers need to recognize that not everything needs to be measured. They have to stop collecting as much as they can and instead take the time to identify what data really matters. What purpose do these data serve? Why do we need to gather these data? How will these data show us how to improve our quality and operations?

Data sampling is equally important. Some organizations think if they do not capture every possible measurement in production, they will somehow “miss out.” For instance, a food producer might say, “We want to collect cooking temperatures every few milliseconds for food safety monitoring.” But temperatures will not significantly change in a few seconds, let alone a few milliseconds. The producer would just end up with a mountain of numbers that adds nothing to what is already known. It is better to form rational sampling plans, with reasonable data collection frequencies, focusing on how much data are needed, not how much are wanted.
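As one illustration of a fixed sampling frequency, the sketch below keeps a single reading per time window from a high-frequency stream. The function name, the 10-second window, and the simulated temperatures are all hypothetical; a real sampling plan should be based on how quickly the process can actually change and on any applicable food safety requirements.

```python
def sample_at_interval(readings, interval_ms):
    """Keep the first reading in each interval_ms-wide time window.

    `readings` is an iterable of (timestamp_ms, value) pairs in time order.
    """
    kept = []
    last_window = None
    for timestamp_ms, value in readings:
        window = timestamp_ms // interval_ms
        if window != last_window:
            kept.append((timestamp_ms, value))
            last_window = window
    return kept


# Simulated stream (made-up values): one reading every 10 ms for 60 seconds,
# with the temperature drifting slowly upward.
stream = [(i * 10, 74.0 + 0.001 * i) for i in range(6000)]

sampled = sample_at_interval(stream, interval_ms=10_000)
print(f"{len(stream)} raw readings reduced to {len(sampled)} sampled readings")
```

In this simulated case the sampled series still shows the slow drift, with a small fraction of the stored values.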

Interrogating the Data

In truth, what manufacturers need is not data, but information. This is where statistical techniques and data analysis come into play. Only by regularly aggregating, summarizing, and analyzing their data can manufacturers uncover the actionable intelligence needed to make the right process improvement decisions. Without strategically planning to perform such analyses, data collection efforts can prove meaningless—and expensive.

The most successful organizations are ones that take a step back and analyze their data on a frequent, regular basis. They schedule time to interrogate their datasets using a variety of statistical tools and techniques—all in order to uncover new insights into how they can improve operations. Sometimes the most valuable information can come from innocuous datasets or even “in-spec” data.

For example, one beverage company that I worked with thought that, since everything was in spec, there were no opportunities for improving fill levels. After convincing them to gather fill volume data, we confirmed that no bottles fell outside the specification limits. While that was good news, we also found that fill levels varied widely within those limits and that, on average, bottles contained significantly more product than required. Data analysis revealed operational differences between shifts, and big inconsistencies between fill heads and bottle types. Using these insights, the company made a variety of improvements, resulting in $1.1 million in annual savings on just one of their 20-plus production lines. Without interrogating the data, they would never have enjoyed these savings.
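A summary of this kind can be sketched in a few lines. The records, field names, and 500 ml target below are invented for illustration and are not the company's data or the actual analysis that was performed. The sketch simply groups in-spec fill volumes by a chosen factor and reports the mean, spread, and average overfill for each group.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by(records, key, target_ml):
    """Group fill volumes by `key` and summarize each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["fill_ml"])
    return {
        name: {
            "n": len(vals),
            "mean": mean(vals),
            "stdev": stdev(vals) if len(vals) > 1 else 0.0,
            # Average product given away relative to the target fill.
            "avg_overfill": mean(vals) - target_ml,
        }
        for name, vals in groups.items()
    }


# Hypothetical in-spec measurements (made-up numbers).
records = [
    {"shift": "A", "head": 1, "fill_ml": 503.0},
    {"shift": "A", "head": 2, "fill_ml": 505.0},
    {"shift": "B", "head": 1, "fill_ml": 509.0},
    {"shift": "B", "head": 2, "fill_ml": 511.0},
]

by_shift = summarize_by(records, "shift", target_ml=500.0)
by_head = summarize_by(records, "head", target_ml=500.0)

for shift, stats in sorted(by_shift.items()):
    print(f"shift {shift}: mean {stats['mean']:.1f} ml, "
          f"overfill {stats['avg_overfill']:.1f} ml")
```

Every bottle in this toy dataset could pass inspection, yet the grouped averages still show one shift giving away more product than the other.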

Ultimately, having data for the sake of having them does not lead to improvements. Extracting meaningful intelligence out of the data is what generates results, and organizations do not need every single measurement coming off their lines to do so. Instead, with intelligent sampling tactics, data collection can be both effective and efficient.

And now, thanks to the advent of software-as-a-service (or SaaS) technologies, there are even greater opportunities for improvement that extend beyond the four walls of a plant and across the entire enterprise. Using a centralized cloud-based repository, food and beverage manufacturers can easily consolidate quality data from multiple plants, regions, vendors, and even ingredient suppliers. Organizations can conduct the same regular data interrogation—but on a grander scale—and reveal greater opportunities to improve quality, reduce costs, and ensure standardization and consistency across the entire value chain, setting the stage for an exponential return on investments in quality.