
Analytics, Big Data and ... Hocus Pocus?

By Barry Schaeffer

Businesses that are scrambling to keep up with the quickly changing e-commerce world are turning to big data and analytics as important, if not primary, tools. Collect enough data and apply complex analytical methods to it, the story goes, and you will find the answers you need to understand today and plan for tomorrow.

We’ve given these tools catchy names. Big Data Analytics (BDA) has an authoritative ring — but the underlying disciplines haven’t changed in decades. Whatever we call it, analysis involves sampling what’s happening now and using statistical methods to derive trends that allow us to make changes to improve our results. If it doesn’t do that, it isn’t worth much.

In a BDA world, you grab every piece of data you can from your commerce, Web-based and otherwise, and then apply statistical techniques to it to tell you why your customers behave as they do and what they are likely to do if you change your approach.

What could be the problem?

Statistics Always 'Works'

Our fascination with all things digital, however, may be blinding us to the incredible subtlety of statistical analysis and prediction … a dangerous blindness if our commercial future is in play. Most of the literature on this subject comes from the vendors selling the tools and techniques to implement it. Do a search for big data analytics, for example, and vendor-related hits nearly always top the list.

The term itself, analytics, masks the complexity and difficulty of generating usable information based on collection and analysis of data, especially with an uncontrolled population as is usually the case in e-commerce.

Statistical analysis is a multi-layered discipline that goes far beyond the calculation process if it is to yield anything useful. What’s worse, statistical processes always appear to “work,” yielding results that aren’t easily recognized as invalid when in fact they may be. The commercial world needs a deeper understanding of what it calls analytics — which is no mean feat with a subject as arcane as statistics.
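To illustrate how a calculation can appear to “work” on data that contains no signal at all, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the 50 observations, and the 500 candidate "metrics" are assumptions made for illustration, and every series is pure random noise.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def strongest_noise_correlation(n_obs=50, n_metrics=500, seed=0):
    """Return the largest |correlation| between a random 'sales' series
    and many unrelated random 'metrics' (all hypothetical noise)."""
    rng = random.Random(seed)
    sales = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_metrics):
        # Each metric is generated independently of sales, so any
        # correlation found here is an accident of sampling.
        metric = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(sales, metric)))
    return best


print(round(strongest_noise_correlation(), 2))
```

With this many candidate metrics, the strongest correlation found is typically large enough to look like a finding, even though nothing in the data is related to anything else. The arithmetic runs without complaint either way, which is the point the paragraph above makes.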

Learning from Our Elders

W. Edwards Deming's work offers one promising path to understanding. Deming was a pioneer of modern business analysis and, before that, a renowned statistician. He categorized statistical analysis into two types: “enumerative” and “analytic.” Each is good for some things but not for others, and each has a different threshold of required knowledge and control going in, a different set of unknowns and a very different ability to accurately predict future behavior.

Enumerative analysis allows us to test characteristics and changes in sampling populations where usable answers can be generated by calculations against collected data alone. Studies of this type generally do not include sufficient information to enable prediction about the larger population. Enumerative analysis works particularly well when the population can be controlled and variations carefully introduced, as in scientific and biological experiments.

Analytic study attempts to identify and account for differences between the sample and the larger population so that test results can be used as predictive tools. This is orders of magnitude more complex and difficult than its enumerative cousin.

With big data collection and analytics, we need the latter. But we often don’t have sufficient control of the samples on which we base our calculations. This opens us to a range of errors that can make the results not only useless but damaging if we base decisions on them. Unfortunately, making the sample larger or increasing the amount of applied computer power, even massively so as in big data applications, won’t do much to improve our chances.
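The claim that a larger sample does not rescue a poorly controlled one can be sketched with a small simulation. The model below is an assumption made purely for illustration, not anything taken from Deming: each person has a "purchase propensity" spread uniformly between 0 and 1, and people with a higher propensity are more likely to end up in the collected data.

```python
import random

TRUE_MEAN = 0.5  # mean of a uniform [0, 1] population


def simulate_biased_estimate(sample_size, seed=0):
    """Estimate the population's mean propensity from a self-selected
    sample (hypothetical model, for illustration only)."""
    rng = random.Random(seed)
    total = 0.0
    kept = 0
    while kept < sample_size:
        propensity = rng.random()  # this person's true value
        # Assumption: the chance of appearing in our data equals the
        # propensity itself, so eager buyers are over-represented.
        if rng.random() < propensity:
            total += propensity
            kept += 1
    return total / kept


for n in (100, 10_000, 100_000):
    print(n, round(simulate_biased_estimate(n), 3))
```

Under these assumptions the estimate settles near 2/3 rather than the true 0.5 as the sample grows. More data makes the answer more stable, but it is the wrong answer, and nothing in the collected data alone reveals that.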

Deming and other researchers point out that the complexities of analytic statistical study are so extensive that even many advanced statistics courses gloss over them, focusing on the process rather than the results. Perhaps this explains why the big data world tends to compress the entire process under the name “analytics,” engaging in at least as much glossing over (p. 15).

Calculation Does Not Equal Knowledge

Deming and those who have followed him point out that analyzing data is only one part of prediction. As the population and behavior being studied become more complex and less amenable to experimental control (a good description of the e-commerce market), knowledge about the population and the reasons for its behavior grows in importance.

In the big data and analytics process, no matter how much data is collected, information about the individuals in the sample will be left out of the calculation process. This makes the entire effort suspect and can introduce significant error into the results.

Credit: Cmswire.com
