@IOUG pointed to the skills gaps inherent in Big Data that surface when the question is asked: “How ready are you for big data?”

The skills gaps are listed as:

  1. Ingesting data in all shapes and sizes
  2. Imputing data when there is an incomplete picture
  3. Joining data where there are few obvious relationships
  4. Analyzing data through statistical and algorithmic techniques
  5. Conveying data to offer better insights to human and/or machine decision-making
  6. Visualizing data so that there’s a story that others may understand.

These skills are possessed by PhDs (aka Data Scientists), but need to be possessed by others as well (who are designated Data Artisans).

I think the problem with the article is that it is written by a Big Data tool manufacturer. It is like a hammer-maker describing how to build a house.

The most important component that is missing is that of the business model: How does the business function? The real answer is that no one knows, but most people have a reasonable approximation.

My opinion is that Big Data allows experiments that test how well these approximations fit the way the business actually behaves. This is an ongoing process that requires both an understanding of the business and skill in statistical experimentation.
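As a rough sketch of what one such experiment might look like, here is a Pearson chi-square goodness-of-fit check in Python. Everything in it is an assumption made for illustration: the "model" (orders spread evenly across five weekdays), the order counts, and the names are all made up. The 5% critical value for 4 degrees of freedom (9.488) is a standard table value.

```python
# Hypothetical example: does an assumed model of the business fit what was observed?

def chi_square_statistic(observed, expected):
    """Pearson's chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumed approximation of the business: orders arrive evenly, Monday to Friday.
observed_orders = [90, 110, 95, 105, 100]  # made-up counts, one per weekday
total = sum(observed_orders)
expected_orders = [total / len(observed_orders)] * len(observed_orders)

statistic = chi_square_statistic(observed_orders, expected_orders)

# Chi-square critical value for 5 categories (4 degrees of freedom) at the 5% level.
CRITICAL_VALUE_DF4_5PCT = 9.488

# If the statistic exceeds the critical value, the data contradict the assumed model.
model_rejected = statistic > CRITICAL_VALUE_DF4_5PCT
print(f"chi-square = {statistic:.2f}, reject model: {model_rejected}")
```

With these made-up numbers the statistic stays well below the critical value, so the "even spread" approximation survives this round. The ongoing part is repeating the test as new data arrives and as the approximation is refined.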

Another issue that Big Data advocates seem to be overlooking is contaminated data: data that is misleading or false because of censorship, libel laws, or planted propaganda. Any Big Data collected is going to be biased because of these factors. How big is the bias? I have no idea.

In other words, I think Big Data relies heavily on true freedom of speech: it needs to capture what people really feel about something, not the politically safe thoughts they think and utter instead. The business, too, needs to be reasonably open to analysis if there is to be a good approximation of how it functions, rather than people saying one thing and doing another: “The chasm between what is said and what is done.”

