Big Data Blog

Big Data, Predictive Analytics, and the Ideal Chronicler

Nov 26

11/26/2012 1:41 PM

I recently finished reading the book Everything Is Obvious: How Common Sense Fails Us by Duncan Watts, where I learned about the Ideal Chronicler, a hypothetical being conceived of by the philosopher Arthur Danto.  Since the beginning of time, the Ideal Chronicler has observed every single person, object, action, thought, and intention, and has the power to synthesize all of that historical information with real-time data and make predictions about what might happen next. 

Sounds a lot like big data and predictive analytics, doesn’t it?

Although the Ideal Chronicler knows everything that is happening now, as well as everything that has led up to now, and can make inferences about how all the events it knows about might fit together, it can’t foresee the future in the sense of understanding the significance of what is happening now — because this requires the perspectives and hindsight from further into the future.

For example, when Isaac Newton published his masterpiece Philosophiæ Naturalis Principia Mathematica in 1687, the Ideal Chronicler might have been able to say it was a major contribution to celestial mechanics, and even predict that it would revolutionize science.  But the Ideal Chronicler could not claim that Newton was laying the foundation for what eventually became modern science, or was playing a key role in what eventually was referred to as the Enlightenment.

Danto’s point was that historical explanations do not reproduce the events of the past, but instead explain why they mattered.  However, the only way to know what mattered, and why, is to be able to see what happened as a result — information that, by definition, not even the impossibly talented Ideal Chronicler possesses.

Although I am an advocate for the potential of big data and predictive analytics, some discussions about the technology and techniques behind them overhype the predictive power of real-time analytics.  The ability to analyze massive volumes of historical data and synthesize it with fast-moving real-time data in a variety of formats from a multitude of sources doesn’t change the fact that it often takes more time to determine the consequences of what just happened.

“Choices that seem insignificant at the time we make them,” Watts explained, “may one day turn out to be of immense import.  And choices that seem incredibly important to us now may later seem to have been of little consequence.  We just won’t know until we know.  In much of life, the very notion of a well-defined outcome at which point we can evaluate, once and for all, the consequences of an action is a convenient fiction.  In reality, the events that we label as outcomes are never really endpoints.  Instead, they are artificially imposed milestones, just as the ending of a movie is really an artificial end to what in reality would be an ongoing story.  Something always happens afterward, and what happens afterward is liable to change our perception of the current outcome, as well as our perception of the outcomes that we have already explained.  It’s actually quite remarkable in a way that we are able to completely rewrite our previous explanations without experiencing any discomfort about the one we are currently articulating.”

Even when predictive analytics enables real-time business decisions that produce a near-term positive result, for example, triggering the purchase of shares of a company just before its stock price soars, that outcome is not an endpoint.  Over time, the investment could produce a long-term negative result if the company’s stock tanks, at which point, in hindsight, the original decision to invest would seem like an obvious mistake, even though it didn’t seem that way at the time.

Like the Ideal Chronicler, big data and predictive analytics can help us make predictions about the future that are based on data-driven facts, not intuition-driven fictions.  However, even the best predictions of data science are a convenient fiction.  Until the future becomes history, we will not know whether our predictions were true, and, more troubling, we may not remember how confident we were in the predictions that time proved false.
