Big Data Blog

Does Machine Learning + Predictive Analytics = Asimov’s Psychohistory?

Nov 19

11/19/2012 3:32 PM

Last week, I had an interesting conversation with James Urquhart of GigaOm and enStratus, with a comment or two from Reuven Cohen of Forbes and Virtustream. The conversation centered on the idea that the current and planned paths of big data predictive analytics look more and more like psychohistory, the imaginary science invented by classic science fiction author Isaac Asimov in his Foundation novels.

For those of you who aren’t familiar with the novels, the basic concept was invented by a character named Hari Seldon. He discovered that while the actions of individual humans were virtually impossible to predict, the actions of massive numbers of humans as groups were completely and accurately predictable. Given enough data, Hari Seldon could predict the future course of human history in a galactic empire. In the Foundation series of books, this worked. Hari Seldon accurately predicted the course of human history for centuries after his death, until a single, remarkable individual with extraordinary gifts altered the course of life for billions, wreaking havoc with Seldon’s predictions.

Here’s the Twitter conversation, condensed a little and with misspellings and such corrected:

James Urquhart: I've been thinking lately about how big data and very large scale machine learning are starting to look Asimov-esque…

Reuven Cohen: It’s funny you say that. I've been thinking along the same lines.

Paige Roberts: Big data predictive analytics = psychohistory? In the direction it's going, give it a few years, could be.

James Urquhart: Yeah. A presentation I saw by a member of the machine learning group at Google opened my eyes to the evolution. For example, he noted that volume of data mattered more than algorithms/rules themselves. More data, better decisions. That seems to indicate that, as we grow our data storage/processing scale, our machines will make better predictions. And that would be whether we "programmed" or even seeded the right algorithms or not. o_O

Paige Roberts: The direction it takes is largely up to us, but we're already trying to predict human behavior on a smaller scale. Volume isn't the only key, though. People still have to ask the right questions. Hari Seldon was a mathematician.

Every day, I encounter something else that makes me feel like I'm living in the sci fi novels I read as a kid.

James Urquhart: RT @RobertsPaige: Every day, I encounter something else that makes me feel like I'm living in the sci fi novels I read as a kid. <+100

I can see what James was getting at. Since we are essentially using data to teach machines to find patterns in order to make predictions, the questions we ask, and the precision of the algorithms we use to ask them, become less important than how much data we provide. When the data set is sufficiently large, the machine may be able to figure out the patterns even if we don’t have it analyzing exactly the right parameters up front. But that concept, freaky though it is, wasn’t what put my brain into overdrive.
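To make the "more data, better decisions" idea a bit more concrete, here is a toy sketch in Python. It is not what the Google presenter showed; it is only my own illustration, and the 30% action rate, the sample sizes, and the function name are all made-up assumptions. The "algorithm" is about as naive as it gets (count how often something happened), yet its estimate tends to get closer to the truth as the amount of data grows.

```python
import random


def estimate_rate(n_people, true_rate, rng):
    """Return the fraction of n_people simulated individuals who take the action."""
    hits = sum(1 for _ in range(n_people) if rng.random() < true_rate)
    return hits / n_people


TRUE_RATE = 0.3           # hypothetical: 30% of people take some action
rng = random.Random(42)   # fixed seed so repeated runs give the same numbers

errors = {}
for n in (10, 1_000, 100_000):
    # Same naive estimator every time; only the volume of data changes.
    errors[n] = abs(estimate_rate(n, TRUE_RATE, rng) - TRUE_RATE)
    print(f"{n:>7} people: estimate off by {errors[n]:.4f}")
```

With only ten people the estimate can be badly off; with a hundred thousand it usually lands within a fraction of a percentage point, without the estimator getting any smarter.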

It’s the concept of analyzing data to accurately predict human behavior that seems particularly timely, and a little creepy. Without venturing into the realm of science fiction, I find a bunch of examples in daily life where this is already happening. I don’t have to wait a thousand years for a galactic empire to form for machine learning to predict the course of my life. But I think Asimov, and his imaginary counterpart, Hari Seldon, had it completely wrong in one way: Individual human behavior is highly predictable.

Given analysis of the actions of millions of humans under certain circumstances over a period of time, we can now predict which individuals are about to take a certain action, or which action a particular individual will take under certain conditions. This leads me to think we’re heading for more of a “Minority Report” kind of future than “Foundation and Empire,” but with machines in place of precognitive humans.

Current predictive analytics aren’t trying to figure out the rise and fall of human empires, but they are predicting who will vote for whom in the US presidential election. Many people believe that Obama’s victory was due in no small part to precisely targeted individual messaging. Analytics were used not only to figure out which individuals might be on the fence politically, but also to tailor and target ad messaging that would appeal to those individuals. The course of the United States, and therefore the course of world history and the future of the human race, has been influenced, at least to some extent, by the predictive analytics used by Obama’s campaign team. Hari Seldon would be impressed.

But big data predictive analytics aren’t just being used to predict huge, world-changing events like a presidential election. This awesomely powerful technology of machine learning is currently being used to predict which movie I am most likely to want to download and watch Saturday night. I don’t mean that predictive analytics are used to predict which movie the average woman in the US between the ages of 45 and 50 will want to watch; I mean me, specifically. Based on the fact that I liked “Shaun of the Dead,” predictive analytics says I will also like “Dead Heads.” Based on several previous actions that have taught the machine what I like, analytics predict that I will enjoy “The Avengers.” No human at Netflix has a clue who I am or what movies I watch, but their machines have learned my likely preferences and behaviors spot on.
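For the curious, here is a minimal sketch of how a "people who liked this also liked that" prediction could work. I have no idea what Netflix actually runs, so treat this as an assumption-laden toy: the viewers, their viewing histories, and the function names are all hypothetical, and the technique (simple item-to-item co-occurrence counting) is just one of the simplest ways to get this kind of behavior.

```python
from collections import Counter
from itertools import permutations

# Hypothetical viewing histories; the viewers and their likes are invented.
liked = {
    "viewer_a": {"Shaun of the Dead", "Dead Heads", "Zombieland"},
    "viewer_b": {"Shaun of the Dead", "Dead Heads"},
    "viewer_c": {"Shaun of the Dead", "The Avengers"},
    "viewer_d": {"The Avengers", "Iron Man"},
}


def build_cooccurrence(histories):
    """Count, for each title, how often every other title was liked by the same viewer."""
    counts = {}
    for titles in histories.values():
        for a, b in permutations(titles, 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(my_likes, counts, top_n=3):
    """Rank titles I have not liked by how often they co-occur with titles I have."""
    scores = Counter()
    for title in my_likes:
        for other, n in counts.get(title, {}).items():
            if other not in my_likes:
                scores[other] += n
    return [title for title, _ in scores.most_common(top_n)]


counts = build_cooccurrence(liked)
# "Dead Heads" ranks first: two of the three Shaun fans also liked it.
print(recommend({"Shaun of the Dead"}, counts))
```

Nobody told the code that these are zombie comedies. It only sees who liked what, which is the point: the pattern comes from other people's behavior, and then gets applied to me.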

Analytics predict the ideal moment and location to send me an ad to influence me to buy a new iPad. Machine learning plus predictive analytics determines how likely I am to decide to switch cable or cell phone providers this week, and what the provider can do to convince me not to. They figure out if I downloaded illegal music or cheated on my taxes, or whether the jet ski I purchased in Galveston yesterday was actually bought by me or by someone who stole my credit card number. They decide if I’m likely to default on a loan before I even apply for it.

It may not be psychohistory, but if that isn’t machines predicting human behavior, I don’t know what is. What do you think?
