Introduction to (part of) IBM Watson

Recently, I joined the IBM Watson beta program (you can join too here) to see what it had to offer. It looks like IBM is using the “Watson” name to cover a broad array of analytical and machine learning capabilities. One use of Watson is doing statistical analysis without needing to know any programming or statistics. For example, I went into their portal and uploaded a new dataset that I just got from the Town of Cary regarding traffic stops:

image image

image

I then hit the “New Exploration” button just to see what would happen and voila, I have graphs!

image 

image

image

 

So this is kind of interesting: they seem to use both model sweeping and parameter sweeping and then use natural language questions to explore the dataset. This is quite impressive, as it allows someone who knows nothing about statistics to ask questions and get answers. I am not sure if there is a way to drill down into the models to tweak the questions, nor does there look to be a way to consume the results. Instead, it looks like a management dashboard. So it is a bit like when you view the results of a dataset, except they have taken it to the nth degree.
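I have no idea what Watson actually runs under the hood, but to make "model sweeping" and "parameter sweeping" concrete, here is a minimal Python sketch of the general idea: try every candidate model with every candidate parameter value and rank them by how well they score on held-out rows. The toy data, the two models (`majority_model`, `knn_model`) and the `sweep` helper are all made up for illustration; none of it comes from Watson or from the traffic stop dataset.

```python
from collections import Counter

def majority_model(train, _param):
    """Baseline model: always predict the most common label in the training rows."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def knn_model(train, k):
    """k-nearest-neighbors on a single numeric feature; k is the swept parameter."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict

def accuracy(predict, rows):
    """Fraction of rows whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def sweep(train, holdout, candidates):
    """Score every (model, parameter) pair on the holdout rows, best first."""
    results = []
    for name, (builder, params) in candidates.items():   # model sweep
        for p in params:                                  # parameter sweep
            results.append((accuracy(builder(train, p), holdout), name, p))
    return sorted(results, key=lambda r: r[0], reverse=True)

# Invented toy data: (feature, label) pairs.
train = [(1, "a"), (2, "a"), (3, "a"), (7, "b"), (8, "b"), (9, "b"), (10, "b")]
holdout = [(2.5, "a"), (7.5, "b"), (1.5, "a")]

candidates = {
    "baseline": (majority_model, [None]),
    "knn": (knn_model, [1, 3, 5]),
}

results = sweep(train, holdout, candidates)
for score, name, param in results:
    print(f"{name:<8} param={param!s:<4} accuracy={score:.2f}")
```

A tool like Watson presumably does something far more elaborate, but the shape is the same: two nested loops and a leaderboard, which the user then never has to see.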

I then went back and hit the “Create a Prediction” button:

image

I picked a random y variable (“disposition”) with the default values and voila, graphs:

image

Interestingly, it does some sweeping and it picked up that the PrimaryKey is correlated with date – which would make sense since the date is part of the PK value 🙂

image
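To see why a key that embeds the date would light up as correlated with the date, here is a small Python sketch. The key format is an assumption on my part (a `YYYYMMDD` date followed by a three-digit sequence number); I have not checked how the Town of Cary actually builds its PrimaryKey, and the dates below are invented.

```python
from datetime import date, timedelta
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented dates, ten days apart, starting from an arbitrary day.
start = date(2015, 1, 1)
dates = [start + timedelta(days=i) for i in range(0, 300, 10)]

# Hypothetical key: the date as YYYYMMDD, then a 3-digit sequence number.
keys = [int(d.strftime("%Y%m%d")) * 1000 + (i * 37 % 1000)
        for i, d in enumerate(dates)]

# Turn each date into a plain day count so it can be correlated with the key.
ordinals = [d.toordinal() for d in dates]

r = pearson(keys, ordinals)
print(f"correlation between key and date: {r:.3f}")
```

Because the date digits dominate the size of the key, the two columns rise together almost in lockstep, so any automated sweep looking for related columns would flag the pair.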

In any event, I think this is a cool entry into the machine learning space from IBM. They really have done a good job in making data science accessible. Now, if they could put their weight behind “Open Data” so that there are lots of really cool datasets available to analyze, they would really position themselves well in an emerging market. I can’t wait to dig in even more with Watson…