Data mining in Excel 2007
Mr. Donald Farmer gave a lecture yesterday, May 31st 2010, at the Israeli BI User Group. This is what I summed up from the session.
Predictive Analytics
A good predictive analytics project would have:
Actionable - you get useful information from it
Innovative - gives something new, new insight from the model
Trustworthy - the model makes sense
Seamless - part of your everyday operations, so that it will get used more easily by the users
"All models are wrong, but some are useful" George Box
No model is perfect, you can't predict everything, but even then the model is useful. Don't expect to build a perfect model, just try to build a useful model.
How do you build a useful model?
Right problem
Right criteria - a realistic criteria for the business, small improvements over time, working gradually
Right data
Right results - results that meet your criteria
Right Delivery - giving people a model that users can understand
|
Traditional BI |
Predictive Analytics |
|
Exploration |
Discovery |
|
Drill down |
Classification |
|
Trending |
Perdiction |
|
Force constraints |
Discover outliners |
|
Apply rules and models |
Find Patterns and relationships |
Right problem
Cross-sell and up-sell (selling something additional or selling them something similar but more expensive market basket analysis)
Customer acquisition (getting new customers, the demographics of your current customers)
Customer retention (keeping your customer with you, pattern of leaving)
|
Scenario |
Tasks |
|
Cross-sell and up-sell |
Association |
|
Customer acquisition |
Clustering |
|
Customer retention |
Classification, estimation |
Data Mining Add in for Excel 2007
Analyze - the tab for beginning Data Mining
Turn the data into a table and then you can use the Analyze tab
Market basket analysis - what can I sell with the current product. Use profit per product to analyze for better profit.
Analyze key influencers - understanding what are the main drivers of a action (what makes people buy).
Detect categories - finding groupings within the groupings (no use to include ID). We can also define how many categories we'd like to get.
Fill from example - filling in missing data about your customer according to other carasteristic that we have about them.
Highlight exception - findings exceptions, how far away from the center is the data for that customer.