A week or two ago there was an article on www.softwareadvice.com called Predictive Analytics: Testing for Accuracy. The article features three very well-known data scientists/data miners/predictive modellers: Karl Rexer (a friend of mine from the BIWA world), Dean Abbott and John Elder.
People keep asking me what the best way is to test their data mining models, and most of them expect that they will have to do lots and lots of statistics. They are then confused when I say ‘Oh no you don’t’. All you need to do is follow the approaches that are detailed in this article. One thing the three experts have in common is that they keep the business problem in mind, along with what the results they obtain mean for that problem. The approaches they describe are:
- Lift charts and decile tables to compare performance against random results
- Target shuffling to determine validity of the results
- Bootstrap sampling to test the consistency of the model
OK, some statistics are used, but not too many!
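To give a feel for how little statistics is involved, here is a minimal Python sketch of the three checks. It is my own illustration, not code from the article: the data is synthetic, the function names (`fit_linear_scorer`, `decile_table`, `target_shuffle`, `bootstrap_lift`) are made up for this example, and the simple least-squares scorer is only a stand-in for whatever model you are actually testing.

```python
import numpy as np

def fit_linear_scorer(X, y):
    """Least-squares linear model; a stand-in for the model under test."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y.astype(float), rcond=None)
    return coef

def score(coef, X):
    """Score each row with the fitted coefficients."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def decile_table(y, scores, n_bins=10):
    """Rows of (decile, response rate, lift over the overall rate), best scores first."""
    order = np.argsort(-scores, kind="stable")
    overall = y.mean()
    bins = np.array_split(y[order].astype(float), n_bins)
    return [(i + 1, float(b.mean()), float(b.mean() / overall))
            for i, b in enumerate(bins)]

def top_decile_lift(X, y):
    """Fit on (X, y) and return the lift in the top-scored decile."""
    return decile_table(y, score(fit_linear_scorer(X, y), X))[0][2]

def target_shuffle(X, y, n_shuffles=200, seed=0):
    """Refit on shuffled targets; return the real lift and the share of
    shuffled runs that match or beat it."""
    rng = np.random.default_rng(seed)
    real = top_decile_lift(X, y)
    fake = np.array([top_decile_lift(X, rng.permutation(y))
                     for _ in range(n_shuffles)])
    return real, float(np.mean(fake >= real))

def bootstrap_lift(X, y, n_boot=200, seed=0):
    """Refit on rows resampled with replacement; return mean and std of the lift."""
    rng = np.random.default_rng(seed)
    lifts = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))
        lifts.append(top_decile_lift(X[idx], y[idx]))
    return float(np.mean(lifts)), float(np.std(lifts))

# Synthetic data (an assumption for illustration): 3 inputs, 2 of them predictive.
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 3))
p = 1 / (1 + np.exp(-(1.5 * X[:, 0] - 1.0 * X[:, 1] - 1.5)))
y = (rng.random(2000) < p).astype(int)

table = decile_table(y, score(fit_linear_scorer(X, y), X))
for decile, rate, lift in table:
    print(f"decile {decile:2d}  response rate {rate:.2f}  lift {lift:.2f}")

real_lift, share = target_shuffle(X, y)
print(f"top-decile lift {real_lift:.2f}, matched by {share:.1%} of shuffled runs")

boot_mean, boot_std = bootstrap_lift(X, y)
print(f"bootstrap top-decile lift {boot_mean:.2f} +/- {boot_std:.2f}")
```

A lift of 1.0 is what random selection would give you, so the decile table shows how much better than random the model ranks cases. If only a tiny share of shuffled runs reach the real lift, the result is unlikely to be a fluke, and a small bootstrap spread suggests the model behaves consistently across samples. Note that the lift here is measured on the same rows the model was fitted on, which keeps the sketch short; in practice you would score a holdout set.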
View highlights from the report below or read it in its entirety here. Alternatively, have a look at the article summary on SlideShare.