Methodology:

For every data set, the authors set aside 5000 instances for training and used the remaining instances as a single large test set. Using the 5000 training instances, the authors performed 5-fold cross-validation and calibrated the algorithms using Platt scaling and isotonic regression. For each algorithm, only the results obtained with its best parameter settings are reported. The different performance measures were also normalised to fall between 0 and 1 for uniformity.
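The split and calibration steps described above can be sketched roughly as follows. This is only an illustration, not the authors' code: the function names are hypothetical, isotonic regression is implemented with the standard pool-adjacent-violators procedure, and fitting the calibration map on the held-out fold of each trial is an assumption, since the summary above does not say which instances were used for calibration.

```python
import random
from bisect import bisect_left


def split_train_test(n_instances, n_train=5000, seed=0):
    """Shuffle instance indices; the first n_train train, the rest test."""
    rng = random.Random(seed)
    indices = list(range(n_instances))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


def five_fold_trials(train_indices, n_folds=5):
    """Return one (fit, held_out) pair of index lists per fold."""
    folds = [train_indices[i::n_folds] for i in range(n_folds)]
    trials = []
    for k in range(n_folds):
        fit = [i for j, fold in enumerate(folds) if j != k for i in fold]
        trials.append((fit, folds[k]))
    return trials


def pav_calibrate(scores, labels):
    """Isotonic regression by pool-adjacent-violators.

    Assumption: scores and labels come from the held-out fold.
    Returns (upper_score, probability) steps in increasing order.
    Tied scores get no special treatment in this sketch.
    """
    blocks = []  # each block holds [sum_of_labels, count, upper_score]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge while the previous block's mean is not below the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def apply_calibration(steps, score):
    """Map a raw score to the probability of the step that covers it."""
    uppers = [upper for upper, _ in steps]
    idx = min(bisect_left(uppers, score), len(steps) - 1)
    return steps[idx][1]
```

Platt scaling would replace `pav_calibrate` with a sigmoid fitted to the same held-out scores; it is left out here to keep the sketch short.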

Bootstrap Sampling:

The results obtained by the authors show that no single algorithm performs best: different algorithms perform better on different data sets. This raises the possibility that the choice of data sets influenced the outcome. To check this, the authors performed a bootstrap analysis. In each round, a sample of the data sets was drawn by bootstrap sampling (that is, sampling with replacement). For the chosen data sets, 8 metrics were then drawn, again by bootstrap sampling. The mean over these data sets and metrics was computed for each algorithm, and the algorithms were ranked from 1 to 10. This procedure was repeated 1000 times, and the results are given in the original paper.
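A minimal sketch of such a bootstrap ranking analysis is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name and the nested `scores[algorithm][data_set][metric]` layout are hypothetical, the scores are assumed to be already normalised to lie between 0 and 1, and both data sets and metrics are drawn with replacement in every round.

```python
import random


def bootstrap_rank_counts(scores, n_rounds=1000, seed=0):
    """Tally how often each algorithm lands at each rank.

    scores[algorithm][data_set][metric] is a normalised score
    (hypothetical layout). Returns counts[algorithm][r], the number
    of rounds in which the algorithm finished at rank r + 1.
    """
    rng = random.Random(seed)
    algorithms = sorted(scores)
    data_sets = sorted(scores[algorithms[0]])
    metrics = sorted(scores[algorithms[0]][data_sets[0]])
    counts = {a: [0] * len(algorithms) for a in algorithms}

    for _ in range(n_rounds):
        # Assumption: sample as many data sets and metrics as exist,
        # with replacement, so some may repeat and some may be left out.
        ds_sample = rng.choices(data_sets, k=len(data_sets))
        m_sample = rng.choices(metrics, k=len(metrics))
        n_cells = len(ds_sample) * len(m_sample)
        means = {
            a: sum(scores[a][d][m] for d in ds_sample for m in m_sample) / n_cells
            for a in algorithms
        }
        # Rank 1 goes to the highest mean score.
        ranking = sorted(algorithms, key=lambda a: means[a], reverse=True)
        for rank, a in enumerate(ranking):
            counts[a][rank] += 1
    return counts
```

Dividing each entry of `counts` by `n_rounds` gives the fraction of rounds in which an algorithm reached a given rank, which is one way to see how sensitive the ranking is to the choice of data sets and metrics.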

Results (as found by the authors):

In general, without calibration, bagged trees, neural nets and random forests perform best. Calibration with Platt scaling or isotonic regression improves the performance of most learning algorithms. The algorithms that do not benefit from calibration are neural nets and memory-based learning methods such as KNN. After calibration, the best algorithms were boosted trees, SVMs and random forests. The algorithms that performed poorly in general were naive Bayes, decision trees, logistic regression and KNN. However, there is no single best learning algorithm.



