Methodology:
For every data set, the authors set aside 5000 instances for training and
used the remaining instances as a single large test set. On the
5000 training instances, they performed 5-fold cross
validation and calibrated the algorithms using Platt Scaling and
Isotonic Regression. For each algorithm, only the results obtained with
its best parameter settings are reported. The different
performance measures were also normalised to fall between 0 and 1 for
uniformity.
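The data handling described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the data-set size of 20000, and the linear rescaling used for normalisation (mapping an assumed worst value to 0 and an assumed best value to 1) are all assumptions made for the example.

```python
import random

def split_train_test(n_instances, train_size=5000, seed=0):
    """Shuffle instance indices; the first train_size go to training, the rest to test."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return indices[:train_size], indices[train_size:]

def k_fold_splits(train_indices, k=5):
    """Yield (fit, held_out) index lists, one pair per cross-validation trial."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out

def normalise(score, worst, best):
    """Linearly rescale a raw metric value so that worst maps to 0 and best to 1.
    The choice of endpoints is an assumption of this sketch."""
    return (score - worst) / (best - worst)

# Hypothetical data set with 20000 instances.
train, test = split_train_test(n_instances=20000)
trials = list(k_fold_splits(train, k=5))
```

With 5000 training instances and 5 folds, each trial fits on 4000 instances and holds out 1000, which is where parameter selection and calibration could take place.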
Bootstrap Sampling:
The results obtained by the authors show that no single algorithm
performs best: different algorithms perform better on different
data sets. This raises the possibility that the choice of data sets
could have influenced the outcome. To check this, the authors
performed a bootstrap analysis.
In each round, a bootstrap sample of the data sets was drawn (sampling
with replacement). For the chosen data sets, a bootstrap sample of 8
metrics was drawn in the same way. The mean performance over the sampled
data sets and metrics was computed for each algorithm, and the algorithms
were ranked from 1 to 10. This procedure was
repeated 1000 times and the results are given in the original paper.
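The bootstrap procedure can be sketched as below. This is an illustrative reading of the description, not the authors' implementation: the `scores` layout, the function name and the toy numbers are hypothetical, and the toy example uses three algorithms, two data sets and two metrics instead of the ten algorithms and eight metrics of the study.

```python
import random
from collections import Counter

def bootstrap_rank_counts(scores, n_rounds=1000, seed=0):
    """scores[algorithm][dataset][metric] holds a normalised score in [0, 1].
    Returns rank_counts[algorithm][rank]: how many rounds ended at that rank."""
    rng = random.Random(seed)
    algorithms = sorted(scores)
    datasets = sorted(scores[algorithms[0]])
    metrics = sorted(scores[algorithms[0]][datasets[0]])
    rank_counts = {a: Counter() for a in algorithms}
    for _ in range(n_rounds):
        # Bootstrap samples: draw with replacement, same size as the original set.
        ds_sample = rng.choices(datasets, k=len(datasets))
        m_sample = rng.choices(metrics, k=len(metrics))
        mean_score = {
            a: sum(scores[a][d][m] for d in ds_sample for m in m_sample)
            / (len(ds_sample) * len(m_sample))
            for a in algorithms
        }
        # Rank 1 is the algorithm with the highest mean score in this round.
        ranked = sorted(algorithms, key=lambda a: mean_score[a], reverse=True)
        for rank, a in enumerate(ranked, start=1):
            rank_counts[a][rank] += 1
    return rank_counts

# Made-up scores, for illustration only.
toy_scores = {
    "algo_A": {"d1": {"m1": 0.9, "m2": 0.8}, "d2": {"m1": 0.7, "m2": 0.9}},
    "algo_B": {"d1": {"m1": 0.6, "m2": 0.5}, "d2": {"m1": 0.8, "m2": 0.6}},
    "algo_C": {"d1": {"m1": 0.3, "m2": 0.4}, "d2": {"m1": 0.2, "m2": 0.1}},
}
counts = bootstrap_rank_counts(toy_scores, n_rounds=1000)
```

An algorithm that lands at rank 1 in most rounds is robustly good across data sets and metrics, while one whose rank varies widely depends heavily on which data sets and metrics happen to be sampled.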
Results (as found by the authors):
In general, without calibration, bagged trees, neural nets and random forests
perform best. Calibration with Platt Scaling or Isotonic Regression
improves the performance of most learning algorithms. The algorithms that
do not benefit from calibration are neural nets and memory-based learning methods such as KNN. After calibration, the best algorithms were boosted trees, SVMs and random forests. The algorithms that performed poorly in general were Naive Bayes, decision trees, logistic regression and KNN. However, there is no single best learning algorithm.
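For readers unfamiliar with the two calibration methods, the sketch below shows the basic idea of each. It is a simplified illustration and not the procedure used in the paper: Platt Scaling is fitted here with plain gradient descent and without Platt's target smoothing, Isotonic Regression uses the pool-adjacent-violators algorithm, and the function names and toy scores are assumptions.

```python
import math

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = 1 / (1 + exp(a * s + b)) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # Gradient of the log loss with respect to a and b.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_prob(s, a, b):
    """Map a raw score s to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(a * s + b))

def fit_isotonic(scores, labels):
    """Pool adjacent violators: returns the sorted scores and a
    non-decreasing sequence of fitted probabilities."""
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum_of_labels, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [s for s, _ in pairs], fitted

# Made-up classifier scores and true labels, for illustration only.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
true_labels = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fit_platt(raw_scores, true_labels)
sorted_scores, iso_fitted = fit_isotonic(raw_scores, true_labels)
```

Platt Scaling assumes a sigmoid-shaped relationship between raw scores and probabilities, whereas Isotonic Regression only assumes that the relationship is monotonic, which makes it more flexible but more prone to overfitting when calibration data is scarce.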