Computer Science 8751
Machine Learning

Programming Assignment 2
Decision Trees (35 points)
Due Monday, October 16, 2006

Introduction

Decision trees are among the simplest machine learning methods, yet they are also very effective. In this assignment you are to implement the ID3 decision tree method, using information gain as your criterion for selecting the best feature. You should test your method on the data used in class (the coolcars dataset), your own personal dataset, and the promoters-936 dataset.
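As a rough illustration of the selection criterion, the sketch below computes entropy and information gain for discrete features. It is not the required design: the function names and the list-of-dicts data layout are assumptions made for this example, and your own code should work through the dataset class from assignment 1 instead.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(examples, labels, feature):
    """Reduction in entropy obtained by splitting on one discrete feature.

    examples is a list of dicts mapping feature name -> value
    (a hypothetical layout chosen only for this sketch).
    """
    total = len(labels)
    # Group the class labels by the value each example has for the feature.
    partitions = {}
    for example, label in zip(examples, labels):
        partitions.setdefault(example[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_feature(examples, labels, features):
    """Feature with the highest information gain (first one wins on ties)."""
    return max(features, key=lambda f: information_gain(examples, labels, f))
```

ID3 would call something like best_feature at each node, split the examples by the chosen feature's values, and recurse on each subset until the labels are pure or no features remain.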

Details

Your method should make use of the dataset class you created in assignment 1. You may assume that there are no missing feature values and no continuous features. Your code should print the resulting decision tree in a readable, nicely formatted form.
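One possible output layout is an indented listing with one line per branch. The sketch below is only an example of such a format. The tree representation (a leaf is a class-label string, an internal node is a tuple of a feature name and a dict of branches) and the name format_tree are assumptions for illustration, and the feature names in the usage example are made up.

```python
def format_tree(node, indent=0):
    """Return the lines of an indented text rendering of a decision tree.

    Assumed representation (hypothetical, for this sketch only):
      leaf          -> a class label string
      internal node -> (feature_name, {feature_value: subtree})
    """
    pad = "    " * indent
    if not isinstance(node, tuple):
        # A tree consisting of a single leaf.
        return [f"{pad}{node}"]
    feature, branches = node
    lines = []
    for value, subtree in branches.items():
        if isinstance(subtree, tuple):
            # Internal child: print the test, then the subtree one level deeper.
            lines.append(f"{pad}{feature} = {value}:")
            lines.extend(format_tree(subtree, indent + 1))
        else:
            # Leaf child: print the test and the predicted class on one line.
            lines.append(f"{pad}{feature} = {value}: {subtree}")
    return lines


tree = ("color", {"red": "cool", "blue": ("doors", {"2": "cool", "4": "dull"})})
print("\n".join(format_tree(tree)))
```

Any format is acceptable as long as a reader can follow each path from the root to a predicted class.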

What to Hand In

You should hand in a documented copy of your code (including your dataset class files). Also create an archive of the code and email it to rmaclin@gmail.com.

In addition, hand in the decision trees produced for your dataset, the coolcars dataset, and the promoters-936 dataset.

Extra Credit

(5 points) Extend your implementation to handle continuous features, using the method shown in class. You should test your solution on at least two datasets.
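This handout does not restate the method shown in class, so the sketch below shows only one common approach, which may differ from it: try a binary split at the midpoint between each pair of adjacent distinct values and keep the threshold with the highest information gain. The function names are assumptions for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best split of the form value <= threshold.

    Candidate thresholds are midpoints between adjacent distinct sorted values.
    Returns (None, 0.0) if no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    total = len(pairs)
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, total):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = (base
                - len(left) / total * entropy(left)
                - len(right) / total * entropy(right))
        if gain > best[1]:
            best = ((low + high) / 2, gain)
    return best
```

With a helper like this, a continuous feature can compete with discrete features at each node by comparing its best threshold's gain against their information gain.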