Some sample exam 1 questions:
1. Briefly define the following terms (a worked sketch of entropy and information gain follows the list):
Concept Learning
Continuous-Valued Attribute
Discrete-Valued Attribute
Inductive Learning
The Inductive Learning Hypothesis
Version Space
Inductive Bias
Noise
Decision Tree
Entropy
Information Gain
Gain Ratio (in decision trees)
Overfitting
Gradient Descent
Artificial Neural Network
Linear Threshold Unit
Sigmoid Unit
Perceptron
Multi-Layer Perceptron
Batch mode Gradient Descent
Incremental or Stochastic Gradient Descent
Input Unit
Hidden Unit
Output Unit
Margin in a Support Vector Machine
Support Vector
Slack Variables
Dual Representation of a Problem in SVMs
Kernel Function in SVMs
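
As a study aid for the Entropy and Information Gain terms above, here is a minimal sketch of how the two quantities are computed (the attribute and label values are made up for illustration):

import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels: -sum over classes of p * log2(p)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, label="label"):
    """Entropy of the whole set minus the weighted entropy of the subsets
    produced by splitting on `attribute`."""
    n = len(examples)
    total = entropy([ex[label] for ex in examples])
    remainder = 0.0
    for v in {ex[attribute] for ex in examples}:
        subset = [ex[label] for ex in examples if ex[attribute] == v]
        remainder += (len(subset) / n) * entropy(subset)
    return total - remainder

# Tiny made-up training set: 3 positives, 3 negatives.
data = [
    {"Outlook": "Sunny", "label": "+"}, {"Outlook": "Sunny", "label": "+"},
    {"Outlook": "Rain",  "label": "+"}, {"Outlook": "Rain",  "label": "-"},
    {"Outlook": "Rain",  "label": "-"}, {"Outlook": "Sunny", "label": "-"},
]
print(entropy([ex["label"] for ex in data]))   # 1.0
print(information_gain(data, "Outlook"))       # about 0.08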
2. Outline the four key questions that must be answered when designing a
machine learning algorithm. Give an example of an answer for each question.
3. Describe the following algorithms (an actual exam question would ask about
only one of them):
Find-S
List-Then-Eliminate (Version Space)
Candidate Elimination (Version Space)
ID3
Perceptron Training Algorithm (assuming linear artificial neurons)
Backpropagation (assuming sigmoidal artificial neurons)
4. For each of the algorithms above, show how it works on a specific problem
(examples of these may be found in the book or in the notes).
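
As a study aid for questions 3 and 4, here is a minimal sketch of Find-S on a toy conjunctive-concept problem (the attributes and values are made up; None means "no value allowed yet" and "?" means "any value"):

def find_s(examples):
    """Find-S: start with the most specific hypothesis and minimally
    generalize it to cover each positive example; negatives are ignored."""
    n_attrs = len(examples[0][0])
    h = [None] * n_attrs                  # most specific hypothesis
    for attrs, label in examples:
        if label != "+":
            continue                      # Find-S ignores negative examples
        for i, value in enumerate(attrs):
            if h[i] is None:
                h[i] = value              # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"                # disagreement: generalize to "any"
    return h

# Toy training set: (Sky, AirTemp, Humidity) -> label
examples = [
    (("Sunny", "Warm", "Normal"), "+"),
    (("Sunny", "Warm", "High"),   "+"),
    (("Rainy", "Cold", "High"),   "-"),
]
print(find_s(examples))   # ['Sunny', 'Warm', '?']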
5. Why is inductive bias important for a machine learning algorithm? Give
some examples of ML algorithms and their corresponding inductive biases.
6. How would you represent the following concepts in a decision tree (a sketch of the first one follows this list):
A OR B
A AND NOT B
(A AND B) OR (C OR NOT D)
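
As a study aid for question 6, here is one possible decision tree for the first concept, A OR B, written as nested attribute tests (the same idea extends to the other two formulas; this sketch is not the only correct tree):

def a_or_b(example):
    """A decision tree for A OR B: the root tests A, and only the
    A = False branch needs to go on to test B."""
    if example["A"]:           # A = True
        return True            # leaf: positive
    elif example["B"]:         # A = False, B = True
        return True            # leaf: positive
    else:                      # A = False, B = False
        return False           # leaf: negative

for a in (True, False):
    for b in (True, False):
        print(a, b, a_or_b({"A": a, "B": b}))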
7. What problem does reduced-error pruning address? How do we decide when
to prune a decision tree?
8. How do you translate a decision tree into a corresponding set of rules?
9. What mechanism was suggested in class for dealing with continuous-valued
attributes in a decision tree?
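
For question 9, the usual mechanism (as in C4.5) is to sort the examples on the continuous attribute and evaluate a binary split "value <= t" at each midpoint between adjacent distinct values, keeping the threshold with the highest information gain. A minimal sketch, with made-up temperatures and labels:

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try a split 'value <= t' at each midpoint between adjacent sorted
    values and keep the threshold with the highest information gain."""
    pairs = sorted(zip(values, labels))
    total = entropy(labels)
    best = (None, -1.0)
    for (v1, _), (v2, _) in zip(pairs, pairs[1:]):
        if v1 == v2:
            continue
        t = (v1 + v2) / 2.0
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        gain = total - (len(left) / len(pairs)) * entropy(left) \
                     - (len(right) / len(pairs)) * entropy(right)
        if gain > best[1]:
            best = (t, gain)
    return best

# Made-up temperature readings and class labels:
print(best_threshold([40, 48, 60, 72, 80, 90], ["-", "-", "+", "+", "+", "-"]))
# (54.0, about 0.46)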
10. What mechanism was suggested in class for dealing with missing attribute
values in a decision tree?
11. What types of concepts can be learned with a perceptron using linear units?
Give an example of a concept that could not be learned by this type of
artificial neural network.
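
As a study aid for question 11: a single perceptron can only represent linearly separable concepts. Below is a minimal sketch of the classic perceptron training rule learning OR (separable) and failing to fit XOR (not separable); the learning rate and epoch count are arbitrary choices:

def train_perceptron(data, epochs=20, eta=0.1):
    """Perceptron training rule: w <- w + eta * (t - o) * x, with a
    threshold output o = 1 if w . x + b > 0 else 0."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), t in data:
            o = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            w[0] += eta * (t - o) * x1
            w[1] += eta * (t - o) * x2
            b    += eta * (t - o)
    return w, b

def accuracy(data, w, b):
    return sum((1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == t
               for (x1, x2), t in data) / len(data)

OR  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(accuracy(OR,  *train_perceptron(OR)))    # 1.0   (linearly separable)
print(accuracy(XOR, *train_perceptron(XOR)))   # < 1.0 (not linearly separable)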
12. A multi-layer perceptron with sigmoid units can learn (using an
algorithm like backpropagation) concepts that cannot be learned by
artificial neural networks that lack hidden units or sigmoid activation
functions. Give an example of a concept that could be learned by such
a network and what the weights of a learned representation of this concept
might be.
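
As a study aid for question 12: XOR is the standard example. One possible set of hand-picked weights (made up; many other weight settings work) for a 2-2-1 network of sigmoid units that computes XOR to within rounding:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def xor_net(a, b):
    """2-2-1 network of sigmoid units with hand-picked weights.
    h1 approximates (a OR b), h2 approximates NOT (a AND b),
    and the output approximates (h1 AND h2), i.e. XOR."""
    h1 = sigmoid(20 * a + 20 * b - 10)        # ~ a OR b
    h2 = sigmoid(-20 * a - 20 * b + 30)       # ~ NOT (a AND b)
    return sigmoid(20 * h1 + 20 * h2 - 30)    # ~ h1 AND h2

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_net(a, b), 3))  # ~0, ~1, ~1, ~0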
13. An artificial neural network uses gradient descent to search for a local
minimum in weight space. How is a local minimum different from the global
minimum? Why might gradient descent fail to find the global minimum?
14. A concept is represented in C4.5 format with the following files. The
.names file is:
Class1,Class2,Class3. | Classes
FeatureA: continuous
FeatureB: BValue1, BValue2, BValue3, BValue4
FeatureC: continuous
FeatureD: Yes, No
The data file is as follows:
2.5,BValue2,100.0,No,Class2
1.1,BValue4,300.0,Yes,Class1
2.3,BValue3,150.0,No,Class3
1.4,BValue1,350.0,No,Class2
What input and output representation would you use to learn this problem
using an artificial neural network? Give the input and output vectors for
each of the data points shown above. What are the advantages and
disadvantages of your representation?
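
One reasonable answer to question 14 (a sketch, not the only defensible encoding): rescale the continuous attributes, one-hot (1-of-N) encode the discrete attributes, and use one output unit per class. The scaling constants below are made up from the ranges visible in the four sample rows:

B_VALUES = ["BValue1", "BValue2", "BValue3", "BValue4"]
D_VALUES = ["Yes", "No"]
CLASSES  = ["Class1", "Class2", "Class3"]

def one_hot(value, values):
    return [1.0 if value == v else 0.0 for v in values]

def encode(a, b, c, d, cls):
    """Input vector: FeatureA scaled to roughly [0,1], one-hot FeatureB,
    FeatureC scaled to roughly [0,1], one-hot FeatureD.
    Output vector: one unit per class (1-of-3 coding)."""
    x = [a / 3.0] + one_hot(b, B_VALUES) + [c / 400.0] + one_hot(d, D_VALUES)
    y = one_hot(cls, CLASSES)
    return x, y

data = [
    (2.5, "BValue2", 100.0, "No",  "Class2"),
    (1.1, "BValue4", 300.0, "Yes", "Class1"),
    (2.3, "BValue3", 150.0, "No",  "Class3"),
    (1.4, "BValue1", 350.0, "No",  "Class2"),
]
for row in data:
    print(encode(*row))

An advantage of this encoding is that no artificial ordering is imposed on BValue1..BValue4 or on the classes; disadvantages are that the input layer grows with the number of discrete values and that the rescaling constants must be estimated from the training data.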
15. How is a problem phrased as a linear program in a support vector machine?
Give an example. What are slack variables used for and how are they
represented in the linear program?
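
For question 15, a standard way to write the soft-margin optimization, with one slack variable $\xi_i$ measuring how far example $i$ falls short of the margin, is:

\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\|w\|^2 \;+\; C\sum_{i=1}^{m}\xi_i
\quad\text{subject to}\quad
y_i\,(w\cdot x_i + b)\;\ge\; 1-\xi_i,\qquad \xi_i\ge 0,\qquad i=1,\dots,m.

As written this is a quadratic program; a linear-programming variant replaces $\frac{1}{2}\|w\|^2$ with the 1-norm of $w$. Either way, the slack variables appear both in the objective (penalized with weight $C$) and in the margin constraints.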
16. Explain the concept of a Kernel function in Support Vector Machines.
Why are kernels so useful? What properties should a kernel have to be
used in an SVM?
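
As a study aid for question 16, here is a minimal sketch of two commonly used kernels; to be usable in an SVM, a kernel should produce a symmetric, positive semi-definite Gram matrix (Mercer's condition). The data points below are made up:

import math

def poly_kernel(x, z, degree=2):
    """Polynomial kernel: (x . z + 1)^d, an implicit dot product in the
    space of all monomials of the inputs up to degree d."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

# The Gram matrix of a valid kernel is symmetric and positive semi-definite;
# for the RBF kernel it also has ones on the diagonal.
points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.5)]
gram = [[rbf_kernel(p, q) for q in points] for p in points]
for row in gram:
    print(["%.3f" % v for v in row])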