Some sample exam 2 questions:

1. Briefly define the following terms:

   Unsupervised learning

   Clustering Algorithm

   Dendrogram

   Control Learning

   Delayed Reward

   Discounted Future Reward

   Markov Decision Process

   Bayes Theorem

   Maximum a posteriori hypothesis

   Maximum likelihood hypothesis

   Bayes optimal classifier

   Gibbs classifier

   Bayes network

   Bagging

   Boosting

   Stacking

   Market Basket

   Itemset

   The Apriori Property

   Eager learning

   Lazy Learning

   Curse of dimensionality

   kd Tree

   Single Point Crossover

   Two Point Crossover

   Uniform Crossover

   Point Mutation

   Inverted Deduction

   PAC Learning

   Agnostic Learning

   ε-exhausting a Version Space

   Shattering a Set of Instances

   Vapnik-Chervonenkis Dimension

2. What are the two main approaches for generating clusters?  Explain in
   general terms how these approaches work.

3. List two methods that could be used to estimate the validity of the
   results of a clustering algorithm.  Explain how these methods work.

4. Explain how the following clustering methods work:

   Agglomerative Single Link

   Agglomerative Complete Link

   K-Means

5. A distance measure is important both in memory-based reasoning methods
   such as the k-nearest neighbor method and in clustering.  Why is it so
   critical in these methods?  In which of the two is it possible to "learn"
   to do a better job of measuring the distance between points?  Why?

6. Give pseudo-code for the learning cycle of a Q learner.  What is the
   update rule for a deterministic world?  How about a non-deterministic
   world?

7. How are the V(s) and Q(s,a) functions related in Q learning?  What are
   the advantages of using the Q function over the V function?

8. What is Bayes theorem?  Discuss two examples showing how Bayes theorem
   can be used to justify approaches to learning.  Also, discuss an example
   of a learning method based on Bayes theorem.

9. What is a Bayesian Belief network?  Give an example of such a network.
   What are the advantages of a Bayesian network over a naive Bayes learner?

10. Explain the fundamental difference between the Bagging and AdaBoost
    ensemble learning methods.  How do these methods relate to the concept of
    generating a good ensemble?  What are the advantages and disadvantages of
    each method?

11. How does the Apriori algorithm learn an association rule (give the
    algorithm)?  Give two examples of ways to speed up this algorithm.
    Show an example of how the algorithm works.
    
12. How does a k-Nearest Neighbor learner make predictions about new data points?
    How does a distance-weighted k-Nearest Neighbor learner differ from a
    standard k-Nearest Neighbor learner?  What is locally weighted regression?

13. How does a Radial Basis Function network work?  How does a kernel function
    work?

14. How are concepts represented in a genetic algorithm?  Give an example of
    a concept represented in a GA.

15. What operators are used in a genetic algorithm to produce new concepts?
    Give an example of a mechanism that can be used to judge a GA concept.

16. Give pseudo-code for a general genetic algorithm.  Make sure to outline
    the way concepts are represented, the operators used to create new
    concepts, how concepts are chosen to reproduce, and how concepts are
    evaluated.

17. Give two different mechanisms for selecting which members of a GA
    population should reproduce.  What are the advantages and disadvantages of
    your mechanisms?

18. How does genetic programming work?  How is a genetic program defined?
    What genetic operators can be applied to a genetic program?

19. How does the sequential covering algorithm work to generate a set of
    rules for a concept?

20. How does FOIL work to generate first-order logic rules for a concept?

21. What does it mean to view induction as inverted deduction?  Give a
    deduction rule and explain how that rule can be inverted to induce new
    rules.
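
Study aid for question 6: the sketch below shows one possible shape for the
learning cycle of a Q learner, with the two update rules written side by
side.  It is a minimal illustration, not a model answer.  The toy world (four
states in a row with a goal at the right end), the reward of 100, the discount
factor 0.9, purely random exploration, and all function names are assumptions
made for the example.  The decaying learning rate alpha = 1 / (1 + visits) is
one common choice for the non-deterministic case.

```python
import random
from collections import defaultdict

def q_update_deterministic(Q, s, a, r, s_next, actions, gamma=0.9):
    # Deterministic world: overwrite the estimate with the observed reward
    # plus the discounted best estimate available from the next state.
    Q[(s, a)] = r + gamma * max(Q[(s_next, b)] for b in actions)

def q_update_nondeterministic(Q, visits, s, a, r, s_next, actions, gamma=0.9):
    # Non-deterministic world: blend the old estimate with the new sample,
    # using a learning rate that decays with the visit count (an assumption).
    visits[(s, a)] += 1
    alpha = 1.0 / (1 + visits[(s, a)])
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * target

# Hypothetical toy world: states 0..3 in a row; state 3 is the absorbing goal.
ACTIONS = ("left", "right")
GOAL = 3

def step(s, a):
    # Move one cell left or right (walls clamp); reward 100 on entering the goal.
    s_next = min(s + 1, GOAL) if a == "right" else max(s - 1, 0)
    r = 100 if (s_next == GOAL and s != GOAL) else 0
    return r, s_next

def train(episodes=500, gamma=0.9, seed=0):
    # Learning cycle: observe state, choose an action, act, observe the
    # reward and the new state, update the table entry, and repeat.
    rng = random.Random(seed)
    Q = defaultdict(float)          # all Q estimates start at zero
    for _ in range(episodes):
        s = rng.randrange(GOAL)     # start each episode in a non-goal state
        while s != GOAL:
            a = rng.choice(ACTIONS)  # pure random exploration for simplicity
            r, s_next = step(s, a)
            q_update_deterministic(Q, s, a, r, s_next, ACTIONS, gamma)
            s = s_next
    return Q
```

After training, the greedy policy picks the action with the largest Q value in
each state.  In this toy world that should be "right" everywhere, with the
values shrinking by a factor of gamma for each extra step away from the goal.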