The final exam will be cumulative.

For sample questions on the material covered for Midterm 1, see this page. For sample questions on the material covered for Midterm 2, see this page.

Sample questions concerning material covered since Midterm 2:

- BRIEFLY define the following terms and give an example of how each term is used:
  - Discounted future reward
  - Online dynamic programming
  - Exploration/exploitation tradeoff
  - Value function V for reinforcement learning
  - Q function for reinforcement learning
  - Policy in reinforcement learning
  - Independence in probability
  - Conditional independence
  - Conditional probability
  - The product rule for conditional probability
  - Joint probability distribution
  - Inference by enumeration
  - Bayes rule
  - Bayes network
  - Conditional probability table
  - Inference by stochastic simulation
  - Moral graph
  - Irrelevant variables
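
As a study aid for the reinforcement learning terms in the list above, here is a minimal sketch of deterministic Q learning on a hypothetical four-state corridor. The environment, the reward of 100 for entering the goal, the discount factor of 0.9, and all function names are assumptions chosen for illustration; they are not taken from the course notes. The sketch uses the standard deterministic update Q(s, a) ← r + γ max Q(s', a'), derives V(s) as the maximum of Q(s, a) over actions, and reads off a greedy policy.

```python
import random

GAMMA = 0.9                  # discount factor (assumed for illustration)
STATES = [0, 1, 2, 3]        # hypothetical corridor; state 3 is the absorbing goal
GOAL = 3
ACTIONS = ["left", "right"]

def step(state, action):
    """Deterministic transition: returns (next_state, immediate_reward)."""
    if action == "left":
        nxt = max(state - 1, 0)
    else:
        nxt = min(state + 1, GOAL)
    reward = 100 if nxt == GOAL else 0
    return nxt, reward

def q_learning(episodes=500, seed=0):
    """Deterministic Q learning with purely random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES[:-1])          # start in a non-goal state
        while s != GOAL:
            a = rng.choice(ACTIONS)          # explore: pick any action
            s2, r = step(s, a)
            # Deterministic Q update: immediate reward plus discounted best future value.
            q[(s, a)] = r + GAMMA * max(q[(s2, a2)] for a2 in ACTIONS)
            s = s2
    return q

q = q_learning()
# V(s) is the best Q value available in s; the greedy policy picks that action.
v = {s: max(q[(s, a)] for a in ACTIONS) for s in STATES}
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES[:-1]}
```

In this toy problem the values for moving right should settle at 100, 90, and 81 for states 2, 1, and 0, which shows the discounted future reward shrinking by a factor of γ per step away from the goal. Choosing actions at random is pure exploration; an agent that always followed `policy` would be purely exploiting.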

- How do the Q and V values for a state and its actions relate? Show the Q and V values for a sample learning problem.
- Give the Q learning algorithm for the deterministic case. Make sure to include a precise definition of the Q update rule.
- Given the following joint probability distribution, calculate:
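
  |         | toothache, catch | toothache, ¬catch | ¬toothache, catch | ¬toothache, ¬catch |
  |---------|------------------|-------------------|-------------------|--------------------|
  | cavity  | .108             | .012              | .072              | .008               |
  | ¬cavity | .016             | .064              | .144              | .576               |
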
  - P(cavity OR toothache)
  - P(cavity AND toothache)
  - P(cavity | toothache)
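
One way to check answers to the joint distribution question above is to sum the relevant table entries programmatically. The sketch below assumes the eight numbers are read in the usual textbook layout (toothache and ¬toothache as the outer columns, catch and ¬catch inside each, cavity and ¬cavity as rows); the dictionary layout and the helper name `prob` are illustrative choices.

```python
# Keys are (cavity, toothache, catch); values are the joint probabilities.
joint = {
    (True,  True,  True):  0.108,
    (True,  True,  False): 0.012,
    (True,  False, True):  0.072,
    (True,  False, False): 0.008,
    (False, True,  True):  0.016,
    (False, True,  False): 0.064,
    (False, False, True):  0.144,
    (False, False, False): 0.576,
}

def prob(event):
    """Sum the joint entries for which event(cavity, toothache, catch) holds."""
    return sum(p for world, p in joint.items() if event(*world))

p_or = prob(lambda cavity, toothache, catch: cavity or toothache)
p_and = prob(lambda cavity, toothache, catch: cavity and toothache)
p_toothache = prob(lambda cavity, toothache, catch: toothache)

# Conditional probability: P(cavity | toothache) = P(cavity AND toothache) / P(toothache)
p_cavity_given_toothache = p_and / p_toothache

print(round(p_or, 3), round(p_and, 3), round(p_cavity_given_toothache, 3))
# prints: 0.28 0.12 0.6
```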

- Given the burglary, earthquake, alarm, johncalls, marycalls network from the textbook, notes, and class, calculate:
  - P(alarm|burglary,¬earthquake)
  - P(johncalls,marycalls,burglary,earthquake,alarm)
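
The two quantities above can be sketched in code as follows. The conditional probability table values below are the ones commonly printed with this network in the Russell and Norvig textbook; they are an assumption here and should be checked against the notes from class. The first quantity is a direct table lookup, and the full joint entry is the product of one CPT entry per variable, each conditioned on its parents.

```python
import itertools

# Assumed CPT values (commonly cited textbook numbers; verify against class notes).
P_B = 0.001                                    # P(burglary)
P_E = 0.002                                    # P(earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(alarm | burglary, earthquake)
P_J = {True: 0.90, False: 0.05}                # P(johncalls | alarm)
P_M = {True: 0.70, False: 0.01}                # P(marycalls | alarm)

def bern(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(variable = true)."""
    return p_true if value else 1.0 - p_true

def joint(j, m, b, e, a):
    """Full joint entry: product of each variable's CPT entry given its parents."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

# P(alarm | burglary, not earthquake) is read straight from the alarm CPT.
p_alarm = P_A[(True, False)]

# P(johncalls, marycalls, burglary, earthquake, alarm), all five variables true.
p_all_true = joint(True, True, True, True, True)

# Sanity check: the 32 joint entries sum to 1.
total = sum(joint(*w) for w in itertools.product([True, False], repeat=5))

print(p_alarm)                 # prints: 0.94
print(f"{p_all_true:.3e}")     # prints: 1.197e-06
```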

- Describe how inference by enumeration is done using a Bayes network. How does variable elimination work? What should be done with irrelevant variables?
- Give two ways of recognizing irrelevant variables. How does this help in inference by enumeration?
- How does inference by stochastic simulation work? Give the algorithm for:
  - Sampling from an empty graph/network
  - Rejection sampling
  - Likelihood weighting
  - Markov chain Monte Carlo sampling
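
The first three sampling methods listed above can be sketched as follows; Markov chain Monte Carlo is left out to keep the example short. The four-variable sprinkler network, its CPT values, and every function name are assumptions made for illustration, not material taken from the course notes. Variables are stored in topological order, which the sketch relies on.

```python
import random

# variable -> (parents, CPT mapping parent values to P(variable = True)).
# Insertion order is topological: parents appear before their children.
NETWORK = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.1, (False,): 0.5}),
    "Rain":      (("Cloudy",), {(True,): 0.8, (False,): 0.2}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.0}),
}

def p_true(var, event):
    """P(var = True) given the parent values already assigned in `event`."""
    parents, cpt = NETWORK[var]
    return cpt[tuple(event[p] for p in parents)]

def prior_sample(rng):
    """Sample every variable in topological order with no evidence."""
    event = {}
    for var in NETWORK:
        event[var] = rng.random() < p_true(var, event)
    return event

def rejection_sampling(query, evidence, n, rng):
    """Estimate P(query = True | evidence), discarding samples that contradict the evidence."""
    counts = {True: 0, False: 0}
    for _ in range(n):
        event = prior_sample(rng)
        if all(event[v] == val for v, val in evidence.items()):
            counts[event[query]] += 1
    total = counts[True] + counts[False]
    return counts[True] / total if total else float("nan")

def likelihood_weighting(query, evidence, n, rng):
    """Estimate P(query = True | evidence) by fixing evidence and weighting each sample."""
    weights = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, w = {}, 1.0
        for var in NETWORK:
            p = p_true(var, event)
            if var in evidence:
                event[var] = evidence[var]
                w *= p if evidence[var] else 1.0 - p   # likelihood of the evidence value
            else:
                event[var] = rng.random() < p
        weights[event[query]] += w
    total = weights[True] + weights[False]
    return weights[True] / total if total else float("nan")

rng = random.Random(0)
evidence = {"Sprinkler": True}
est_rej = rejection_sampling("Rain", evidence, 20000, rng)
est_lw = likelihood_weighting("Rain", evidence, 20000, rng)
```

Under these assumed CPTs the exact answer is P(Rain | Sprinkler) = 0.09 / 0.3 = 0.3, so both estimates should land close to 0.3. Rejection sampling throws away roughly 70% of its samples here, while likelihood weighting uses every sample, which is the usual argument for preferring it when the evidence is unlikely.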