This page will be updated throughout the semester.

**Instructor: Marshall Hampton**

**Office: 172 SCC**

**Email: mhampton at d.umn.edu
(preferred contact method)**

**Telephone: 726-6329**

**Office hours: 1-2:15, M W Th F, or by appointment.**

**Class homepage:** http://www.d.umn.edu/~mhampton/m5233s14.html

**Lecture/Lab Times and Locations:** 3 - 3:50 pm, M, W, F (1/21 - 5/9). On Mondays we will be working in a computer lab (MonH 209); on Wednesdays and Fridays we will be in EduE 40.

**Prerequisites:** Any two of the following: Biol 5233, Math
3355, CS 1511, Stat 3611, or equivalents. Please come see me if you
have any questions about the preparation required for this course. Since Biol 5233 has not been offered recently, other biology coursework in genetics and biochemistry is acceptable.

**Textbook:** *Bioinformatics and Molecular Evolution*, by Paul Higgs and Teresa Attwood. Blackwell Publishing, ISBN-13: 978-1405106832. There will also be supplementary readings distributed by email.

**Topics:** The official course description is "Mathematical,
algorithmic, and computational foundations of common tools used in
genomics and proteomics. Topics include: sequence alignment algorithms
and implementations (Needleman-Wunsch, Smith-Waterman, BLAST, Clustal),
scoring matrices (PAM, BLOSUM), statistics of DNA sequences (SNPs, CpG
islands, isochores, satellites), and phylogenetic tree methods (UPGMA,
parsimony, maximum likelihood). Other topics will be covered as time
permits: RNA and protein structure prediction, microarray analysis,
post-translational modification prediction, gene regulatory dynamics,
and whole-genome sequencing techniques." One thing not mentioned there
that I would like to at least briefly cover is hidden Markov models
(HMMs). We will be using the programming language Python as our primary computational tool, with the Biopython module. All of the software we will be using is free.
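To give a sense of the kind of Python we will write, here is a minimal sketch of the scoring step of Needleman-Wunsch global alignment, one of the algorithms named in the course description. The function name and the scoring values (match, mismatch, and a linear gap penalty) are illustrative assumptions, not the settings used in the labs, and the sketch uses plain Python rather than Biopython.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table: each cell is the best of three possible last moves.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + pair  # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap             # a[i-1] against a gap
            left = score[i][j - 1] + gap           # b[j-1] against a gap
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATACA"))
```

This version returns only the score; recovering the alignment itself requires a traceback through the table, which is omitted here.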

**Exams:** There will be a midterm Friday, March 14th, and a final exam on Tuesday, May 13th, in EduE 40, from 2 to 4 pm.

A practice midterm and final will be posted here a week before each exam.

Practice Final Exam

**Projects:** There will be several projects and presentations for small groups.

**Grading:** You will be evaluated in a
variety of ways: homework/labs, class participation (including worksheets) and presentations, and
exams. The midterm exam will count for approximately 15% of the grade, and the final exam about 25%.

**Assignments and labs:** Most of the assignments will be either readings or group lab assignments. Labs will mostly be done using the computational platform Sage.

- Read the first two chapters of the textbook, which cover much of the biological knowledge that is essential for the course.
- Read chapters 3 and 4 of the textbook.
- Lab 1: Introduction to Sage and Biopython. Sage servers (pick one at random):

sage.d.umn.edu

erdos.d.umn.edu

Server 3.

- Lab 2: The Jukes-Cantor model. Sage servers (pick one at random):

sage.d.umn.edu

erdos.d.umn.edu

Server 3.

- Read chapter 6 of the textbook, and the emailed short paper on dynamic programming.
- Lab 3: The scoring matrices. Sage servers:

sage.d.umn.edu

erdos.d.umn.edu

Server 3.

- Read chapter 7 of the textbook.
- Lab 4: Dynamic programming. Sage servers:

sage.d.umn.edu

erdos.d.umn.edu

Server 3.

- Lab 5: Entropy. Sage servers:

sage.d.umn.edu

erdos.d.umn.edu

Server 3.

- Read chapter 8 of the textbook.
- Read chapters 9 and 10 in the textbook, and the phylogeny article of T. Ryan Gregory.
- Lab 6: Phylogenetics. Sage servers:

sage.d.umn.edu

erdos.d.umn.edu

Server 3.

- Lab 7: Regular expressions. Sage servers:

sage.d.umn.edu

erdos.d.umn.edu

Server 3.
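As a small illustration of the model named in Lab 2, the sketch below computes the Jukes-Cantor distance between two aligned DNA sequences using the standard correction d = -(3/4) ln(1 - 4p/3), where p is the fraction of sites that differ. The function name and error handling are assumptions made for this example; the actual lab worksheet may be organized differently.

```python
import math


def jukes_cantor_distance(seq1, seq2):
    """Estimate substitutions per site between two aligned sequences.

    Assumes the sequences are already aligned, have equal length, and
    contain no gap characters (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    if not seq1:
        raise ValueError("sequences must not be empty")

    # p is the observed proportion of sites at which the sequences differ.
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    p = differences / len(seq1)

    if p == 0:
        return 0.0
    # The correction is undefined once p reaches 3/4 (saturation).
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    print(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT"))
```

The corrected distance is always at least as large as the raw proportion p, because it accounts for multiple substitutions at the same site.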

**Resources:**

Khan Academy. These videos are usually 15 minutes or less and focus on one topic at a time. For this course, the biology video "DNA" is most relevant, but many of them might be useful for providing some context, particularly those focused on molecular biology such as "Chromosomes, Chromatids, Chromatin, etc." and the series starting with "Introduction to Cellular Respiration".

NCBI education site. This has many links to tutorials and primers on a variety of subjects.

NCBI Entrez. Often the right place to start.

Sage. This is a Python-based open-source math project that includes Biopython as an optional package.

Interactive Python tutorial. This is a very nicely done introduction to some of the basic features of the Python language.

Rosalind, a good site for practicing using Biopython for bioinformatics.

Python Tutorial. A short tutorial by the creator of Python, Guido van Rossum. In addition, the Python Library Reference has more complete documentation.

Think Python by Allen Downey. He has some other good free books that use Python that may be useful or interesting to you: Think Stats and Think Complexity.

Biopython tutorial. Covers many things, such as parsing structure files from the Protein Data Bank, that we will not cover in this course (as well as a lot that we will cover).

**Student Conduct Code:** see the full description at http://www.d.umn.edu/assl/conduct/code/.

**Policy statement:** The University of Minnesota is committed to
the policy that all persons shall have equal access to its programs,
facilities, and employment without regard to race, religion, color,
sex, national origin, handicap, age, veteran status, or sexual
orientation.

**Disabilities:** An individual who has a disability, either
permanent or temporary, which might affect his/her ability to perform
in this class should contact the instructor as soon as possible so that
he can adapt methods, materials and/or tests as needed to provide for
equitable participation.