Syllabus for Math 5233, Mathematical Foundations of Bioinformatics

This page will be updated throughout the semester.

Instructor: Marshall Hampton

Office: 172 SCC

Email: mhampton at d.umn.edu (preferred contact method)

Telephone: 726-6329 (NB: I don't usually answer the phone.)

Office hours: 12:45-2, M, Tu, W, F, or by appointment.

Class homepage: http://www.d.umn.edu/~mhampton/m5233s16.html

Lecture/Lab Times and Locations: 3 - 3:50 pm, M, W, F (1/21 - 5/9). On Mondays we will be working in a computer lab (MonH 209); on Wednesdays and Fridays we will be in EduE 17.

Prerequisites: Any two of the following: Biol 5233, Math 3355, CS 1511, Stat 3611, or equivalents. Please come see me if you have any questions about the preparation required for this course. Since Biol 5233 has not been offered recently, other biology coursework in genetics and biochemistry is also acceptable.

Textbook: Notes and readings will be distributed by email; there is no required printed text.

Topics: The official course description is "Mathematical, algorithmic, and computational foundations of common tools used in genomics and proteomics. Topics include: sequence alignment algorithms and implementations (Needleman-Wunsch, Smith-Waterman, BLAST, Clustal), scoring matrices (PAM, BLOSUM), statistics of DNA sequences (SNPs, CpG islands, isochores, satellites), and phylogenetic tree methods (UPGMA, parsimony, maximum likelihood). Other topics will be covered as time permits: RNA and protein structure prediction, microarray analysis, post-translational modification prediction, gene regulatory dynamics, and whole-genome sequencing techniques." One thing not mentioned there that I would like to at least briefly cover is hidden Markov models (HMMs). We will be using the programming language Python as our primary computational tool, with the biopython module. All of the software we will be using is free.

Exams: There will be a midterm on March 18th and a final exam (Thursday May 5th, 2-4 pm).
A practice midterm and final will be posted here a week before each exam.


Practice Midterm.
Practice Final.

Projects: There will be several projects and presentations for small groups.

Grading: You will be evaluated in a variety of ways: homework/labs, class participation (including worksheets), presentations, and exams. The midterm exam will count for approximately 20% of the grade, and the final exam for about 30%.

Worksheets
Jukes-Cantor
Entropy and Scoring Matrices
BWT and de Bruijn graphs
Statistics of word occurrence
Neighbor Joining and UPGMA
Trees
Bayes' theorem
Synonymous and nonsynonymous changes
Sankoff Algorithm
Markov Chains
HMM profiles
Neural Networks
Population genetics
Nussinov

Assignments and labs: Most of the assignments will be either readings or group lab assignments. Labs will mostly be done using the computational platform Sage.

Resources:

NCBI education site. This has many links to tutorials and primers on a variety of subjects.

NCBI Entrez. Often the right place to start.

UCSC Genome Browser

Sage. This is a Python-based open-source math project that includes Biopython as an optional package.

Interactive Python tutorial. This is a very nicely done introduction to some of the basic features of the Python language.

Rosalind. A good site for practicing bioinformatics with Biopython.

Python Tutorial. A short tutorial by the creator of Python, Guido van Rossum. In addition, the Python Library Reference has more complete documentation.

Think Python by Allen Downey. He has some other good free books that use Python that may be useful or interesting to you: Think Stats and Think Complexity.

Biopython tutorial. Covers many things, such as parsing structure files from the Protein Data Bank, that we will not cover in this course (as well as a lot that we will cover).


Student Conduct Code: see the full description at http://regents.umn.edu/sites/regents.umn.edu/files/policies/Student_Conduct_Code.pdf.

Policy statement: The University of Minnesota is committed to the policy that all persons shall have equal access to its programs, facilities, and employment without regard to race, religion, color, sex, national origin, handicap, age, veteran status, or sexual orientation.

Disabilities: An individual who has a disability, either permanent or temporary, which might affect his/her ability to perform in this class should contact the instructor as soon as possible so that he can adapt methods, materials and/or tests as needed to provide for equitable participation.