Linguistics 4104 Corpus Linguistics Course Home (Spring 2018)


Instructor: Chongwon Park, Ph.D. (TA: John Yang)

Google group address: ling4104_001s18d@d.umn.edu

Office Hours: 10:00 - 10:55AM, MW (H420S)

Class Time: 11:00 - 11:50, MWF (Bohannon Hall 108)

Required Texts:

Practical Corpus Linguistics (Martin Weisser, Wiley-Blackwell)
Computational Methods for Corpus Annotation and Analysis (Xiaofei Lu, Springer)

Course Description:

The aim of this course is to learn how to analyze linguistic phenomena based on data extracted from large databases. [1] Students will learn the distinction between corpus methods and the traditional, intuition-based approaches. After reviewing key linguistic concepts learned in the prerequisite linguistics course (LING 1811), [2] students will learn how to test existing hypotheses based on the data extracted from large corpora. [3] Students will also acquire basic computer programming skills in Awk (or Python or R) to clean up and manipulate the data structure for the purpose of linguistic exploration.

Requirements:

You will have a total of 10 assignments and 3 exams. All assignments and exams consist of problem-solving questions.

Attendance and Evaluation:

It is important for you to be present for every class. Every homework assignment should be turned in in class on or before the due date. Evaluation will be based on the following weights. IMPORTANT: I DO NOT accept late assignments (no exceptions). E-mail submissions WILL NOT be accepted.

Item | Number | Points | Total
Homework | 10 | 5 (per homework) | 50
Exam 1 | 1 | 10 | 10
Exam 2 | 1 | 15 | 15
Exam 3 | 1 | 25 | 25
Total | | | 100

While students are expected to attend every single class period, there are circumstances that lead to excused absence from the classroom. Excused absences are defined at http://www.duluth.umn.edu/vcaa/ExcusedAbsence.html. To be eligible for an excused absence, students must provide written documentation, such as a doctor's note or an advisor's letter. To encourage your attendance, 1 point will be deducted for each class you miss, but if your attendance is perfect (any absences being excused), you will receive 3 bonus points.
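As an illustration only, the point arithmetic above can be sketched in Python. The function name and the sample scores are hypothetical, and the sketch assumes that the attendance deduction and the bonus are both applied to the 100-point course total; the syllabus does not spell this out.

```python
def course_total(homework_points, exam_points, unexcused_absences):
    """Return a course total under the assumed reading of the attendance rule.

    Assumption: deductions and the bonus apply to the 100-point total.
    """
    earned = sum(homework_points) + sum(exam_points)
    if unexcused_absences == 0:
        # Perfect attendance (all absences excused): 3 bonus points.
        return earned + 3
    # Otherwise, 1 point is deducted per missed class.
    return earned - unexcused_absences


# Hypothetical student: full marks on all 10 homework assignments,
# and 8/10, 12/15, and 20/25 on the three exams.
homework = [5] * 10
exams = [8, 12, 20]
print(course_total(homework, exams, unexcused_absences=0))  # 93
print(course_total(homework, exams, unexcused_absences=2))  # 88
```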

Final Grades:

Course Schedule for Spring 2018:

Date | Topic | Assignments and Due Dates | Required Reading
Jan. 10 (W) | Introduction | |
Jan. 12 (F) | What is a corpus? | | Weisser, Ch. 2
Jan. 15 (M) | No class! | |
Jan. 17 (W) | Understanding corpus design | | Weisser, Ch. 3
Jan. 19 (F) | Understanding corpus design | | Weisser, Ch. 3
Jan. 22 (M) | Preparing your data | | Weisser, Ch. 4
Jan. 24 (W) | Preparing your data | | Weisser, Ch. 4
Jan. 26 (F) | Concordancing | Assignment 1, Due Feb. 2 (F) | Weisser, Ch. 5
Jan. 29 (M) | Regular expressions | | Weisser, Ch. 6
Jan. 31 (W) | Regular expressions | | Weisser, Ch. 6
Feb. 2 (F) | Regular expressions | Assignment 2, Due Feb. 9 (F) | Weisser, Ch. 6
Feb. 5 (M) | Part-of-speech tagging | | Weisser, Ch. 7
Feb. 7 (W) | Part-of-speech tagging | | Weisser, Ch. 7
Feb. 9 (F) | Part-of-speech tagging (NLTK) | Assignment 3, Due Feb. 16 (F) | Weisser, Ch. 7
Feb. 12 (M) | COCA and BNC | Student presentation | Weisser, Ch. 8
Feb. 14 (W) | COCA and BNC | Student presentation | Weisser, Ch. 8
Feb. 16 (F) | COCA and BNC | Student presentation; Assignment 4, Due Mar. 23 (M) | Weisser, Ch. 8
Feb. 19 (M) | Frequency analysis | | Weisser, Ch. 9
Feb. 21 (W) | Frequency analysis | | Weisser, Ch. 9
Feb. 23 (F) | Frequency analysis | Assignment 5, Due Mar. 2 (F) | Weisser, Ch. 9
Feb. 26 (M) | Words in contexts | | Weisser, Ch. 10
Feb. 28 (W) | Words in contexts | | Weisser, Ch. 10
Mar. 2 (F) | Words in contexts | | Weisser, Ch. 10
Mar. 5 (M) | Spring Break | |
Mar. 7 (W) | Spring Break | |
Mar. 9 (F) | Spring Break | |
Mar. 12 (M) | Exam 1 | | Weisser, Ch. 1-10
Mar. 14 (W) | Markup and annotation | | Weisser, Ch. 11
Mar. 16 (F) | Markup and annotation | Assignment 6, Due Mar. 23 (F) | Weisser, Ch. 11
Mar. 19 (M) | Text processing (command line) | | Lu, Ch. 2
Mar. 21 (W) | Text processing (command line) | | Lu, Ch. 2
Mar. 23 (F) | Text processing (command line) | Assignment 7, Due Mar. 30 (F) | Lu, Ch. 2
Mar. 26 (M) | Lexical annotation | | Lu, Ch. 3
Mar. 28 (W) | Lexical annotation | | Lu, Ch. 3
Mar. 30 (F) | Lexical annotation | Assignment 8, Due Apr. 6 (F) | Lu, Ch. 3
Apr. 2 (M) | Lexical analysis | | Lu, Ch. 4
Apr. 4 (W) | Lexical analysis | | Lu, Ch. 4
Apr. 6 (F) | Lexical analysis | Assignment 9, Due Apr. 13 (F) | Lu, Ch. 4
Apr. 9 (M) | Syntactic annotation | | Lu, Ch. 5
Apr. 11 (W) | Exam 2 | | Lu, Ch. 1-4
Apr. 13 (F) | Syntactic annotation | | Lu, Ch. 5
Apr. 16 (M) | Syntactic annotation | | Lu, Ch. 5
Apr. 18 (W) | Syntactic annotation | Assignment 10, Due Apr. 23 (M) | Lu, Ch. 5
Apr. 20 (F) | Semantic analysis | | Lu, Ch. 6
Apr. 23 (M) | Semantic analysis | | Lu, Ch. 6
Apr. 25 (W) | Semantic analysis | | Lu, Ch. 6
Apr. 27 (F) | Review | |

Final Exam: May 2 (W), 10:00 - 11:55am (comprehensive)

 

Academic Dishonesty:

Academic dishonesty tarnishes UMD's reputation and discredits the accomplishments of students. UMD is committed to providing students every possible opportunity to grow in mind and spirit. This pledge can only be redeemed in an environment of trust, honesty, and fairness. As a result, academic dishonesty is regarded as a serious offense by all members of the academic community. In keeping with this ideal, this course will adhere to UMD's Student Academic Integrity Policy, which can be found at http://www.d.umn.edu/conduct/integrity. This policy sanctions students engaging in academic dishonesty with penalties up to and including expulsion from the university for repeat offenders.

Appropriate Classroom Conduct:

The instructor will enforce, and students are expected to follow, the University's Student Conduct Code (http://www.d.umn.edu/conduct/code). Appropriate classroom conduct promotes an environment of academic achievement and integrity. Disruptive classroom behavior that substantially or repeatedly interrupts either the instructor's ability to teach or student learning is prohibited. Disruptive behavior includes inappropriate use of technology in the classroom. Examples include ringing cell phones, text-messaging, watching videos, playing computer games, checking email, surfing the Internet, or using Facebook on your computer instead of note-taking or other instructor-sanctioned activities.