| Start | End | Slides | Required Readings | Recommended Readings |
|-------|-----|--------|-------------------|-----------------------|
| Jan 4 | Jan 18 | Introduction | J&M Ch 1 | Advances in NLP |
| Jan 18 | Jan 18 | Regular Languages and Finite State Automata | SLP3 Ch 2 | |
| Jan 29 | Jan 29 | Morphology with Finite State Transducers | J&M Ch 3 | |
| Feb 1 | Feb 12 | Text Categorization using Naive Bayes | Notes (Sections 1-4); SLP3 (up to Section 7.3) | Gender in Job Postings; Improvements to Multinomial Naive Bayes; Performance Measures; Error Correcting Output Codes |
| Feb 12 | Feb 15 | Sentiment Mining and Lexicon Generation | Survey (Sections 1-4.5); Tutorial (Sections 1-5) | Semantic Orientation of Adjectives; Unsupervised Classification of Reviews |
| Feb 15 | Feb 19 | Log Linear Models for Classification | Notes (Section 2); SLP3 (7.4-7.6) | Max Entropy Models for WSD |
| Feb 19 | Feb 19 | Generative vs. Max Entropy Models | Max Entropy Tutorial | Intro to Max Entropy Models |
| Feb 19 | Feb 22 | Information Retrieval and Topic Models | SLP3 Ch 15; LSA and PLSA | Detailed Tutorial on LDA |
| Feb 27 | Mar 27 | Assignment 1 | Resources | |
| Mar 2 | Mar 19 | Project (Part 1) | | |
| Mar 5 | Mar 8 | Representation Discovery for Words and Documents | Goldberg 8.1-8.4, 10, 11; Doc2VecC (Sections 1-3) | Embeddings vs. Factorization; Trends and Future Directions in Word Embeddings |
| Mar 8 | Mar 12 | N-gram Features with CNNs | Goldberg 13 | Practitioner's Guide to CNNs |
| Mar 12 | Mar 15 | RNNs for Variable-Length Sequences | Goldberg 14.1-14.3.1, 14.4-14.5; Goldberg 15, 16.1.1, 16.2.2; Understanding LSTMs | Recurrent Additive Networks; RNNs and Vanishing Gradients (Section 4.3) |
| Mar 15 | Mar 15 | Tricks for Training RNNs | Deep Learning for NLP Best Practices | |
| Mar 15 | Mar 15 | Domain Adaptation | Paper | |
| Mar 15 | Mar 19 | Language Models | SLP3 Ch 4; Goldberg 9, 10.5.5; Character-Aware Neural LMs | Exploring the Limits of Language Modeling |
| Mar 22 | Apr 2 | POS Tagging with Hidden Markov Models | SLP3 (Ch 9, 10.1-10.4) | |
| Apr 2 | Apr 5 | Named Entity Recognition with CRFs | Notes (Section 4); Detailed Notes | Non-Local Features and Knowledge in NER |
| Apr 5 | Apr 5 | BiLSTM+CRF and Other Neural Models for Sequence Labeling | Goldberg 19.1-19.3, 19.4.2; Bidirectional LSTM-CRF Models | |
| Apr 5 | Apr 9 | Constrained Conditional Models for Sequence Labeling | Paper on CCM Learning (Sections 2, 3.2) | |
| Apr 7 | Apr 21 | Assignment 2 | | |
| Apr 9 | Apr 16 | Statistical Natural Language Parsing | SLP3 Ch 12.1-12.2, 13.1-13.5, 13.8; Lecture Notes on PCFGs; Lecture Notes on Lexicalized PCFGs | |
| Apr 16 | Apr 16 | Neural Models over Tree Structures | Goldberg 18; Tree LSTMs | |
| Apr 19 | Apr 23 | Seq2Seq Models & Attention | Goldberg 17.1, 17.2, 17.4 | Attention Is All You Need |
| Apr 23 | Apr 26 | Dialog Systems | | |
| Apr 26 | Apr 26 | Wrap Up | | |
Textbook and Readings
- Yoav Goldberg, Neural Network Methods for Natural Language Processing, Morgan & Claypool (2017) (required).
- Dan Jurafsky and James Martin, Speech and Language Processing, 3rd Edition (under development).
Grading
Assignments: 30%; Project: 20%; Minors: 20%; Final: 30%; Class participation and online discussions: extra credit.
Course Administration and Policies
- Subscribe to the class discussion group on Piazza (access code: col772).
- All programming assignments are to be done individually. You may discuss the subject matter with other students in the class, but all solutions, code, and writeups must be your own. In your writeup, mention the names of any students with whom you discussed the assignment. You are expected to maintain the utmost level of academic integrity in the course.
- Programming assignments may be handed in up to a week late, at a penalty of 10% of the maximum grade per day.
- The project is to be done in a group of two. You must obtain special written permission if you wish to do a project in a group of any other size (even one). Except under unusual circumstances, all team members will receive the same grade.
- There is no late policy for the project: it must be submitted by the deadline.
Cheating vs. Collaborating Guidelines
As adapted from Dan Weld's guidelines.
Collaboration is a very good thing. On the other hand, cheating is considered a very serious offense. Please don't do it! Concern about cheating creates an unpleasant environment for everyone. If you cheat, you get a zero on the assignment, and you additionally risk losing your position as a student in the department and the institute. The department's policy is to report any case of cheating to the disciplinary committee. What follows afterwards is not fun.
So how do you draw the line between collaboration and cheating?
Here's a reasonable set of ground rules.
Failure to understand and follow these rules will constitute cheating,
and will be dealt with as per institute guidelines.
- The Kyunki Saas Bhi Kabhi Bahu Thi Rule: This rule says that you are free to meet with fellow student(s) and discuss assignments with them. Writing on a board or a shared piece of paper is acceptable during the meeting; however, you should not take any written (electronic or otherwise) record away from the meeting. This applies when the assignment is supposed to be an individual effort, or whenever two teams discuss common problems they are each encountering (inter-group collaboration). After the meeting, engage in a half hour of mind-numbing activity (like watching an episode of Kyunki Saas Bhi Kabhi Bahu Thi) before starting to work on the assignment. This ensures that you reconstruct what you learned from the meeting by yourself, using your own brain.
- The Right to Information Rule: To ensure that all collaboration is on the level, you must always write the name(s) of your collaborators on your assignment. This also applies when two groups collaborate.