COL772: Natural Language Processing - Spring 2019
Monday, Thursday 2-3:20 pm in Bharti 201


Instructor: Mausam
(mausam at cse dot iitd dot ac dot in)
Office hours: by appointment, SIT Building Room 402
TAs (Office hours, by appointment):
Keshav Kolluru, keshavkolluru AT gmail.com
Nikhil Gupta, nikhilgupta1997 AT gmail.com
Krunal Shah, ktgshah AT gmail.com
Makkunda Sharma, makkundasharma AT gmail.com

Course Contents

NLP concepts: Tokenization, lemmatization, part of speech tagging, noun phrase chunking, named entity recognition, coreference resolution, parsing, information extraction, sentiment analysis, question answering, text classification, document clustering, document summarization, discourse, machine translation.
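As a concrete taste of the first topic in this list, here is a minimal regex-based tokenizer (the pattern is an illustrative choice for this sketch, not the course's prescribed method):

```python
import re

def tokenize(text):
    # Split into runs of word characters, or single
    # non-space, non-word characters (punctuation marks).
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Parsing isn't easy, is it?"))
# -> ['Parsing', 'isn', "'", 't', 'easy', ',', 'is', 'it', '?']
```

Even this tiny example surfaces a real design question covered in lectures: should "isn't" be one token, two, or three?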

Machine learning concepts: Naive Bayes, MaxEnt classifiers, Hidden Markov Models, Conditional Random Fields, Probabilistic Context Free Grammars, Word2vec models, RNN-based neural models, Sequence to sequence neural models.
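To make the list above concrete, here is a sketch of the first technique, multinomial Naive Bayes for text classification with add-one smoothing (the toy sentiment data is invented for this example, not course material):

```python
import math
from collections import Counter, defaultdict

# Toy labeled corpus (hypothetical examples for illustration only).
train = [
    ("good great fun", "pos"),
    ("great acting good plot", "pos"),
    ("boring bad plot", "neg"),
    ("bad awful boring", "neg"),
]

# Count per-class word occurrences, class frequencies, and vocabulary.
word_counts = defaultdict(Counter)
class_counts = Counter()
vocab = set()
for text, label in train:
    tokens = text.split()
    word_counts[label].update(tokens)
    class_counts[label] += 1
    vocab.update(tokens)

def classify(text):
    """Return argmax_c [ log P(c) + sum_w log P(w|c) ]."""
    best_label, best_score = None, float("-inf")
    n_docs = sum(class_counts.values())
    for label in class_counts:
        score = math.log(class_counts[label] / n_docs)  # log prior
        total = sum(word_counts[label].values())
        for w in text.split():
            # Add-one (Laplace) smoothing over the shared vocabulary,
            # so unseen words do not zero out the whole product.
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(classify("good fun plot"))  # prints "pos"
```

Working in log space avoids numeric underflow from multiplying many small probabilities, a practical point that recurs with the HMM and CRF models later in the course.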

Schedule

Each entry gives the lecture dates, topic, and assigned readings.

Dec 31 – Jan 10: Introduction
    Readings: J&M Ch 1; Advances in NLP

Jan 10 – Jan 16: Regular Languages and Finite State Automata
    Readings: SLP3 Ch 2

Jan 16 – Jan 17: Morphology with Finite State Transducers
    Readings: J&M Ch 3

Jan 17 – Jan 23: Text Categorization using Naive Bayes
    Readings: Notes (Sections 1-4); SLP3 Ch 4; Gender in Job Postings; Improvements to Multinomial Naive Bayes; Performance Measures

Jan 23 – Jan 24: Sentiment Mining and Lexicon Generation
    Readings: Survey (Sections 1-4.5); Tutorial (Sections 1-5); SLP3 Ch 19; Semantic Orientation of Adjectives; Unsupervised Classification of Reviews

Jan 26 – Feb 19: Assignment 1.1 (Resources)

Feb 11 – Feb 14: Log Linear Models for Classification
    Readings: Notes (Section 2); SLP3 Ch 5; Max Entropy Models for WSD

Feb 14: Generative vs. Max Entropy Models
    Readings: Max Entropy Tutorial; Intro to Max Entropy Models

Feb 19 – Feb 22: Information Retrieval and Topic Models
    Readings: SLP3 Ch 6.1-6.6; LSA and PLSA; Detailed Tutorial on LDA

Feb 22: An Intro to Deep Learning for NLP

Feb 23 – Feb 28: Representation Discovery for Words
    Readings: Goldberg 8.1-8.4, 10, 11; Embeddings vs. Factorization; Contextual Embeddings; Trends and Future Directions on Word Embeddings

Feb 28 – Mar 14: Assignment 2

Feb 28 – Mar 11: N-gram Features with CNNs
    Readings: Goldberg 13; Practitioner's Guide to CNNs

Mar 11 – Mar 14: RNNs for Variable Length Sequences
    Readings: Goldberg 14.1-14.3.1, 14.4-14.5; Goldberg 15, 16.1.1, 16.2.2; Understanding LSTMs; Recurrent Additive Networks; RNNs and Vanishing Gradients (Section 4.3)

Mar 14 – Mar 18: Tricks for Training RNNs
    Readings: Deep Learning for NLP Best Practices

Mar 18 – Apr 4: Language Models
    Readings: SLP3 Ch 3; Goldberg 9, 10.5.5; Character Aware Neural LMs; Exploring the Limits of Language Modeling

Apr 4 – Apr 8: POS Tagging with Hidden Markov Models
    Readings: SLP3 Ch 9, 10.1-10.4

Apr 8 – Apr 11: Named Entity Recognition with CRFs
    Readings: Notes (Section 4); Detailed Notes; Non-Local Features and Knowledge in NER

Apr 11 – Apr 15: BiLSTM+CRF and Other Neural Models for Sequence Labeling
    Readings: Goldberg 19.1-19.3, 19.4.2; Bidirectional LSTM-CRF Models

Apr 15: Seq2Seq Models & Attention
    Readings: Goldberg 17.1, 17.2, 17.4; Attention is All You Need

Apr 18 – Apr 22: Statistical Natural Language Parsing
    Readings: SLP3 Ch 12, 13.1-13.2; Lecture Notes on PCFGs; Lecture Notes on Lexicalized PCFGs

Apr 18 – May 7: Assignment 3 (Format Checker)

Apr 25: BERT
    Readings: BERT Paper

Apr 25: Wrap Up
    Readings: Noam Chomsky on ML; Noam Chomsky on AI; Peter Norvig on Chomsky



Textbook and Readings

Yoav Goldberg, Neural Network Methods for Natural Language Processing,
Morgan and Claypool (2017) (required).

Dan Jurafsky and James Martin, Speech and Language Processing, 3rd Edition
(under development).

Grading

Assignments: 30%; Project: 20%; Minors: 20%; Final: 30%; Class participation, online discussions: extra credit.

Course Administration and Policies

Cheating vs. Collaborating Guidelines

As adapted from Dan Weld's guidelines.

Collaboration is a very good thing. Cheating, on the other hand, is a very serious offense. Please don't do it! Concern about cheating creates an unpleasant environment for everyone. If you cheat, you will receive a zero on the assignment, and you additionally risk losing your position as a student in the department and the institute. The department's policy is to report any case of cheating to the disciplinary committee. What follows afterwards is not fun.

So how do you draw the line between collaboration and cheating? Here's a reasonable set of ground rules. Failure to understand and follow these rules will constitute cheating, and will be dealt with as per institute guidelines.