Instructor: Srikanta Bedathur (walk-in office timings: Mon./Thu. 3:30-5:30pm)
Course slot: AA (Mon., Thu. 2:00-3:30)
Location: IIA - 201
Assistants:
1. Pawan Kumar
2. Namrata Jain
Overview
Information retrieval, also known as “search”, plays a central role in our modern digital lives. In this course we cover the fundamental concepts of information retrieval as well as some of the recent advances in the field, such as the use of knowledge graphs for retrieval, neural methods for retrieval tasks, issues of fairness and fake news, and the use of succinct data structures in building efficient search systems.
The course consists of roughly 50% lectures, with the rest comprising student presentations, homework assignments, and paper-reading tasks spread throughout the semester. There will also be a course project.
Contents
Part I: Basics
- Retrieval framework
- Models: Boolean, vector-space, probabilistic, language modeling
- Systems: search engine framework, inverted indexes – construction and storage, compression, algorithms
- Ecosystem: TREC and other commonly used collections
- Evaluation: metrics for evaluation and experiment protocols
Part II: Advanced
- Advanced retrieval models: phrase queries and proximity-based models, diversification, passage retrieval
- Learning to Rank: methods, embeddings and semantic matching by query expansion, neural information retrieval
- Knowledge graphs: entity-centric search enhanced using knowledge graphs
- Responsible IR: fake news, privacy, bias and fairness
Part III:
- Paper presentations by students (list of papers will be announced shortly)
Prerequisites
- Data structures and algorithms
- Comfortable with programming in Java/C++, probability and statistics, and linear algebra
- Background in machine learning and/or NLP is desirable but not mandatory
Textbooks
- Introduction to Information Retrieval by Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze, Cambridge University Press. I strongly encourage you to own a copy of this book. A high-quality preprint of the book is available from the book website.
- Modern Information Retrieval: The Concepts and Technology behind Search by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, 2010.
- Search Engines: Information Retrieval in Practice by Croft, Metzler and Strohman, 2010.
- Information Retrieval: Implementing and Evaluating Search Engines by Büttcher, Clarke and Cormack, MIT Press, 2010.
Calendar
(Note: Slides will be posted on Moodle.)
Date | Topic | Notes |
---|---|---|
22-07 | Introduction and Course Organization | |
Grading scheme (tentative)
Activity | Weight |
---|---|
Mid-term | 20% |
Major | 20% |
Paper reading and report | 10% (+10% extra) |
Project | 25% |
Assignments | 25% |