COL 864 - Spl Topics in AI (2019-20 Sem 1)

Instructor: Srikanta Bedathur (walk-in office timings: Mon./Thu., 3:30-5:30pm)
Course slot: AA (Mon., Thu. 2:00-3:30)
Location: IIA - 201
Assistants: 1. Pawan Kumar 2. Namrata Jain


Overview

Information retrieval, also known as “search”, plays a central role in our modern digital lives. In this course we cover the fundamental concepts of information retrieval as well as some recent advances in the field, such as the use of knowledge graphs for retrieval, neural methods for retrieval tasks, issues of fairness and fake news, and the use of succinct data structures in building efficient search systems.

Roughly 50% of the course consists of lectures, with the rest comprising student presentations, homework assignments, and paper-reading tasks spread throughout the semester. There will also be a course project.

Contents

  1. Part I: Basics

    • Retrieval framework
    • Models: Boolean, vector-space, probabilistic, language modeling
    • Systems: search engine framework, inverted indexes – construction and storage, compression, algorithms
    • Ecosystem: TREC and other commonly used collections
    • Evaluation: metrics for evaluation and experiment protocols
  2. Part II: Advanced

    • Advanced retrieval models: phrase queries and proximity-based models, diversification, passage retrieval
    • Learning to Rank: methods, embeddings and semantic matching by query expansion, neural information retrieval
    • Knowledge graphs: entity-centric search, enhanced using knowledge graphs
    • Responsible IR: fake news, privacy, bias and fairness
  3. Part III: Student presentations

    • Paper presentations by students (list of papers will be announced shortly)

Prerequisites

  • Data structures and algorithms
  • Comfort with programming in Java/C++, probability and statistics, and linear algebra
  • Background in machine learning and/or NLP is desirable but not mandatory

Textbooks

  1. Introduction to Information Retrieval by Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze, Cambridge University Press. I strongly encourage you to own a copy of this book. A high-quality preprint of the book is available from the book website.
  2. Modern Information Retrieval: The Concepts and Technology behind Search by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, 2010.
  3. Search Engines: Information Retrieval in Practice by Croft, Metzler and Strohman, 2010.
  4. Information Retrieval – Implementing and Evaluating Search Engines by Büttcher, Clarke and Cormack, MIT Press, 2010.

Calendar

(Note: Slides will be posted on Moodle.)

Date    Topic                                   Notes
22-07   Introduction and Course Organization

Grading scheme (tentative)

Activity                    Weight
Mid-term                    20%
Major                       20%
Paper reading and report    10% (+10% extra)
Project                     25%
Assignments                 25%
Srikanta Bedathur
Associate Professor and Pankaj Gupta Faculty Fellow
