COL774: Machine Learning



General Information

Semester: Sem I, 2023-24.

Instructor: Parag Singla (email: parags AT cse.iitd.ac.in)

Class Timings (Slot C):
  • Tue, 8:00 am - 8:50am
  • Wed, 8:00 am - 8:50am
  • Fri, 8:00 am - 8:50am
Venue: LHC 418

Sign up for Piazza
Code: As (will be) announced in class.

Sign up to enroll in the course (if not already enrolled): (vacancy has been increased!) Click here.
Notes:
  1. Only students from CSE/EE/Maths/SIT/ScAI should fill this (unless I have confirmed an exception over email).
  2. Students who only plan to sit through this course should NOT fill this form.

TA Assignment: Click here

Announcements

  • [Oct 21, 2023] Assignment 3 (Both Parts) is out! Due Date (for both parts): Tuesday October 31st, 11:50 pm.
  • [Oct 8, 2023] Assignment 3 (Part I) is out! Due Date (for both parts): Sunday October 29th, 11:50 pm.
  • [Sep 26, 2023] Assignment 2 (Complete) is out! Due Date: Wednesday Oct 4, 11:50 pm.
  • [Sep 6, 2023] Assignment 2 (Part I) is out! Due Date: Wednesday Oct 4, 11:50 pm.
  • [Aug 25, 2023] TA assignment has been posted on the website!
  • [Aug 13, 2023] Assignment 1 is out! Due Date: Friday Sep 1, 11:50 pm.
  • [July 25, 2023] Course Website has been updated!

    Course Objectives

    (a) To become familiar with and develop an understanding of the fundamental concepts of Machine Learning (ML)
    (b) To develop an understanding of how a variety of ML algorithms work (both supervised and unsupervised)
    (c) To learn to apply ML algorithms to real-world data and problems
    (d) To keep up to date with some of the latest advances in the field

    Course Content

    NOTE: The exact list of topics below is tentative (until we are past that week). We will update it as we go through the lectures in each week. So, stay tuned!

    Week | Topic | Supplementary Notes (by Andrew Ng and Others) | Class Notes/Other Resources
    1 | Introduction | - | July 25, July 26, July 28
    2 | Supervised Learning Basics - Linear Regression, Gradient Descent | lin-log-reg.pdf | Aug 1, Aug 2, Aug 4
    3 | Gradient Descent (Including Convergence Properties), Stochastic Gradient Descent | lin-log-reg.pdf | Aug 8, Aug 9, Aug 11, Aug 16, Aug 18
    4 | Linear Regression - alternate interpretation (probabilistic), Logistic Regression, GLMs | lin-log-reg.pdf | Aug 19, Aug 22, Aug 23
    5 | Gaussian Discriminant Analysis (GDA) | gda_nb.pdf | Aug 25, Aug 29, Aug 30
    6 | Naive Bayes | gda_nb.pdf | Sep 1, Sep 5, Sep 6
    7,8 | Support Vector Machines | svm.pdf | Sep 19, Sep 20, Sep 22, Sep 26, Sep 29
    9 | Decision Trees, Random Forests | Mitchell, Chapter 3; dtrees.pdf; Online Resources: Random Forests, Gradient Boosting - Wikipedia, Paper by Friedman (2001) (up to Section 4.5) | Oct 10, Oct 11, Oct 13, Oct 14
    10 | Neural Networks | Mitchell, Chapter 4; nnets.pdf; nnets-hw.pdf | Oct 17, Oct 18, Oct 20
    11 | Deep Learning | cnn.pdf; Online Resource: Convolutional Neural Networks | Oct 25, Oct 27, Oct 28-slides, Oct 28-notes
    12 | K-Means, Gaussian Mixture Models | kmeans.pdf; gmm.pdf | Oct 31, Nov 1, Nov 3
    13 | Expectation Maximization (EM), Principal Component Analysis (PCA) | em.pdf; pca.pdf | Nov 7, Nov 8, Nov 10
    14 | Learning Theory, Model Selection | Mitchell, Chapter 7; theory.pdf; model.pdf | Nov 14, Nov 15, Nov 17

    Class Notes/Videos (Date-Wise):

    For week-wise notes, see the Content Table Above.
    Video Lectures: Aug 30, Sep 1, Sep 5, Sep 6, Sep 19, Sep 20, Sep 22, Sep 26, Sep 29, Oct 10, Oct 11, Oct 13, Oct 14, Oct 17, Oct 18, Oct 20, Oct 25, Oct 27, Oct 28 (Part a), Oct 28 (Part b), Oct 31,
    Nov 1, Nov 3, Nov 7, Nov 8, Nov 10, Nov 14, Nov 15, Nov 17

    Video lectures from the previous offering can be accessed here: COL 774, Sem I, 2021-22 Course Page (search for Videos)

    Additional Resources

    Review Material

    Topic Notes
    Probability prob.pdf
    Linear Algebra linalg.pdf
    Gaussian Distribution gaussians.pdf
    Convex Optimization (1) convex-1.pdf

    References (latest)

    References (older)

    Assignment Submission Instructions

    1. You are free to discuss the assignment problems with other students in the class. But all your code should be produced independently without looking at/referring to anyone else's code.
    2. Python is the default programming language for the course. You should use it for programming your assignments unless explicitly allowed otherwise.
    3. Code should be submitted using the Moodle page. Make sure to include comments for readability.
    4. Create a separate directory for each question, named by the question number. For instance, for question 1, all your submission files should be put in the directory named Q1 (and so on for other questions). Put all the question sub-directories in a single top-level directory. This directory should be named "yourentrynumber_firstname_lastname". For example, if your entry number is "2021cs19535" and your name is "Nitika Rao", your submission directory should be named "2021cs19535_nitika_rao". You should zip your directory and name the resulting file "yourentrynumber_firstname_lastname.zip", e.g. in the above example it will be "2021cs19535_nitika_rao.zip". This single zip file should be submitted online.
    5. Honor Code: Any cases of copying will be awarded a zero on the assignment and an additional penalty equal to the negative of the total weightage of the assignment. More severe penalties may follow.
    6. Late Policy: You are allowed a total of 5 late (buffer) days across the first 3 assignments. You are free to decide how you would like to use them. The late policy (if any) for the last assignment will be announced separately. You will incur a 10% deduction in marks per day for every additional late day used beyond the allowed 5 buffer days (applicable to the first 3 assignments only).
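
The directory layout in item 4 and the buffer-day arithmetic in item 6 can be sketched in Python as follows. This is only an illustrative helper, not an official course script: the function names (submission_name, package_submission, penalized_days, marks_after_penalty) are hypothetical, and the penalty calculation assumes the 10% per-day deduction applies linearly to the marks of the late assignment and is floored at zero, which the policy above does not spell out.

```python
import re
import shutil
from pathlib import Path

BUFFER_DAYS = 5          # buffer days shared across the first 3 assignments
PENALTY_PER_DAY = 0.10   # assumed: 10% of the assignment's marks per extra day


def submission_name(entry_number, first_name, last_name):
    """Build the lowercase name "yourentrynumber_firstname_lastname"."""
    parts = (entry_number, first_name, last_name)
    return "_".join(p.strip().lower() for p in parts)


def package_submission(work_dir, entry_number, first_name, last_name, out_dir="."):
    """Copy the Q1, Q2, ... directories from work_dir into one top-level
    directory with the required name, then zip that directory."""
    name = submission_name(entry_number, first_name, last_name)
    out_dir = Path(out_dir).resolve()
    top = out_dir / name
    top.mkdir(parents=True, exist_ok=True)
    for q_dir in sorted(Path(work_dir).iterdir()):
        # Only directories named Q<number> are treated as question folders.
        if q_dir.is_dir() and re.fullmatch(r"Q\d+", q_dir.name):
            shutil.copytree(q_dir, top / q_dir.name, dirs_exist_ok=True)
    # Produces <out_dir>/<name>.zip containing the <name>/ directory.
    archive = shutil.make_archive(str(top), "zip", root_dir=str(out_dir), base_dir=name)
    return Path(archive)


def penalized_days(late_days_per_assignment):
    """Late days beyond the shared buffer, given late days used on each of
    the first 3 assignments."""
    return max(0, sum(late_days_per_assignment) - BUFFER_DAYS)


def marks_after_penalty(marks, extra_days):
    """Marks left after the assumed linear 10%-per-day deduction."""
    return max(0.0, marks * (1 - PENALTY_PER_DAY * extra_days))
```

For example, package_submission("my_work", "2021cs19535", "Nitika", "Rao") would write "2021cs19535_nitika_rao.zip" to the current directory, and penalized_days([2, 3, 2]) counts 2 late days beyond the buffer.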

    Practice Questions

    Assignments

    1. Assignment 3
      Starter Code and Dataset. Part A [Updated: see Piazza post for details].
      Starter Code and Dataset. Part B
      Due Date (both parts) [Updated]: Tuesday October 31st, 11:50 pm.
    2. Assignment 2 [Updated! Sep 26th, 2023]
      Datasets. Part 1, Part 2 (linked from pdf)
      Due Date: Wednesday October 4, 2023. 11:50 pm
    3. Assignment 1
      Datasets: ass1_data.zip
      Due Date: Friday September 1, 2023. 11:50 pm

    Grading Policy (Tentative)

    Assignments (4) Ass1: 7%, Ass2: 9%, Ass3: 9%, Ass4: 10%. [Total Assignment Weight: 35%]
    Minor 25%
    Major 40%
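
As a rough illustration of how the (tentative) weights above combine, the sketch below computes a weighted course total from per-component percentage scores. The names WEIGHTS and course_total are hypothetical, and the weights are copied from the tentative policy, so they may change.

```python
# Tentative component weights (in percent), as listed in the grading policy.
WEIGHTS = {
    "Ass1": 7,
    "Ass2": 9,
    "Ass3": 9,
    "Ass4": 10,
    "Minor": 25,
    "Major": 40,
}


def course_total(percent_scores):
    """Weighted total out of 100.

    percent_scores maps each component name in WEIGHTS to the score
    obtained in that component, expressed as a percentage (0-100).
    """
    return sum(WEIGHTS[name] * percent_scores[name] for name in WEIGHTS) / 100
```

The assignment weights add up to 35 and all six components add up to 100, so scoring 80% in every component gives a course total of 80.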