Introduction to Machine Learning (ELL784)


General Information

No one shall be permitted to audit the course in an off-line (contact) semester. People are welcome to sit through it, however. For an online semester, the special IITD audit rules apply. Owing to number constraints, we are compelled to open this course primarily to M.Tech, M.S.(R) and Ph.D. students of the EE Department and the Bharti School, i.e., students with the following programme codes:
EEN/EEE/EEA/EET/EES/EEP/EEY/BSY/EEZ/BSZ/JVL/JOP/JTM/JVY
For others, it will be a first-come-first-served process. This course is not open to B.Tech and Dual Degree students, who are supposed to opt for ELL409 (Machine Intelligence and Learning) instead. This is a Departmental Elective (DE), one of the `essential electives' for the Cognitive and Intelligent Systems (CIS) stream of the Computer Technology Group, Department of Electrical Engineering. A general note for all EE Machine Learning courses: students will be permitted to take only one of the following courses: ELL409 (Machine Intelligence and Learning), and the two CSE Machine Learning courses, COL341 (Machine Learning) and COL774 (Machine Learning). Those who do not fulfil the above criteria and are found enrolled after the completion of the add-drop period will be forcibly removed from the course.

For those who fulfil the above criteria, please note a cap of 50 seats, allotted on a first-come-first-served basis.
People are welcome to sit through the lectures without formal registration. Please drop an email to the instructor with your preferred email address for communication. You can additionally join the WhatsApp group for the course: https://chat.whatsapp.com/Bv2Si65HR4C4iigTYrXRm8

Credits: 3 (LTP: 3-0-0) [Slot C]

Schedule for Classes:

Tuesday
08:00 am - 09:00 am
MS-Teams (online)
Wednesday
08:00 am - 09:00 am
MS-Teams (online)
Friday
08:00 am - 09:00 am
MS-Teams (online)

Schedule for Examinations:

Minor: 15 February 2022 (Tuesday), 01:30 pm - 02:30 pm.
Scanned scripts are to be uploaded on Gradescope by 03:00 pm.

Major: 10 April 2022 (Sunday), 08:15 am - 09:15 am. [Venue: LH-408]
The exam will be an in-person exam, conducted as above. Scanned scripts are to be uploaded on Gradescope by 09:45 am.

Teaching Assistants: 


Books, Papers and other Documentation

Textbook:

Reference Books:

Papers:

Some Interesting Web Links:


Lecture Schedule, Links to Material

Please see the link to the II Sem (Spring) 2020-2021 offering of this course, for an idea of the approximate structure of the course.
S.No.
Topics
Lectures
Instructor
References/Notes
0
Introduction to Machine Learning
01-01
SDR
Flavours of Machine Learning: Unsupervised, Supervised, Reinforcement, Hybrid models. Decision Boundaries: crisp, and non-crisp, optimisation problems. Examples of unsupervised learning.
04 Jan (Tue) {lecture#01}
SDR
MS-Teams folder: video_k_means_em1_04jan22.mp4, slides_k_means_em1_04jan22.pdf, lecture_notes_k_means_em1_04jan22.pdf
1
Unsupervised Learning:
K-Means, Gaussian Mixture Models, EM
01-06
SDR
[Bishop Chap.9], [Do: Gaussians], [Do: More on Gaussians], [Ng: K-Means], [Ng: GMM], [Ng: EM], [Smyth: EM]
The K-Means algorithm: Introduction. Algorithms: history, flavours. A mathematical formulation of the K-Means algorithm: the objective function to minimise.
04 Jan (Tue) {lecture#01}
SDR
MS-Teams folder: video_k_means_em1_04jan22.mp4, slides_k_means_em1_04jan22.pdf, lecture_notes_k_means_em1_04jan22.pdf
The basic K-Means algorithm, computation complexity issues: each step, overall. Limitations of K-Means. K-Means: Alternate formulation with a distance threshold.
05 Jan (Wed) {lecture#02}
SDR
MS-Teams folder: video_k_means_em2_05jan22.mp4, slides_k_means_em2_05jan22.pdf, lecture_notes_k_means_em2_05jan22.pdf
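The basic K-Means loop above (assign each point to its nearest mean, then recompute each mean as the centroid of its assigned points) can be sketched in a few lines of NumPy. This is a minimal illustration on toy data of our own choosing, not the assignment code; note the assignment step costs O(NKD) per iteration, as discussed in the lecture.

```python
import numpy as np

def k_means(X, K, n_iters=100, seed=0):
    """A minimal K-Means sketch: X is (N, D), K the number of clusters."""
    rng = np.random.default_rng(seed)
    # Initialise means by picking K distinct data points at random.
    means = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Assignment step: O(N*K*D) -- each point goes to its nearest mean.
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each mean becomes the centroid of its assigned points.
        new_means = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                              else means[k] for k in range(K)])
        if np.allclose(new_means, means):   # converged: assignments stable
            break
        means = new_means
    return means, labels

# Two well-separated toy blobs around (0, 0) and (10, 10).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])
means, labels = k_means(X, K=2)
```

On such well-separated data the algorithm recovers the two blob centres; the limitations discussed in the lecture (sensitivity to initialisation, hard assignments) motivate the move to mixture models.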
An introduction to Gaussian Mixture Models. The Bayes rule, and Responsibilities. Maximum Likelihood Estimation. Parameter estimation for a mixture of Gaussians, starting with a simple 1-D single Gaussian case. ML-Estimation: the simple case of one 1-D Gaussian, to the general case of K D-dimensional Gaussians.
07 Jan (Fri) {lecture#03}
SDR
MS-Teams folder: video_k_means_em3_07jan22.mp4, slides_k_means_em3_07jan22.pdf, lecture_notes_k_means_em3_07jan22.pdf
The general case of K D-dimensional Gaussians. Getting stuck, using Lagrange Multipliers. The EM Algorithm for Gaussian Mixtures.
Application: Assignment 1: The Stauffer and Grimson Adaptive Background Subtraction Algorithm. An introduction to the basic set of interesting heuristics!
11 Jan (Tue) {lecture#04}
SDR
MS-Teams folder: video_k_means_em4_11jan21.mp4, slides_k_means_em4_11jan21.pdf, lecture_notes_k_means_em4_11jan21.pdf
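The EM iteration for a Gaussian mixture (E-step: compute responsibilities via the Bayes rule; M-step: re-estimate the mixing coefficients, means, and variances from the soft assignments) can be sketched for the 1-D case as below. This is a minimal sketch with our own toy data and a simple quantile-based initialisation, not a general-purpose implementation.

```python
import numpy as np

def em_gmm_1d(x, K, n_iters=50):
    """A minimal EM sketch for a 1-D mixture of K Gaussians."""
    N = len(x)
    pi = np.full(K, 1.0 / K)                      # mixing coefficients
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))  # spread-out initial means
    var = np.full(K, x.var())                      # initial variances
    for _ in range(n_iters):
        # E-step: responsibilities gamma(z_nk) via the Bayes rule.
        pdf = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        gamma = pi * pdf
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the weighted (soft) assignments.
        Nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        pi = Nk / N
    return pi, mu, var

# Toy data: two clear 1-D clusters at -5 and +5.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-5, 1, 300), rng.normal(5, 1, 300)])
pi, mu, var = em_gmm_1d(x, K=2)
```

The soft responsibilities are exactly what distinguishes this from K-Means, whose hard assignments it generalises.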
The Stauffer and Grimson algorithm (contd)
12 Jan (Wed) {lecture#05}
SDR
MS-Teams folder: video_k_means_em5_12jan21.mp4, slides_k_means_em5_12jan21.pdf, lecture_notes_k_means_em5_12jan21.pdf
The Stauffer and Grimson algorithm (contd)
15 Jan (Sat) {lecture#06}
SDR
MS-Teams folder: video_k_means_em6_eigen1_15jan22.mp4, slides_k_means_em6_15jan22.pdf
2
Unsupervised Learning: EigenAnalysis:
PCA, LDA and Subspaces
06-10
SDR
[Ng: PCA], [Ng: ICA], [Burges: Dimension Reduction], [Bishop Chap.12]
Introduction to Eigenvalues and Eigenvectors. Properties of Eigenvalues and Eigenvectors.
15 Jan (Sat) {lecture#06}
SDR
MS-Teams folder: video_k_means_em6_eigen1_15jan22.mp4, slides_eigen1_15jan22.pdf
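The basic eigenvector properties from the lecture can be checked numerically: for a real symmetric matrix the eigenvalues are real and the eigenvectors can be chosen orthonormal, and each pair satisfies A v = lambda v. A small sketch with a matrix of our own choosing:

```python
import numpy as np

# For a real symmetric matrix, eigenvalues are real and eigenvectors are
# orthogonal; np.linalg.eigh is the symmetric/Hermitian eigensolver and
# returns eigenvalues in ascending order.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
evals, evecs = np.linalg.eigh(A)

# Each column of evecs is an eigenvector: A v_i = lambda_i v_i.
residual = A @ evecs - evecs * evals
```

For this matrix the eigenvalues work out to 1 and 3 (eigenvectors along (1, -1) and (1, 1)), a standard hand computation worth verifying against the code.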
Properties of Eigenvalues and Eigenvectors (contd).
18 Jan (Tue) {lecture#07}
SDR
MS-Teams folder: video_eigen2_18jan22.mp4, slides_eigen2_18jan22.pdf
Properties of Eigenvalues and Eigenvectors (contd). Gram-Schmidt Orthogonalisation, other properties.
19 Jan (Wed) {lecture#08}
SDR
MS-Teams folder: video_eigen3_19jan22.mp4, slides_eigen3_19jan22.pdf
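Gram-Schmidt orthogonalisation, as covered above, subtracts from each vector its projections onto the orthonormal vectors built so far. A minimal sketch of the classical variant (the example matrix is our own; this is not production code, which would prefer the modified variant for numerical stability):

```python
import numpy as np

def gram_schmidt(V):
    """Classical Gram-Schmidt sketch: orthonormalise the columns of V."""
    Q = np.zeros_like(V, dtype=float)
    for j in range(V.shape[1]):
        v = V[:, j].astype(float)
        # Subtract the projections onto the already-built orthonormal vectors.
        for i in range(j):
            v -= (Q[:, i] @ V[:, j]) * Q[:, i]
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# Three linearly independent columns to orthonormalise.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(A)
```

The result satisfies Q^T Q = I, and Q spans the same column space as A.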
The KL Transform (contd). The SVD and its properties.
21 Jan (Fri) {lecture#09}
SDR
MS-Teams folder: video_eigen4_21jan22.mp4, slides_eigen4_21jan22.pdf
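The SVD properties above can be seen numerically: A = U diag(s) V^T reconstructs A exactly, and truncating to the top-k singular values gives the best rank-k approximation in the Frobenius norm (Eckart-Young). A small sketch with a matrix of our own choosing:

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Full reconstruction recovers A exactly.
A_full = U @ np.diag(s) @ Vt

# Rank-1 approximation from the leading singular triplet.
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0])
```

For this rank-2 matrix, the Frobenius-norm error of the rank-1 approximation is exactly the discarded singular value s[1] — the same property that underlies eigenface-style dimensionality reduction in Assignment 2.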
The SVD and its properties (contd). Application: Assignment 2: Eigenfaces and Fisherfaces
25 Jan (Tue) {lecture#10}
SDR
MS-Teams folder: video_eigen5_linear1_25jan22.mp4, slides_eigen5_25jan22.pdf
3
Linear Models for Regression, Classification
10-14
SDR
[Bishop Chap.3], [Bishop Chap.4], [Ng: Supervised, Discriminant Analysis], [Ng: Generative]
General introduction to Regression and Classification.
25 Jan (Tue) {lecture#10}
SDR
MS-Teams folder: video_eigen5_linear1_25jan22.mp4, slides_linear1_25jan22.pdf
Linearity and restricted non-linearity. Maximum Likelihood and Least Squares. The Moore-Penrose Pseudo-inverse.
28 Jan (Fri) {lecture#11}
SDR
MS-Teams folder: video_linear2_28jan22.mp4, slides_linear2_28jan22.pdf, lecture_notes_linear2_28jan22.pdf
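The maximum-likelihood / least-squares solution above, w = (Phi^T Phi)^{-1} Phi^T y, is exactly the Moore-Penrose pseudo-inverse applied to the targets. A minimal sketch on a toy noisy line of our own making (the true intercept and slope, 2 and 3, are recovered):

```python
import numpy as np

# Least-squares fit y ~ Phi w via the Moore-Penrose pseudo-inverse:
# w = (Phi^T Phi)^{-1} Phi^T y = pinv(Phi) y  (when Phi has full column rank).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 0.01, size=x.shape)  # noisy line

Phi = np.column_stack([np.ones_like(x), x])  # design matrix: [1, x] per row
w = np.linalg.pinv(Phi) @ y                  # w[0] = intercept, w[1] = slope
```

Regularised least squares simply adds lambda I inside the inverse, shrinking w; the unregularised solution here is the lambda = 0 case.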
The Moore-Penrose Pseudo-inverse (contd). Regularised Least Squares.
Three approaches to classification: restricted non-linear models (a linear combination of possibly non-linear feature transformations). Introduction to linear models: the equation of a line in terms of the physical significance of the space, and the weights w. Linear Discriminant Functions: 2 classes, and K classes. Fisher's Linear Discriminant (basic build-up). Application: Assignment 2: Eigenfaces and Fisherfaces
29 Jan (Sat) {lecture#12}
SDR
MS-Teams folder: video_linear3_29jan22.mp4, slides_linear3_29jan22.pdf
Introduction to linear models: the equation of a line in terms of the physical significance of the space, and the weights w (contd). Linear Discriminant Functions: 2 classes, and K classes. Fisher's Linear Discriminant (basic build-up). Application: Assignment 2: Eigenfaces and Fisherfaces
01 Feb (Tue) {lecture#13}
SDR
MS-Teams folder: video_linear4_01feb22.mp4, slides_linear4_01feb22.pdf
Fisher's Linear Discriminant (contd). Application: Assignment 2: Eigenfaces and Fisherfaces
02 Feb (Wed) {lecture#14}
SDR
MS-Teams folder: video_linear5_svm1_02feb22.mp4, slides_linear5_02feb22.pdf
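Fisher's Linear Discriminant, built up above, projects the data onto the direction w proportional to S_W^{-1} (m2 - m1), where S_W is the within-class scatter. A minimal two-class sketch on toy data of our own choosing (the Fisherfaces part of Assignment 2 applies the same idea after a PCA step):

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher's linear discriminant sketch: w proportional to S_W^{-1}(m2 - m1)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the two classes' scatter matrices.
    S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(S_W, m2 - m1)
    return w / np.linalg.norm(w)

# Two tight 2-D classes around (0, 0) and (3, 3).
rng = np.random.default_rng(3)
X1 = rng.normal([0.0, 0.0], 0.3, (100, 2))
X2 = rng.normal([3.0, 3.0], 0.3, (100, 2))
w = fisher_direction(X1, X2)
```

Projecting onto w separates the two classes completely here: every projected point of class 1 lies below every projected point of class 2.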
4
SVMs
14-20
SDR
[Bishop Chap.7], [Alex: SVMs], [Ng: SVMs], [Burges: SVMs], [Bishop Chap.6]
SVMs: the concept of the margin.
02 Feb (Wed) {lecture#14}
SDR
MS-Teams folder: video_linear5_svm1_02feb22.mp4, slides_svm1_02feb22.pdf
SVMs: the optimisation problem, getting the physical significance of the y = +1 and y = -1 lines. The two `golden' regions for the 2-class perfectly separable case. The generalised canonical representation in terms of one inequation.
04 Feb (Fri) {lecture#15}
SDR
MS-Teams folder: video_svm2_04feb22.mp4, slides_svm2_04feb22.pdf
The basic SVM optimisation: the primal and the dual problems. An illustration of the kernel trick. Lagrange Multipliers and the KKT Conditions.
08 Feb (Tue) {lecture#16}
SDR
MS-Teams folder: video_svm3_08feb22.mp4, slides_svm3_08feb22.pdf
Lagrange Multipliers and the KKT Conditions (contd). The Soft-Margin SVM: abstracting the basic concepts of the hard-margin SVM, to use in a similar formulation. The function to optimise, the inequality constraints, the KKT conditions from Lagrange's theory. The Primal and Dual formulations. Introduction to Kernels.
09 Feb (Wed) {lecture#17}
SDR
MS-Teams folder: video_svm4_09feb22.mp4, slides_svm4_09feb22.pdf
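The kernel trick illustrated in the lecture can be made concrete with the polynomial kernel: for 2-D inputs, k(x, z) = (x . z)^2 equals the ordinary inner product after the explicit feature map phi(x) = (x1^2, sqrt(2) x1 x2, x2^2), computed without ever forming phi. A small sketch (the example vectors are our own):

```python
import numpy as np

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_trick = (x @ z) ** 2          # kernel evaluation: stays in input space
k_explicit = phi(x) @ phi(z)    # same value, via the 3-D feature space
```

The two numbers agree exactly; the dual SVM only ever needs such kernel evaluations, never the (possibly very high-dimensional) feature vectors themselves.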
---
Minor
Minor: 15 February 2022 (Tuesday)
---
01:30 pm - 02:30 pm. Scanned scripts to be uploaded by 03:00 pm.
Introduction to Kernels. Kernels in Regression.
18 Feb (Fri) {lecture#18}
SDR
MS-Teams folder: video_kernel2_18feb22.mp4, lecture_notes_kernel2_18feb22.pdf
Kernels in Regression (contd).
22 Feb (Tue) {lecture#19}
SDR
MS-Teams folder: video_kernel3_22feb22.mp4, lecture_notes_kernel3_22feb22.pdf
Properties of kernels, Constructing kernels.
The Soft-Margin SVM (contd). The physical significance of the slack parameter in the formulation. The hard-margin SVM `template' into which we wish to fit the new soft-margin SVM formulation.
23 Feb (Wed) {lecture#20}
SDR
MS-Teams folder: video_kernel4_svm5_23feb22.mp4, lecture_notes_kernel4_23feb22.pdf, lecture_notes_svm5_23feb22.pdf
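Kernels in regression, as covered above, lead to predictors of the form f(x) = k(x)^T (K + lambda I)^{-1} y, where K is the Gram matrix over the training inputs. A minimal kernel ridge regression sketch with an RBF kernel (the toy target, lambda, and gamma values are our own choices):

```python
import numpy as np

def kernel_ridge_predict(X, y, X_test, lam=1e-2, gamma=10.0):
    """Kernel ridge regression sketch: f(x) = k(x)^T (K + lam I)^{-1} y."""
    def rbf(A, B):
        # Squared distances between all pairs of rows, then the RBF kernel.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * d2)
    K = rbf(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return rbf(X_test, X) @ alpha

# Fit one period of a sine wave -- clearly beyond any linear model in x.
X = np.linspace(0, 1, 40)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
y_hat = kernel_ridge_predict(X, y, X)
```

The dual weights alpha play the same role as the Lagrange multipliers in the SVM dual: the prediction is a kernel-weighted combination of the training points.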
5
Neural Networks
21-xx
SDR
[Bishop Chap.5], [DL Chap.6], [DL Chap.9]
Introduction to Neural Networks: the Multi-Layer Perceptron: Conventions, restricted non-linearity.
25 Feb (Fri) {lecture#21}
SDR
MS-Teams folder: video_nn1_25feb22.mp4, lecture_notes_nn1_25feb22.pdf
Basic Perceptron
02 Mar (Wed) {lecture#22}
SDR
MS-Teams folder: video_nn2_02mar22.mp4, lecture_notes_nn2_02mar22.pdf
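The basic perceptron learning rule (add t_n phi_n to the weights whenever a pattern is misclassified) is a few lines of code; convergence is guaranteed for linearly separable data. A minimal sketch on a toy separable problem of our own choosing, with the bias absorbed as a constant feature:

```python
import numpy as np

def perceptron(X, t, n_epochs=50):
    """Basic perceptron sketch: targets t in {-1, +1}, bias absorbed into w."""
    Phi = np.column_stack([np.ones(len(X)), X])  # prepend a bias feature
    w = np.zeros(Phi.shape[1])
    for _ in range(n_epochs):
        errors = 0
        for phi_n, t_n in zip(Phi, t):
            if t_n * (w @ phi_n) <= 0:   # misclassified (or on the boundary)
                w += t_n * phi_n         # the perceptron update rule
                errors += 1
        if errors == 0:                  # converged: every point correct
            break
    return w

# Linearly separable toy data: class is the sign of x1 + x2 - 1.
X = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]])
t = np.array([-1, 1, 1, 1])
w = perceptron(X, t)
```

On XOR-like (non-separable) data the same loop never terminates with zero errors, which is part of the motivation for multi-layer networks later in the course.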
The Basic Perceptron (contd).
Mathematical Basics: `factorisation'
04 Mar (Fri) {lecture#23}
SDR
MS-Teams folder: video_nn3_04mar22.mp4, lecture_notes_nn3_04mar22.pdf
Mathematical Basics: `factorisation' (contd). Second order Taylor series expansion.
05 Mar (Sat) {lecture#24}
SDR
MS-Teams folder: video_nn4_05mar22.mp4, lecture_notes_nn4_05mar22.pdf
Mathematical Interlude (contd): The need for a second order Taylor series expansion. Eigenanalysis of a Hessian.
The Backpropagation Algorithm: some initial points
08 Mar (Tue) {lecture#25}
SDR
MS-Teams folder: video_nn5_08mar22.mp4, lecture_notes_nn5_08mar22.pdf
The Backpropagation Algorithm (contd).
09 Mar (Wed) {lecture#26}
SDR
MS-Teams folder: video_nn6_09mar22.mp4, lecture_notes_nn6_09mar22.pdf
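The backpropagation algorithm for a single tanh hidden layer with squared error can be sketched as below, together with the standard sanity check: comparing one backpropagated derivative against a finite-difference estimate. The network sizes and inputs are our own toy choices.

```python
import numpy as np

def forward_backward(W1, W2, x, t):
    """One tanh hidden layer, squared error; returns loss and gradients."""
    a = W1 @ x          # hidden pre-activations
    z = np.tanh(a)      # hidden activations
    y = W2 @ z          # linear output
    loss = 0.5 * np.sum((y - t) ** 2)
    # Backward pass: propagate deltas from the output layer back to the input.
    delta_out = y - t                               # dL/dy
    dW2 = np.outer(delta_out, z)
    delta_hid = (W2.T @ delta_out) * (1 - z ** 2)   # tanh' = 1 - tanh^2
    dW1 = np.outer(delta_hid, x)
    return loss, dW1, dW2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
x, t = np.array([0.5, -0.3]), np.array([1.0])
loss, dW1, dW2 = forward_backward(W1, W2, x, t)

# Check one backprop derivative against a finite-difference estimate.
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
num_grad = (forward_backward(W1p, W2, x, t)[0] - loss) / eps
```

The finite-difference check is a useful debugging habit: backprop implementations that pass it on random weights are almost always correct.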
Deep Learning structures and concepts: the hidden layer as a kernel function. The XOR example: a linear model does not suffice.
11 Mar (Fri) {lecture#27}
SDR
MS-Teams folder: video_nn7_11mar22.mp4, lecture_notes_nn7_11mar22.pdf
The XOR example (contd). Other hand-crafted examples, with another activation function. An introduction to the ReLU.
15 Mar (Tue) {lecture#28}
SDR
MS-Teams folder: video_nn8_15mar22.mp4, lecture_notes_nn8_15mar22.pdf
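The XOR discussion above can be made concrete with the classic hand-crafted two-unit ReLU network: f(x) = w^T max(0, W x + c) + b computes XOR exactly, something no linear model can do.

```python
import numpy as np

# A hand-crafted two-unit ReLU network that computes XOR exactly.
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([0.0, -1.0])
w = np.array([1.0, -2.0])
b = 0.0

def xor_net(x):
    h = np.maximum(0.0, W @ x + c)   # ReLU hidden layer
    return w @ h + b                 # linear output layer

outputs = [xor_net(np.array(x)) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The first hidden unit counts the active inputs; the second fires only when both are active and is subtracted with weight 2, cancelling the (1, 1) case.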
Vector-Matrix representations. Activation Functions: some further discussion. The sigmoid and tanh revisited. ReLU, Leaky ReLU, ELU.
16 Mar (Wed) {lecture#29}
SDR
MS-Teams folder: video_nn9_16mar22.mp4, lecture_notes_nn9_16mar22.pdf
Some small hand-crafted neural network examples.
Some insight into the expressive power of feed-forward neural networks. Some insight into 2-D inputs, interpreting first-layer weights as images, and its importance for CNNs.
19 Mar (Sat) {lecture#30}
SDR
(Make-up lecture for the missed 22 Mar (Tue) lecture)
MS-Teams folder: video_nn10_19mar22.mp4, lecture_notes_nn10_19mar22.pdf
---
22 Mar (Tue) {no lecture}
SDR
(no class)
Some insight into 2-D inputs, interpreting first-layer weights as images, and its importance for CNNs (contd): an empirical observation, and evidence for the idea of local receptive fields and low-level differentiation/edge features, with some biological motivation as well, from the visual cortex. Some insight into the expressive power of feed-forward neural networks: the insight from shallow networks with a large width. An example with asymmetric values for the inputs and outputs of a digital circuit, and estimating the neural network parameters for D input neurons, 2^D neurons in one hidden layer, and one output neuron.
23 March (Wed) {lecture#31}
SDR
MS-Teams folder: video_nn11_23mar22.mp4, lecture_notes_nn11_23mar22.pdf
Some insight into the expressive power of feed-forward neural networks (contd).
A domain-independent introduction to Convolution
25 Mar (Fri) {lecture#32}
SDR
MS-Teams folder: video_nn12_25mar22.mp4, lecture_notes_nn12_25mar22.pdf
A domain-independent introduction to Convolution (contd).
26 Mar (Sat) {lecture#33}
SDR
MS-Teams folder: video_nn13_26mar22.mp4, lecture_notes_nn13_26mar22.pdf
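The domain-independent definition of discrete convolution, y[n] = sum_k x[k] h[n - k], can be coded directly; a minimal sketch on toy sequences of our own choosing, checked against NumPy's built-in:

```python
import numpy as np

def conv1d(x, h):
    """Discrete 1-D convolution from the definition: y[n] = sum_k x[k] h[n-k]."""
    N = len(x) + len(h) - 1       # length of the full convolution
    y = np.zeros(N)
    for n in range(N):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y

x = np.array([1.0, 2.0, 3.0])
h = np.array([0.0, 1.0, 0.5])
y = conv1d(x, h)
```

CNN layers apply the same sliding-window idea in 2-D, with the kernel h learnt rather than fixed (and, strictly, most frameworks implement cross-correlation, i.e. convolution without flipping the kernel).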
Introducing CNNs: some basic features.
29 Mar (Tue) {lecture#34}
SDR
MS-Teams folder: video_nn14_29mar22.mp4, lecture_notes_nn14_29mar22.pdf
CNN basics. The basic LeNet architecture.
30 Mar (Wed) {lecture#35}
SDR
MS-Teams folder: video_nn15_30mar22.mp4, lecture_notes_nn15_30mar22.pdf
---
Major
05 Apr (Tue) - 12 Apr (Tue)
---
---
xx
Mathematical Basics for Machine Learning
xx-xx
xx
[Burges: Math for ML], [Do, Kolter: Linear Algebra Notes],

The above list is (obviously!) not exhaustive. Other reference material will be announced in the class. The Web has a vast storehouse of tutorial material on AI, Machine Learning, and other related areas.



Assignments

... A combination of theoretical work as well as programming work.
Both will be scrutinized in detail for original work and thoroughness.
For programming assignments, there will be credit for good coding.
Spaghetti coding will be penalized.
Program correctness or good programming alone will not fetch you full credit ... also required are the results of extensive experimentation with varying program parameters, and explanations of the results thus obtained.
Assignments will have to be submitted on or before the due date and time.
Late submissions will not be considered at all.
The use of unfair means will result in assigning as marks, the number said to have been discovered by the ancient Indians, to both parties (un)concerned.
Assignment 1
Assignment 2
Assignment 3

Examinations and Grading Information

The marks distribution is as follows (out of a total of 100):
Minor I
37
Assignments
25
Major
38
Grand Total
100

ELL784 Marks and Grades (Anonymised)

Some points about examinations, including the honour code:

Instructions for online examinations
The use of unfair means will result in assigning as marks, the number said to have been discovered by the ancient Indians, to both parties (un)concerned.

Attendance Requirements:

Attendance requirements for Online Semesters: in accordance with the IIT Delhi rules for an online semester.
Illness policy: illness is to be certified by a registered medical practitioner.
Attendance in Examinations is Compulsory.


Course Feedback

Link to Course Feedback Form

Sumantra Dutta Roy, Department of Electrical Engineering, IIT Delhi, Hauz Khas,
New Delhi - 110 016, INDIA. sumantra@ee.iitd.ac.in