COL 341: Fundamentals of Machine Learning


Course Objectives


After successfully completing the course, the students are expected to develop:

o   Understanding of the basic concepts in machine learning

o   Deeper mathematical understanding of the foundations of machine learning methods

o   Skills in using popular tools, libraries and software for problem solving

o   Hands-on experience in solving problems that may be encountered in industry

o   Introductory understanding and exposure to selected research topics in Machine Learning


Course Outline


Week-wise topics and suggested readings



Week 1: Introduction to the course and basic concepts in machine learning

o   Pattern Recognition and Machine Learning, by Christopher M. Bishop (Chapter 1.1)

o   Python Numpy Tutorial by Justin Johnson


Week 2: Linear regression, feature creation, ridge regression, cross validation

o   Andrew Ng course (CS229) notes on linear and logistic regression.

o   Convex Optimization, Stephen Boyd and Lieven Vandenberghe

o   Linear Algebra Review and Reference by Zico Kolter
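The closed-form ridge solution covered this week can be sketched in a few lines of NumPy. The synthetic data, the regularization value `lam`, and all variable names below are illustrative, not part of the course material:

```python
import numpy as np

# Ridge regression: w = (X^T X + lam * I)^{-1} X^T y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # synthetic design matrix
true_w = np.array([1.0, -2.0, 0.5])    # ground-truth weights (illustrative)
y = X @ true_w + 0.01 * rng.normal(size=100)

lam = 0.1                              # regularization strength (illustrative)
d = X.shape[1]
# Solve the regularized normal equations instead of inverting explicitly.
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```

With small `lam` and low noise, `w` recovers `true_w` closely; increasing `lam` shrinks the coefficients toward zero.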


Week 3: Lasso regression, logistic regression, optimization basics, gradient descent and its variants, step size selection

o   Regression Shrinkage and Selection Via the Lasso

o   Least Angle Regression

o   Compressive Sensing Resources
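A minimal sketch of batch gradient descent with a fixed step size, applied to the logistic-regression loss on synthetic separable data (all names and constants below are illustrative assumptions, not course material):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
w_true = np.array([2.0, -1.0])
y = (X @ w_true > 0).astype(float)     # noiseless, linearly separable labels

w = np.zeros(2)
lr = 0.5                               # fixed step size (illustrative)
for _ in range(500):
    # Gradient of the mean log-loss: X^T (sigmoid(Xw) - y) / n
    grad = X.T @ (sigmoid(X @ w) - y) / len(y)
    w -= lr * grad

accuracy = np.mean((sigmoid(X @ w) > 0.5) == y)
```

Variants covered in lecture (stochastic or mini-batch updates, adaptive step sizes) change only how `grad` is estimated and how `lr` evolves.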


Week 4: Multi-class classification, evaluation metrics, introduction to neural networks

o   Chapter 4.1, PRML, Bishop

o   Classification model metrics: precision and recall, sensitivity and specificity, ROC curves
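Precision and recall follow directly from the confusion-matrix counts; here is a small self-contained sketch (the toy labels are illustrative):

```python
import numpy as np

def precision_recall(y_true, y_pred):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

y_true = np.array([1, 1, 1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0])  # 2 TP, 1 FP, 1 FN
p, r = precision_recall(y_true, y_pred)
```

Sweeping the classifier's decision threshold and recording the resulting (TPR, FPR) pairs traces out the ROC curve discussed in the readings.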


Week 5: Neural networks, Hessian, overfitting in neural networks, convolutional neural networks (CNNs) and deep learning

o   Chapters 5.1, 5.2, 5.3, 5.4 of PRML, Bishop

o   Neural networks and back-propagation.

o   Roi Livni (Princeton) lecture notes on back-propagation.

o   CS231n (Stanford) slides on back-propagation.

o   Matt Gormley (CMU) slides on back-propagation

o   Convolutional Neural Networks

o   CNN notes from CS231n (Stanford University)

o   Understanding and visualizing CNNs

o   Transfer learning

o   CS231n slides on CNNs

o   Barnabas Poczos’s lecture notes on CNN
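A standard sanity check from the back-propagation readings is to compare the analytic gradient with a finite-difference estimate. The tiny one-hidden-layer network below (tanh hidden units, squared loss, random weights) is an illustrative sketch, not any particular lecture's notation:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=3)          # one input example
t = 1.0                         # scalar target
W1 = rng.normal(size=(4, 3))    # input-to-hidden weights
w2 = rng.normal(size=4)         # hidden-to-output weights

def loss(W1):
    h = np.tanh(W1 @ x)
    y = w2 @ h
    return 0.5 * (y - t) ** 2

# Back-propagation: dL/dW1_{ij} = (y - t) * w2_i * (1 - h_i^2) * x_j
h = np.tanh(W1 @ x)
y = w2 @ h
delta = (y - t) * w2 * (1 - h ** 2)   # error signal at the hidden layer
grad_W1 = np.outer(delta, x)

# Finite-difference check on a single weight
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
fd = (loss(W1p) - loss(W1)) / eps
```

If back-propagation is implemented correctly, `fd` and `grad_W1[0, 0]` agree to several decimal places; this check scales to deep networks and CNNs unchanged.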


Week 6: Generative models, Gaussian Discriminant Analysis (GDA), Naïve Bayes classification

o   Andrew Ng (CS229) lecture notes on generative models
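As an illustration of the generative approach, here is a sketch of Gaussian naive Bayes: fit per-class, per-feature means and variances, then classify by the largest log posterior. The two-class synthetic data is an assumption for the demo:

```python
import numpy as np

rng = np.random.default_rng(3)
X0 = rng.normal(loc=-2.0, size=(100, 2))   # class 0
X1 = rng.normal(loc=2.0, size=(100, 2))    # class 1
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

# Generative "training" = estimating class-conditional parameters.
means = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
vars_ = np.array([X[y == c].var(axis=0) for c in (0, 1)])
priors = np.array([np.mean(y == c) for c in (0, 1)])

def predict(x):
    # log p(c) + sum_j log N(x_j | mu_cj, var_cj), independence assumed
    log_post = np.log(priors) - 0.5 * np.sum(
        np.log(2 * np.pi * vars_) + (x - means) ** 2 / vars_, axis=1)
    return np.argmax(log_post)

preds = np.array([predict(x) for x in X])
```

GDA differs only in replacing the per-feature variances with a full shared covariance matrix.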


Week 7: Understanding multivariate Gaussian distributions, linear regression revisited

o   Multivariate Gaussian Distribution

o   A short reference on Multivariate Gaussian Distributions, by Leon Gu, CMU

o   Multivariate Gaussian Distributions, by Chuong B. Do, Stanford University

o   Probabilistic modelling – Linear regression & Gaussian processes by Fredrik Lindsten, Thomas B. Schön, Andreas Svensson and Niklas Wahlström
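The multivariate Gaussian log density can be evaluated numerically without forming an explicit inverse, using a solve and a log determinant. This sketch (example `mu` and `Sigma` are illustrative) mirrors the formula in the readings:

```python
import numpy as np

def mvn_logpdf(x, mu, Sigma):
    """log N(x | mu, Sigma) = -0.5 * (d log 2pi + log|Sigma| + (x-mu)^T Sigma^{-1} (x-mu))"""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = diff @ np.linalg.solve(Sigma, diff)   # avoids computing Sigma^{-1}
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

mu = np.zeros(2)
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
lp = mvn_logpdf(np.array([0.5, -0.5]), mu, Sigma)
```

Using `solve` and `slogdet` rather than `inv` and `det` is the numerically preferred route for the quadratic form and the normalizer.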


Week 8: Estimation theory

o   Estimation theory notes and slides from Cristiano Porciani

o   Point estimation notes by Herman Bennett
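A classic point-estimation fact from these readings is that the MLE of a Gaussian's variance (dividing by n) is biased by a factor of (n-1)/n, while dividing by n-1 removes the bias. A quick Monte Carlo sketch (sample sizes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10                      # small sample, so the bias is visible
reps = 20000                # many repetitions to estimate the expectation
samples = rng.normal(loc=0.0, scale=1.0, size=(reps, n))  # true variance = 1

mle_var = samples.var(axis=1)              # divides by n (the MLE)
unbiased_var = samples.var(axis=1, ddof=1) # divides by n-1

bias_mle = mle_var.mean() - 1.0            # expected to be about -1/n
bias_unbiased = unbiased_var.mean() - 1.0  # expected to be about 0
```

The empirical `bias_mle` lands near -1/n = -0.1, matching E[MLE] = (n-1)/n * sigma^2.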


Week 9: Density estimation, Parzen window, KNN classifier

o   PRML, Bishop, Chapter 2.5

o   Lecture Notes on Nonparametrics by Bruce E. Hansen
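A one-dimensional Parzen-window estimator with a Gaussian kernel is only a few lines; this sketch checks it against the known standard-normal density at zero (bandwidth and sample size are illustrative choices):

```python
import numpy as np

def parzen_kde(x_query, data, h):
    """Parzen-window density estimate at x_query with Gaussian kernel, bandwidth h."""
    z = (x_query - data) / h
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean() / h

rng = np.random.default_rng(5)
data = rng.normal(size=5000)          # samples from a standard normal
est = parzen_kde(0.0, data, h=0.2)
true_val = 1 / np.sqrt(2 * np.pi)     # N(0,1) density at 0, about 0.399
```

Shrinking `h` reduces bias but increases variance; the KNN density estimator in Bishop 2.5 trades these off by fixing the neighbor count instead of the window width.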


Week 10: Decision trees, Random Forests

o   Chapters 8.1-8.4 from Pattern Classification, 2nd Edition, Duda, Hart and Stork

o   Chapter on Decision Trees by Lior Rokach and Oded Maimon

o   Classical paper on Random Forest by Leo Breiman
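The core computation in tree growing is choosing the split that most reduces impurity. A minimal sketch of a Gini-based split search on a single feature (the toy data is illustrative):

```python
import numpy as np

def gini(y):
    """Gini impurity: 1 - sum of squared class proportions."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Best threshold on one feature by weighted child Gini impurity."""
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0, 0, 0, 1, 1, 1])
t = best_split(x, y)   # a threshold of 3.0 yields two pure children
```

A Random Forest, per Breiman's paper, repeats this on bootstrap samples with a random feature subset at each split, then averages the trees' votes.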


Week 11: Support Vector Machines (SVMs)

o   Andrew Ng course (CS229) lecture notes on Support Vector Machines (SVMs)

o   Chapters 5.1, 5.2, 5.3, 5.4, 5.5 from Convex Optimization by Stephen Boyd and Lieven Vandenberghe
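The primal soft-margin SVM objective can be minimized directly with subgradient descent on the regularized hinge loss; this is a sketch on synthetic separable data, not the dual/kernel formulation from the notes (all constants are illustrative):

```python
import numpy as np

# L(w) = (lam/2)||w||^2 + (1/n) * sum_i max(0, 1 - y_i * w^T x_i)
rng = np.random.default_rng(6)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1.0] * 50 + [1.0] * 50)

w = np.zeros(2)
lam, lr = 0.01, 0.1
for _ in range(300):
    active = y * (X @ w) < 1                 # margin violators contribute
    hinge_grad = -(y[active, None] * X[active]).sum(axis=0) / len(y)
    w -= lr * (lam * w + hinge_grad)         # subgradient step

train_acc = np.mean(np.sign(X @ w) == y)
```

The dual view in the readings is what makes kernels possible; the primal subgradient view above is the one that scales to large n.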


Week 12: Unsupervised learning, Principal Component Analysis (PCA)

o   A Tutorial on Principal Component Analysis (PCA) by Jonathon Shlens

o   Slides on PCA by Barnabás Póczos
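PCA reduces to an SVD of the centered data matrix: the right singular vectors are the principal directions, and squared singular values give the variance explained. A sketch on data deliberately stretched along one axis (the scaling is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(7)
# Data with 5x the spread along the first axis, so PC1 should be ~[+-1, 0].
X = rng.normal(size=(500, 2)) * np.array([5.0, 1.0])

Xc = X - X.mean(axis=0)                 # center before decomposing
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                             # first principal direction
explained = S ** 2 / np.sum(S ** 2)     # fraction of variance per component
```

Projecting with `Xc @ Vt[:k].T` gives the k-dimensional representation; note the sign of each principal direction is arbitrary.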


Week 13: Non-Negative Matrix Factorization (NMF), K-means, non-linear dimensionality reduction

o   Lee, Daniel D., and H. Sebastian Seung. Algorithms for non-negative matrix factorization. In Advances in neural information processing systems, pp. 556-562. 2001.

o   Lee, Daniel D., and H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature 401, no. 6755 (1999): 788-791.

o   Y. Koren, R. M. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.

o   PRML, Bishop, Chapter 9.1

o   A Global Geometric Framework for Nonlinear Dimensionality Reduction, Joshua B. Tenenbaum, Vin de Silva, John C. Langford

o   Nonlinear Dimensionality Reduction by Locally Linear Embedding, Sam T. Roweis and Lawrence K. Saul
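Of this week's topics, K-means is the simplest to sketch: Lloyd's algorithm alternates an assignment step and a mean-update step. The data and the initialization (one point from each half, chosen only to keep this demo reproducible) are illustrative assumptions:

```python
import numpy as np

def kmeans(X, init_idx, iters=20):
    """Plain Lloyd's algorithm; init_idx picks the initial centers from X."""
    centers = X[init_idx].astype(float).copy()
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center becomes the mean of its cluster.
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

rng = np.random.default_rng(8)
X = np.vstack([rng.normal(-5, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
centers, labels = kmeans(X, init_idx=[0, 100])
```

In practice the result depends on initialization, so multiple random restarts (or k-means++ seeding) are standard; NMF's multiplicative updates from Lee and Seung play an analogous alternating role.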


Week 14: Boosting, generative adversarial networks

o   Yoav Freund and Robert E. Schapire.  A short introduction to boosting.  Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.

o   VC Dimension & The Fundamental Theorem, Lecture notes, Class of Roi Livni

o   Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial nets." In Advances in neural information processing systems, pp. 2672-2680. 2014.
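The heart of AdaBoost (Freund and Schapire) is the reweighting step: after a weak learner with weighted error err, misclassified points are up-weighted so that they carry exactly half the total weight. One round on a six-point toy problem (the stump and labels are illustrative):

```python
import numpy as np

# A stump that predicts +1 below x = 2.5 and -1 above errs on the last two points.
y    = np.array([1, 1, -1, -1,  1,  1])
pred = np.array([1, 1, -1, -1, -1, -1])
w = np.full(6, 1 / 6)                    # initial uniform weights

err = np.sum(w[pred != y])               # weighted error = 1/3
alpha = 0.5 * np.log((1 - err) / err)    # learner weight = 0.5 * ln 2
w = w * np.exp(-alpha * y * pred)        # up-weight mistakes, down-weight hits
w = w / w.sum()                          # renormalize
```

After the update, the two misclassified points each have weight 0.25 (together 0.5), so the next weak learner is forced to focus on them.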


Grading structure (tentative)

·         Exams — 45% (Minors — 25%, Major — 20%)

·         Assignments/project — 45% (Assignments — 30%, Project — 15%)

·         Quizzes/homework/class participation — 10%


Homework assignments (ungraded)

o   Week 1

o   Learn Python, NumPy and SciPy

o   Write a matrix multiplication program in Python using (a) for loops and (b) by using NumPy/SciPy. Plot the Gflops of your programs as a function of matrix sizes
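A starting-point sketch for this exercise at a single matrix size (the size, seed, and timing style are illustrative; the assignment asks for a plot over many sizes):

```python
import time
import numpy as np

def matmul_loops(A, B):
    """Naive triple-loop matrix multiply: 2*n^3 floating-point operations."""
    n, k = A.shape
    m = B.shape[1]
    C = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C

n = 60
rng = np.random.default_rng(9)
A, B = rng.normal(size=(n, n)), rng.normal(size=(n, n))

t0 = time.perf_counter(); C_loops = matmul_loops(A, B); t_loops = time.perf_counter() - t0
t0 = time.perf_counter(); C_np = A @ B; t_np = time.perf_counter() - t0

gflops_loops = 2 * n ** 3 / t_loops / 1e9
gflops_np = 2 * n ** 3 / max(t_np, 1e-9) / 1e9
```

Repeating this over a range of `n` and plotting the two Gflops curves makes the gap between interpreted loops and BLAS-backed NumPy vivid.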

o   Week 2

o   Prove that J(\theta) is convex in \theta (refer to the Andrew Ng lecture notes on linear and logistic regression for the definition of J(\theta))

o   Read Linear Algebra Review and Reference by Zico Kolter

o   Week 3

o   Read chapters 2.1, 2.2, 2.3, 2.5, 9.2, 9.3 and, optionally, 9.4 of the Convex Optimization textbook

o   Week 4

o   Review probability and statistics

o   Read 1D, 2D and 3D convolutions

o   Learn Keras/Tensorflow