COL 341: Fundamentals of Machine Learning

After successfully completing the course, the students are expected to develop:
o Understanding of the basic concepts in machine learning
o Deeper mathematical understanding of the foundations of machine learning methods
o Skills in using popular tools, libraries and software for problem solving
o Hands-on experience in solving problems that may be encountered in the industry
o Introductory understanding of, and exposure to, selected research topics in Machine Learning
Week-by-week topics and references:

Week 1: Introduction to the course and basic concepts in machine learning
References:
o Pattern Recognition and Machine Learning, by Christopher M. Bishop (Chapter 1.1)
Week 2: Linear regression, feature creation, ridge regression, cross-validation
References:
o Andrew Ng course (CS229) notes on linear and logistic regression
o Convex Optimization, by Stephen Boyd and Lieven Vandenberghe
o Linear Algebra Review and Reference, by Zico Kolter
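As a quick illustration of the Week 2 topics, here is a minimal NumPy sketch of closed-form ridge regression with a simple k-fold cross-validation loop. The toy data, the `lam` value and the fold count are invented for the demo, not part of the course material.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: theta = (X^T X + lam * I)^(-1) X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def kfold_mse(X, y, lam, k=5):
    """Average held-out mean squared error over k contiguous folds."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        theta = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[fold] @ theta - y[fold]) ** 2))
    return float(np.mean(errs))

# Toy data: y = 2*x0 - x1 + small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=100)

theta = ridge_fit(X, y, lam=0.1)
print(theta)                    # close to [2, -1], shrunk slightly toward zero
print(kfold_mse(X, y, lam=0.1))
```

In practice the cross-validated error would be computed over a grid of `lam` values to pick the regularization strength.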
Week 3: Lasso regression, logistic regression, optimization basics, gradient descent and its variants, step size selection
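The gradient-descent iteration from Week 3 can be sketched in a few lines; the quadratic objective and the step size below are illustrative choices only.

```python
import numpy as np

def gradient_descent(grad, x0, lr, n_steps):
    """Fixed-step gradient descent: x <- x - lr * grad(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# For this quadratic, the iteration converges for any step size in (0, 1);
# too large a step size makes the iterates oscillate or diverge.
x_min = gradient_descent(lambda x: 2 * (x - 3.0), x0=[0.0], lr=0.1, n_steps=200)
print(x_min)  # approximately [3.]
```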

Week 4: Multiclass classification, evaluation metrics, introduction to neural networks
References:
o Chapter 4.1, PRML, Bishop
o Classification model metrics: precision and recall, sensitivity and specificity, ROC curves
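The evaluation metrics listed for Week 4 reduce to counts from the confusion matrix; a small sketch (with made-up labels) is:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Precision, recall (= sensitivity) and specificity from 0/1 labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
    return {"precision": tp / (tp + fp),
            "recall": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

m = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(m)  # precision 2/3, recall 2/3, specificity 2/3
```

Sweeping a decision threshold and plotting recall (true-positive rate) against 1 - specificity (false-positive rate) yields the ROC curve.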
Week 5: Neural networks, Hessian, overfitting in neural networks, convolutional neural networks (CNNs) and deep learning
References:
o Chapters 5.1, 5.2, 5.3, 5.4 of PRML, Bishop
o Neural networks and backpropagation
o Roi Livni (Princeton) lecture notes on backpropagation
o CS231n (Stanford) slides on backpropagation
o Matt Gormley (CMU) slides on backpropagation
o Convolutional Neural Networks
o CNN notes from CS231n (Stanford University); Understanding and visualizing CNNs; Transfer learning; CS231n slides on CNNs
o Barnabás Póczos's lecture notes on CNNs
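To make the backpropagation references concrete, here is a minimal one-hidden-layer network trained by hand-derived gradients. The architecture, data and learning rate are toy choices for the sketch, not from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))   # 50 samples, 3 features
y = rng.normal(size=(50, 1))   # regression targets

W1 = 0.5 * rng.normal(size=(3, 4))   # input -> hidden weights
W2 = 0.5 * rng.normal(size=(4, 1))   # hidden -> output weights

def forward(X, W1, W2):
    h = np.tanh(X @ W1)        # hidden activations
    return h, h @ W2           # prediction

def mse(pred, y):
    return 0.5 * np.mean((pred - y) ** 2)

initial_loss = mse(forward(X, W1, W2)[1], y)

lr = 0.05
for _ in range(300):
    h, pred = forward(X, W1, W2)
    # Backward pass: apply the chain rule layer by layer.
    d_pred = (pred - y) / len(y)           # dL/d(pred)
    gW2 = h.T @ d_pred                     # dL/dW2
    d_h = (d_pred @ W2.T) * (1 - h ** 2)   # through tanh (tanh' = 1 - tanh^2)
    gW1 = X.T @ d_h                        # dL/dW1
    W1 -= lr * gW1
    W2 -= lr * gW2

final_loss = mse(forward(X, W1, W2)[1], y)
print(initial_loss, final_loss)  # the loss decreases as training proceeds
```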
Week 6: Generative models, Gaussian Discriminant Analysis (GDA), Naïve Bayes classification
References:
o Andrew Ng (CS229) lecture notes on generative models
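A Gaussian Naïve Bayes classifier of the kind covered in Week 6 can be sketched as follows; the two well-separated toy clusters are invented for the demo.

```python
import numpy as np

class GaussianNB:
    """Naive Bayes: one Gaussian per (class, feature), features assumed independent."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.logprior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # Per-class log-likelihood log p(x|c), summed over features.
        ll = -0.5 * (((X[:, None, :] - self.mu) ** 2) / self.var
                     + np.log(2 * np.pi * self.var)).sum(axis=2)
        return self.classes[np.argmax(ll + self.logprior, axis=1)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
acc = np.mean(GaussianNB().fit(X, y).predict(X) == y)
print(acc)  # well-separated classes give high training accuracy
```

GDA differs only in fitting a full (shared) covariance matrix instead of per-feature variances.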
Week 7: Understanding multivariate Gaussian distributions, linear regression revisited
References:
o Multivariate Gaussian Distribution
o A short reference on Multivariate Gaussian Distributions, by Leon Gu, CMU
o Multivariate Gaussian Distributions, by Chuong B. Do, Stanford University
o Probabilistic modelling – Linear regression & Gaussian processes, by Fredrik Lindsten, Thomas B. Schön, Andreas Svensson and Niklas Wahlström
Week 8: Estimation theory
References:
o Estimation Theory notes and slides from Cristiano Porciani
o Point estimation notes by Herman Bennett
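A small numerical illustration of point estimation for Week 8: for an i.i.d. Gaussian sample, the maximum-likelihood estimate of the mean is the sample mean, and the MLE of the variance divides by N (and is therefore biased, unlike the N-1 sample variance). The sample size and true parameters below are arbitrary demo values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10_000)  # true mu = 5, sigma^2 = 4

mu_hat = x.mean()                     # MLE of mu: the sample mean
var_hat = np.mean((x - mu_hat) ** 2)  # MLE of sigma^2: divides by N, not N-1

print(mu_hat, var_hat)  # close to 5 and 4 for a large sample
```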
Week 9: Density estimation, Parzen window, KNN classifier
References:
o PRML, Bishop, Chapter 2.5
o Lecture Notes on Nonparametrics, by Bruce E. Hansen
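The Parzen-window estimator from Week 9 averages a kernel placed at each data point; a minimal 1-D sketch with a Gaussian kernel (the bandwidth `h` and sample are demo choices) is:

```python
import numpy as np

def parzen_density(x_query, data, h):
    """Parzen-window density estimate in 1-D with a Gaussian kernel of width h."""
    u = (x_query[:, None] - data[None, :]) / h
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return kernel.mean(axis=1) / h   # average kernel mass, rescaled by h

rng = np.random.default_rng(0)
data = rng.normal(size=5000)         # samples from N(0, 1)
est = parzen_density(np.array([0.0, 1.0]), data, h=0.2)
print(est)  # roughly 0.40 and 0.24, the N(0, 1) density at 0 and 1
```

The k-NN density estimate is the dual idea: fix the number of neighbours k and let the window width adapt to the data.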
Week 10: Decision trees, random forests
References:
o Chapters 8.1–8.4 from Pattern Classification, 2nd Edition, Duda, Hart and Stork
o Chapter on Decision Trees, by Lior Rokach and Oded Maimon
o Classical paper on random forests by Leo Breiman
Week 11: Support Vector Machines (SVMs)
References:
o Andrew Ng course (CS229) lecture notes on Support Vector Machines (SVMs)
o Chapters 5.1, 5.2, 5.3, 5.4, 5.5 from Convex Optimization, by Stephen Boyd and Lieven Vandenberghe
Week 12: Unsupervised learning, Principal Component Analysis (PCA)
References:
o A Tutorial on Principal Component Analysis (PCA), by Jonathon Shlens
o Slides on PCA by Barnabás Póczos
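PCA as covered in Week 12 can be computed from the SVD of the centered data matrix; a short sketch (with synthetic data stretched along one axis) is:

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the centered data; rows of `components` are the PCs."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                    # principal directions
    explained_var = S[:n_components] ** 2 / (len(X) - 1)
    return Xc @ components.T, components, explained_var

rng = np.random.default_rng(0)
# Data stretched along the first axis, so the first PC dominates.
X = rng.normal(size=(200, 2)) * np.array([3.0, 0.3])
Z, comps, var = pca(X, n_components=2)
print(var)  # the first explained variance is far larger than the second
```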
Week 13: Non-negative Matrix Factorization (NMF), k-means, nonlinear dimensionality reduction
References:
o Lee, Daniel D., and H. Sebastian Seung. "Algorithms for non-negative matrix factorization." In Advances in Neural Information Processing Systems, pp. 556–562, 2001.
o Lee, Daniel D., and H. Sebastian Seung. "Learning the parts of objects by non-negative matrix factorization." Nature 401, no. 6755 (1999): 788.
o Y. Koren, R. M. Bell, and C. Volinsky. "Matrix factorization techniques for recommender systems." IEEE Computer, 42(8):30–37, 2009.
o PRML, Bishop, Chapter 9.1
o "A Global Geometric Framework for Nonlinear Dimensionality Reduction," by Joshua B. Tenenbaum, Vin de Silva and John C. Langford
o "Nonlinear Dimensionality Reduction by Locally Linear Embedding," by Sam T. Roweis and Lawrence K. Saul
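The k-means topic from Week 13 is Lloyd's two-step iteration: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below takes explicit initial centroids to keep the demo deterministic (k-means++ initialization is the usual choice in practice); the two toy clusters are invented.

```python
import numpy as np

def kmeans(X, centroids, n_iters=50):
    """Lloyd's algorithm from given initial centroids."""
    centroids = centroids.copy()
    for _ in range(n_iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)          # assignment step
        for j in range(len(centroids)):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)  # update step
    return centroids, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-5.0, 1.0, size=(100, 2)),
               rng.normal(+5.0, 1.0, size=(100, 2))])
centroids, labels = kmeans(X, centroids=X[[0, 100]])
print(np.sort(centroids[:, 0]).round(1))  # near [-5, 5], one centroid per cluster
```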
Week 14: Boosting, generative adversarial networks
References:
o Yoav Freund and Robert E. Schapire. "A short introduction to boosting." Journal of Japanese Society for Artificial Intelligence, 14(5):771–780, September 1999.
o VC Dimension & The Fundamental Theorem, lecture notes, class of Roi Livni
o Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial nets." In Advances in Neural Information Processing Systems, pp. 2672–2680, 2014.
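AdaBoost, the boosting algorithm in the Freund and Schapire reference, can be sketched with decision stumps as weak learners; the brute-force stump search and the toy dataset below are illustrative simplifications.

```python
import numpy as np

def fit_adaboost(X, y, n_rounds=50):
    """AdaBoost with threshold stumps; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                # distribution over examples
    stumps = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively search (feature, threshold, sign) for lowest weighted error.
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] <= t, 1, -1)
                    err = np.sum(w[pred != y])
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        err, j, t, s = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weak learner's vote weight
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w = w * np.exp(-alpha * y * pred)       # up-weight misclassified points
        w /= w.sum()
        stumps.append((alpha, j, t, s))
    return stumps

def predict_adaboost(stumps, X):
    agg = sum(a * s * np.where(X[:, j] <= t, 1, -1) for a, j, t, s in stumps)
    return np.sign(agg)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # target no single stump can fit
stumps = fit_adaboost(X, y)
acc = np.mean(predict_adaboost(stumps, X) == y)
print(acc)  # many weak stumps combine into a strong classifier
```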
· Exams: 45 (Minors: 25, Major: 20)
· Assignments/project: 45 (Assignments: 30, Project: 15)
· Quizzes/homework/class participation: 10
o Week 1
  o Learn Python, NumPy and SciPy
  o Write a matrix multiplication program in Python using (a) for loops and (b) NumPy/SciPy. Plot the Gflop/s of your programs as a function of matrix size.
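A starting point for the Week 1 exercise might look like the following (the matrix size, the timing of a single size, and the Gflop/s formula are illustrative; the plot over a range of sizes is left as in the exercise):

```python
import time
import numpy as np

def matmul_loops(A, B):
    """Triple-loop matrix multiplication: C[i, j] = sum_k A[i, k] * B[k, j]."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += A[i, k] * B[k, j]
            C[i, j] = s
    return C

n = 60
rng = np.random.default_rng(0)
A, B = rng.normal(size=(n, n)), rng.normal(size=(n, n))

t0 = time.perf_counter()
C_loops = matmul_loops(A, B)
t1 = time.perf_counter()
C_np = A @ B                  # NumPy dispatches to an optimized BLAS routine

flops = 2 * n ** 3            # one multiply and one add per inner-loop step
print("loop version Gflop/s:", flops / (t1 - t0) / 1e9)
print("max abs difference:", np.abs(C_loops - C_np).max())
```

Timing both versions over increasing n and plotting the two Gflop/s curves (e.g. with matplotlib) makes the gap between interpreted loops and BLAS very visible.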
o Week 2
  o Prove that J(\theta) is convex in \theta (refer to the Andrew Ng lecture notes on linear and logistic regression for the definition of J(\theta)).
  o Read Linear Algebra Review and Reference by Zico Kolter
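For the convexity exercise, one standard route (assuming J(\theta) is the least-squares cost from the CS229 notes) is to show that the Hessian is positive semidefinite:

```latex
J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\bigl(\theta^{T}x^{(i)} - y^{(i)}\bigr)^{2}
          = \frac{1}{2}\,\lVert X\theta - y\rVert_{2}^{2},
\qquad
\nabla^{2} J(\theta) = X^{T}X .
```

For any vector z, z^T X^T X z = \lVert Xz \rVert^2 \ge 0, so the Hessian is positive semidefinite everywhere, and a twice-differentiable function with positive semidefinite Hessian is convex.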
o Week 3
  o Read Chapters 2.1, 2.2, 2.3, 2.5, 9.2, 9.3 and optionally 9.4 of the Convex Optimization textbook
o Week 4
  o Review probability and statistics
  o Read about 1D, 2D and 3D convolutions
  o Learn Keras/TensorFlow
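As a companion to the Week 4 reading on convolutions, a 1-D convolution written out from its definition can be checked against `np.convolve`; the signal and kernel below are arbitrary demo values.

```python
import numpy as np

def conv1d(x, w):
    """Full 1-D convolution from the definition: out[n] = sum_k x[k] * w[n - k]."""
    out = np.zeros(len(x) + len(w) - 1)
    for n in range(len(out)):
        for k in range(len(x)):
            if 0 <= n - k < len(w):
                out[n] += x[k] * w[n - k]
    return out

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.0, 1.0, 0.5])
print(conv1d(x, w))          # matches np.convolve(x, w)
```

2-D and 3-D convolutions generalize the same sliding sum to more axes; CNN layers additionally learn the kernel `w`.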