Assignment 3
Topic: Neural Networks
Due on or before:
20 April (Saturday) 2024, 11:59 pm
Maximum Marks: 8
(This assignment owes its genesis and existence to Aman Verma's
creative and ever-inquisitive mind)
This assignment has 2 parts; the second part is open-ended
(and carries significant weightage). You will be working with the
well-known MNIST digit classification dataset (Link:
https://git-disl.github.io/GTDLBench/datasets/mnist_datasets/).
This dataset has images of the digits 0-9, and your task is to design
neural network models for multi-class digit image
classification. The assignment is divided into 2
parts; you are encouraged to attempt both.
-
For coding: You are free to use any open-source
deep-learning framework: PyTorch, TensorFlow (recommended),
JAX, etc.
-
First, familiarize yourself with any one of these frameworks.
They let you avoid implementing backpropagation by hand: you
define only the forward pass, and the framework takes care of the
gradient calculation and weight updates during training.
-
You will have to report results in terms of
common classification metrics (for this, you are free
to use any library of your choice; Scikit-learn is
perhaps the most useful).
-
The MNIST dataset comes with separate
Train/Test sets; you have to use those sets only.
Any alteration to this split will attract strict
penalties. The purpose of adhering to a stringent
dataset protocol is to introduce you to the
benchmarking strategies used in machine learning
research.
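To make the remark about frameworks concrete, here is a minimal NumPy sketch (not a reference solution) of a one-hidden-layer network in which the gradients are derived by hand. The `backward` and `sgd_step` functions stand in for what a framework's automatic differentiation and optimizer would do for you; with PyTorch, TensorFlow or JAX you would write only something like `forward` and the loss. The function names, layer sizes, learning rate and the synthetic random batch are all assumptions made for illustration; the batch is not MNIST data.

```python
import numpy as np

def init_params(rng, n_in=784, n_hidden=32, n_out=10):
    """Small random weights; the layer sizes are illustrative, not prescribed."""
    return {
        "W1": rng.normal(0.0, 0.05, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.05, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Forward pass: the only part you write yourself when using a framework."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU hidden layer
    logits = h @ params["W2"] + params["b2"]
    return h, logits

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, y):
    """Mean multi-class cross-entropy for integer labels y."""
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def backward(params, x, y):
    """Hand-derived gradients: the step that automatic differentiation replaces."""
    h, logits = forward(params, x)
    n = len(y)
    d_logits = softmax(logits)
    d_logits[np.arange(n), y] -= 1.0      # d(loss)/d(logits) = softmax - one_hot
    d_logits /= n
    d_h = d_logits @ params["W2"].T
    d_h[h <= 0.0] = 0.0                   # gradient of ReLU
    return {
        "W1": x.T @ d_h, "b1": d_h.sum(axis=0),
        "W2": h.T @ d_logits, "b2": d_logits.sum(axis=0),
    }

def sgd_step(params, grads, lr=0.05):
    """Plain gradient-descent weight update."""
    return {k: params[k] - lr * grads[k] for k in params}

# Synthetic stand-in for a mini-batch of flattened 28x28 images (NOT real MNIST).
rng = np.random.default_rng(0)
x = rng.random((64, 784))
y = rng.integers(0, 10, size=64)

params = init_params(rng)
loss_before = cross_entropy(forward(params, x)[1], y)
for _ in range(50):
    params = sgd_step(params, backward(params, x, y))
loss_after = cross_entropy(forward(params, x)[1], y)
```

Comparing `loss_before` with `loss_after` shows the effect of the updates on this one batch. In an actual submission the loop would run over the official training set, and the test set would be touched only for the final evaluation.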
Part 1 (4 Marks)
MLP-based Network:
Design a simple neural network for
10-class classification. Try to achieve the highest classification
performance you can.
-
Use a suitable loss function for multi-class classification.
-
Experiment with:
-
Different number of layers
-
Different number of neurons in each layer
-
Training parameters: learning rate, number of iterations, etc.
-
Interpretability: Once you have obtained your best
model, try to figure out what features it is learning, i.e.,
what features are being extracted. For this,
-
First, think of some method which will let you analyse the above.
-
Try to visualize intermediate representations of different layers.
-
Try to make interpretations from the representations.
-
Further, you can report these for both correctly classified and
misclassified examples. What do misclassified examples suggest?
-
Analyze the misclassified examples. What can you infer about the
model's learning, i.e., which features is it not able to capture?
(Perform this analysis for the best model only.)
-
Summarize your conclusions about the model's shortcomings.
Can you suggest some method to overcome these?
-
Should every training example be given equal weightage? Can you think
of something along these lines?
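As one possible starting point for the question of example weightage, the sketch below computes a cross-entropy loss in which each example carries its own weight. The weighting rule shown, `(1 - p_true) ** gamma`, is borrowed from the focal-loss idea and is only one option among many; it is not prescribed by this assignment. The function names, the value of `gamma` and the toy logits are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def per_example_ce(logits, y):
    """Cross-entropy of each example separately (no averaging)."""
    p = softmax(logits)
    return -np.log(p[np.arange(len(y)), y] + 1e-12)

def weighted_ce(logits, y, weights):
    """Weighted average of per-example losses; equal weights give the usual mean."""
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * per_example_ce(logits, y)) / np.sum(weights))

def focal_style_weights(logits, y, gamma=2.0):
    """Hypothetical rule: weight (1 - p_true)^gamma, so easy examples count less."""
    p_true = softmax(logits)[np.arange(len(y)), y]
    return (1.0 - p_true) ** gamma

# Toy logits for 3 examples and 3 classes (illustrative values only).
logits = np.array([[4.0, 0.0, 0.0],   # confidently correct
                   [0.5, 0.4, 0.0],   # uncertain
                   [0.0, 3.0, 0.0]])  # confidently wrong (true class is 0)
y = np.array([0, 0, 0])

uniform_loss = weighted_ce(logits, y, np.ones(3))
focal_weights = focal_style_weights(logits, y)
focal_loss = weighted_ce(logits, y, focal_weights)
```

With these toy values the weights increase from the confidently correct example to the confidently wrong one, so the weighted loss is dominated by the hard examples. Whether such a weighting actually helps on MNIST is something to verify experimentally.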
Part 2 (4 Marks)
An attempt towards improvement:
This part is open-ended. You have to design a custom model
(as well as its mathematics!) for MNIST digit image classification.
The target is to achieve better performance than the model of Part 1.
Here, our focus is on introducing you to machine learning research.
Report on the mathematical model of the designed architecture, as well
as the results (reported in the same way as in Part 1). Your design
should address the following points.
A word of caution:
We are not looking for Convolutional Neural Networks (CNNs)
in this part.
-
Any spatial image has both local and global details. By local details we
refer to information contained in any small neighbourhood of pixels.
For example, in the first image of the numeral `3', the first
(upper) patch and the second (lower) patch suggest the structure
of a `3' by virtue of the neighbourhood structure they capture.
This highlights the importance of local details.
-
Local Details: A simple network that takes the whole image as input
captures global details, in general. Your architecture should:
-
Incorporate some mechanism to encode local details.
-
Learn these local details.
-
Global Details: Global details are also important!
For example, the second image illustrates the fact that
if the model finds the two endings captured by the two patches,
it will be able to infer that the image is of the numeral `3'.
Thus,
-
Your architecture should try to learn correlations amongst the local
blocks in order to capture global features.
-
Unified Architecture:
Now, combine the mechanisms for capturing local and global details into
one unified architecture. Perform adequate normalization over the features
(within the architecture), wherever it is required.
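The design points above can be sketched, under several assumptions, as a forward pass in NumPy. This is one possible reading and not the expected answer: local details are encoded by splitting the image into non-overlapping 7x7 patches and projecting each patch with a shared weight matrix, global details come from single-head self-attention across the patches, and layer normalization is applied to the features. The patch size, feature width, the choice of self-attention, and all function and parameter names are hypothetical. The weights are random and untrained, and the input is synthetic. Note that a shared projection of patches resembles a strided convolution, so whether it respects the word of caution about CNNs is something to check before relying on it.

```python
import numpy as np

def extract_patches(images, patch=7):
    """Split (N, 28, 28) images into non-overlapping patches -> (N, 16, 49)."""
    n, h, w = images.shape
    gh, gw = h // patch, w // patch
    x = images.reshape(n, gh, patch, gw, patch)
    x = x.transpose(0, 1, 3, 2, 4)            # (N, gh, gw, patch, patch)
    return x.reshape(n, gh * gw, patch * patch)

def layer_norm(x, eps=1e-5):
    """Normalize each feature vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(rng, patch_dim=49, n_patches=16, d=32, n_classes=10):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    s = 0.1
    return {
        "W_embed": rng.normal(0, s, (patch_dim, d)),  # shared by all patches
        "pos": rng.normal(0, s, (n_patches, d)),      # encodes patch position
        "W_q": rng.normal(0, s, (d, d)),
        "W_k": rng.normal(0, s, (d, d)),
        "W_v": rng.normal(0, s, (d, d)),
        "W_out": rng.normal(0, s, (d, n_classes)),
        "b_out": np.zeros(n_classes),
    }

def forward(params, images):
    patches = extract_patches(images)                       # (N, 16, 49)
    # Local details: the same projection is applied to every patch.
    local = np.maximum(0.0, patches @ params["W_embed"])    # (N, 16, d)
    tokens = layer_norm(local + params["pos"])
    # Global details: every patch attends to every other patch.
    q = tokens @ params["W_q"]
    k = tokens @ params["W_k"]
    v = tokens @ params["W_v"]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (N, 16, 16)
    attn = softmax(scores, axis=-1)
    mixed = layer_norm(tokens + attn @ v)                   # residual + norm
    pooled = mixed.mean(axis=1)                             # (N, d)
    logits = pooled @ params["W_out"] + params["b_out"]     # (N, 10)
    return logits, attn

# Synthetic stand-in for a batch of 28x28 images (NOT real MNIST).
rng = np.random.default_rng(0)
images = rng.random((4, 28, 28))
params = init_params(rng)
logits, attn = forward(params, images)
```

The returned `attn` array has one 16x16 matrix per image; after training, inspecting it is one way to see which local blocks the model relates to each other.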
After designing the architecture, you will have to do the following.
-
Report on the mathematical model.
-
Report the obtained performance in terms of the performance metrics.
Compare the performance with that of the model in Part 1,
as well as with previously reported benchmarks.
-
Do you need some other performance metric?
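When thinking about which metrics to report, it may help to see how the common ones follow from a confusion matrix. The following is a small NumPy sketch on made-up labels (4 classes, 8 examples, chosen only for illustration); the function names are hypothetical. In practice, Scikit-learn's `confusion_matrix` and `classification_report` in `sklearn.metrics` provide the same quantities.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=10):
    """cm[i, j] counts examples of true class i that were predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def per_class_metrics(cm):
    """Per-class precision, recall and F1; classes with no support get 0."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)   # column sums: how often each class was predicted
    actual = cm.sum(axis=1)      # row sums: how often each class actually occurs
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1

# Made-up labels and predictions, for illustration only.
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 3])
y_pred = np.array([0, 1, 1, 1, 2, 2, 3, 3])

cm = confusion_matrix(y_true, y_pred, n_classes=4)
accuracy = np.trace(cm) / cm.sum()
precision, recall, f1 = per_class_metrics(cm)
macro_f1 = f1.mean()
```

The off-diagonal entries of `cm` show which digit pairs are confused with each other, which overall accuracy alone does not reveal.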
Demo Schedule:
(To be announced)
Venue:
Demos:
Sumantra Dutta Roy,
Department of Electrical Engineering, IIT Delhi, Hauz Khas,
New Delhi - 110 016, INDIA.
sumantra@ee.iitd.ac.in