ELL784 Assignment 3

Assignment 3

Topic: Neural Networks

Due on or before: 20 April (Saturday) 2024, 11:59 pm

Maximum Marks: 8

(This assignment owes its genesis and existence to Aman Verma's creative and ever-inquisitive mind)

This assignment has two parts; the second is open-ended and carries significant weightage. You will work with the well-known MNIST digit classification dataset (Link: https://git-disl.github.io/GTDLBench/datasets/mnist_datasets/). The dataset contains images of the digits 0-9, and your task is to design neural network models for multi-class digit image classification. You are encouraged to attempt both parts.

For coding: You are free to use any open-source deep-learning framework: PyTorch, TensorFlow (recommended), JAX, etc.
First, familiarize yourself with one of these frameworks. They let you bypass hand-written backpropagation: you define only the forward pass, and the framework takes care of gradient calculation and weight updates during training.
You will have to report results in terms of common classification metrics (you are free to use any library of your choice; Scikit-learn is among the most useful).
The MNIST dataset comes with separate Train/Test sets; you have to use those sets only. Any alteration will attract strict penalties. The purpose of adhering to a stringent dataset protocol is to introduce you to the benchmarking strategies used in machine learning research.
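
As an illustration of the classification metrics mentioned above, here is a minimal NumPy sketch that builds a confusion matrix and derives per-class precision, recall and F1 from it. The function names and the toy labels are assumptions made for this example; they are not MNIST results. Scikit-learn's `confusion_matrix` and `classification_report` report the same quantities.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return cm

def per_class_metrics(cm):
    """Return precision, recall and F1 per class from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # column sums: how often each class was predicted
    actual = cm.sum(axis=1)     # row sums: how often each class truly occurs
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1

# Toy labels, made up for illustration only.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix(y_true, y_pred, num_classes=3)
precision, recall, f1 = per_class_metrics(cm)
accuracy = np.trace(cm) / cm.sum()
```

The off-diagonal entries of the confusion matrix show which digit pairs are confused with each other, which a single accuracy figure hides.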

Part 1 (4 Marks)
MLP-based Network: Design a simple neural network for 10-class classification. Try to achieve the highest classification performance you can.

Use a suitable loss function for multi-class classification.
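
One common choice for multi-class problems is the softmax cross-entropy loss. The sketch below shows it in plain NumPy, assuming integer class labels and raw network outputs (logits); it is an illustration of that loss and not a prescribed solution. The deep-learning frameworks provide their own built-in versions.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between softmax(logits) and integer class labels.

    logits: (N, C) array of raw scores; labels: (N,) array of ints in [0, C).
    """
    # Subtracting the row maximum leaves the softmax unchanged but avoids overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

logits = np.zeros((4, 10))          # an untrained model that scores all classes equally
labels = np.array([3, 1, 4, 1])
loss = softmax_cross_entropy(logits, labels)  # equals log(10) for uniform predictions
```

A loss close to log(10), about 2.30, at the start of training is therefore a useful sanity check for a 10-class model.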
Experiment with:
- Different number of layers
- Different number of neurons in each layer
- Training parameters: learning rate, number of iterations, etc.
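
When comparing networks with different depths and widths, it can help to track how many trainable parameters each one has. The helper below is a small sketch for fully connected networks; the function name and the two candidate architectures are hypothetical and chosen only for illustration.

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network.

    layer_sizes lists the width of every layer, input and output included.
    """
    # Each layer has n_in * n_out weights plus n_out biases.
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Hypothetical candidate architectures for 28x28 inputs and 10 classes.
candidates = {
    "1 hidden layer, 128 units": [784, 128, 10],
    "2 hidden layers, 256 and 64 units": [784, 256, 64, 10],
}
counts = {name: mlp_param_count(sizes) for name, sizes in candidates.items()}
```

Reporting the parameter count next to the accuracy makes it clearer whether a gain comes from a better design or simply from a larger model.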
Interpretability: Once you have attained your best model, try to figure out what features it is learning, i.e., what features are being extracted. For this,
- First, think of some method which will let you analyse the above.
- Try to visualize intermediate representations of different layers.
- Try to make interpretations from the representations.
- Further, you can report these for both correctly classified and misclassified examples. What do misclassified examples suggest?
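
One possible way to inspect intermediate representations is to have the forward pass return every hidden activation, and to view each first-layer weight vector as a 28x28 image. The sketch below assumes a ReLU MLP and uses random weights in place of a trained model; the layer sizes and names are assumptions for illustration.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a ReLU MLP that also returns every hidden activation.

    x: (N, D) inputs. weights[i] has shape (fan_in, fan_out).
    """
    activations = []
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layer
        activations.append(h)
    logits = h @ weights[-1] + biases[-1]  # output layer, no non-linearity
    return logits, activations

# Random weights stand in for a trained model (assumption for illustration).
rng = np.random.default_rng(0)
sizes = [784, 64, 32, 10]
weights = [rng.normal(0, 0.05, (i, o)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(o) for o in sizes[1:]]
x = rng.random((5, 784))  # five fake flattened images
logits, activations = mlp_forward(x, weights, biases)

# Each first-layer unit has 784 incoming weights, viewable as a 28x28 image.
first_layer_filters = weights[0].T.reshape(-1, 28, 28)
```

The arrays in `activations` can then be compared between correctly classified and misclassified inputs, and `first_layer_filters` can be displayed with any plotting library.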
Analyze the misclassified examples. What can you infer about the model's learning i.e., which features is it not able to capture? (Perform this analysis for the best model only).
Summarize your conclusions about the model's shortcomings. Can you suggest some method to overcome these?
Should every training example be given equal weightage? Can you think of something along these lines?
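
One possible reading of the question above is to weight the per-example loss so that poorly fit examples count more. The sketch below uses a focal-loss-style weight (1 - p_true)^gamma as one such choice; this scheme, the function names and the value of `gamma` are assumptions for illustration, and other weightings are equally valid answers.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-example cross-entropy.

    probs: (N, C) predicted class probabilities; labels: (N,) ints;
    weights: (N,) non-negative per-example weights.
    """
    p_true = probs[np.arange(len(labels)), labels]
    per_example = -np.log(np.clip(p_true, 1e-12, 1.0))
    return (weights * per_example).sum() / weights.sum()

def hardness_weights(probs, labels, gamma=2.0):
    """Focal-loss-style weights (1 - p_true)^gamma: larger for poorly fit examples."""
    p_true = probs[np.arange(len(labels)), labels]
    return (1.0 - p_true) ** gamma

# Toy predictions, made up for illustration: the second example is misclassified.
probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
labels = np.array([0, 0, 1])
uniform = weighted_cross_entropy(probs, labels, np.ones(3))
focused = weighted_cross_entropy(probs, labels, hardness_weights(probs, labels))
```

With equal weights the loss is the ordinary mean cross-entropy; with the hardness weights the misclassified example dominates, so `focused` is larger than `uniform` here.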

Part 2 (4 Marks)
An attempt towards improvement: This part is open-ended. You have to design a custom model (as well as its mathematics!) for MNIST digit image classification, with the aim of achieving better performance than in Part 1. The focus here is on introducing you to machine learning research. Report the mathematical model of the designed architecture, as well as the results (in the same form as in Part 1). Your design should address the points below. A word of caution: We are not looking for Convolutional Neural Networks (CNNs) in this part.

Any spatial image has both local and global details. By local details we refer to information contained in any small neighbourhood of pixels. For example, in the first image of the numeral '3', the first (upper) patch and the second (lower) patch suggest the structure of a '3' by virtue of the neighbourhood structure they capture. This highlights the importance of local details.
Local Details: A simple network applied to the whole image input captures, in general, global details. Your architecture should:
- Incorporate some mechanism to encode local details.
- Learn these local details.
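
One possible way to expose local details to a network is to cut each image into small non-overlapping patches and embed every patch with the same learnable linear map. The sketch below assumes 28x28 inputs and a patch size of 7; the patch size, embedding width and function names are assumptions for illustration, and the random matrix stands in for weights that would be learned.

```python
import numpy as np

def image_to_patches(images, patch=7):
    """Split (N, H, W) images into non-overlapping patch x patch blocks.

    Returns an array of shape (N, num_patches, patch * patch), with patches
    listed in row-major order.
    """
    n, h, w = images.shape
    assert h % patch == 0 and w % patch == 0, "patch size must divide the image"
    blocks = images.reshape(n, h // patch, patch, w // patch, patch)
    blocks = blocks.transpose(0, 1, 3, 2, 4)  # (N, rows, cols, patch, patch)
    return blocks.reshape(n, -1, patch * patch)

def embed_patches(patches, w, b):
    """Apply one shared linear map to every patch (learnable in a real model)."""
    return patches @ w + b

images = np.arange(2 * 28 * 28, dtype=float).reshape(2, 28, 28)  # fake images
patches = image_to_patches(images)      # shape (2, 16, 49)
rng = np.random.default_rng(0)
w, b = rng.normal(0, 0.1, (49, 32)), np.zeros(32)
tokens = embed_patches(patches, w, b)   # shape (2, 16, 32)
```

Each row of `tokens` then summarizes one 7x7 neighbourhood, which gives later layers an explicit handle on local structure.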
Global Details: Global details are also important! For example, the second image illustrates that if the model detects the two stroke endings captured by the two patches, it can infer that the image shows the numeral '3'. Thus,
- Your architecture should learn correlations among local blocks to capture global features.
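
One mechanism that relates local blocks to each other is scaled dot-product self-attention over per-patch feature vectors. The sketch below is a single-head NumPy version with random projection matrices; the choice of attention, the dimensions and the mean pooling at the end are all assumptions for illustration, not the expected design.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over patch tokens.

    tokens: (N, P, D). Returns mixed tokens (N, P, D_v) and attention (N, P, P).
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)  # row p: how much patch p attends to each patch
    return attn @ v, attn

rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 16, 32))  # stand-in for 16 patch embeddings per image
w_q, w_k, w_v = (rng.normal(0, 0.1, (32, 32)) for _ in range(3))
mixed, attn = self_attention(tokens, w_q, w_k, w_v)
global_feature = mixed.mean(axis=1)  # pool over patches: one vector per image
```

The attention matrix `attn` is itself interpretable: it shows, for each patch, which other patches contributed to its updated representation.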

Unified Architecture: Now, combine the mechanism to capture both local and global details into one unified architecture. Perform adequate normalization over the features (within the architecture), wherever it is required.
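
As one example of normalization inside an architecture, the sketch below applies layer normalization over the feature axis of each patch vector. Whether layer normalization, batch normalization or something else is appropriate depends on your design; the shapes and names here are assumptions for illustration.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance, then scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
features = rng.normal(5.0, 3.0, size=(2, 16, 32))  # fake per-patch features
# gamma and beta would be learned; ones and zeros leave the normalized values as-is.
normed = layer_norm(features, gamma=np.ones(32), beta=np.zeros(32))
```

Normalizing before combining local and global branches keeps their features on comparable scales, so neither branch dominates purely because of magnitude.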

After designing the architecture, you will have to perform the following.

Report on the mathematical model.
Report the obtained performance in terms of the performance metrics. Compare it with the performance of the model in Part 1, as well as with previous benchmarks.
Do you need some other performance metric?

Demo Schedule:
(To be announced)

Venue:
Demos:

Sumantra Dutta Roy, Department of Electrical Engineering, IIT Delhi, Hauz Khas, New Delhi - 110 016, INDIA. sumantra@ee.iitd.ac.in
