Computer Architecture (ELL782)

General Information

No one shall be permitted to audit the course. People are welcome to sit through it, however. The course is open to all suitably inclined Masters and Doctoral students.

Credits: 3 (LTP: 3-0-0) [Slot A]

Schedule for Classes:

Monday      08:00 - 09:30    IIA-201 (Bharti)
Thursday    08:00 - 09:30    IIA-201 (Bharti)

Schedule for Examinations:

Minor 1: 27 Aug (Tue), LH-310, 08:00am-09:00am
Minor 2: 27 Sep (Fri), LH-310, 08:00am-09:00am

Teaching Assistants:

Devyani Agarwal
Asmita Nandkumar Patil


Books, Papers and other Documentation

Textbook:

Reference Books:

Papers:

[Internal Link: IIT Delhi]

The above list is (obviously!) not exhaustive. Other reference material will be announced in the class. The Web has a vast storehouse of tutorial material on Computer Architecture, and other related areas.

Lecture Schedule, Links to Material (where applicable)

Please see the link to the I Sem (Autumn) 2017-2018 offering of this course for an idea of its approximate structure.
S.No.
Topics
Lectures
Instructor
References/Notes
1
Introduction:
Parallel Computations and Architectures
01-03
SDR
[A. Grama, A. Gupta, G. Karypis, V. Kumar] Chap 2, 4 (the relevant portions, related to what was covered in class)
Microprocessors, Microcontrollers, Models of Computers: the von Neumann and Harvard models, multi-tasking, time-sharing, multiprogramming. Processes and threads (a bare-bones introduction)
22 Jul (Mon) {lecture#01}
SDR
Store-and-forward routing: linear rings, wrap-around meshes, hypercubes. (Cut-through routing is not in the course.) Ideal time complexity, and pseudo-code for the non-ideal case (processors not equally powerful, links not all of the same bandwidth). Basic communication operations: one-to-all broadcast on a linear ring; ideal time complexity.
25 Jul (Thu) {lecture#02}
SDR
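For quick reference, the ideal-case costs of one-to-all broadcast under store-and-forward routing, in the standard Grama et al. notation (t_s = startup time, t_w = per-word transfer time, m = message size in words, p = number of processes). This is a summary from memory, so verify against Chapter 4 of the book:

    T_{\mathrm{ring}}      = (t_s + t_w m)\,\lceil p/2 \rceil
    T_{\mathrm{2D\,mesh}}  = 2\,(t_s + t_w m)\,\lceil \sqrt{p}/2 \rceil   \quad\text{(wrap-around mesh)}
    T_{\mathrm{hypercube}} = (t_s + t_w m)\,\log_2 p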
Basic communication operations: one-to-all broadcast on a linear ring; pseudo-code for the non-ideal case. The case of a wrap-around mesh: the ideal case. The case of a hypercube: the ideal case, and pseudo-code for the non-ideal case.
Introduction to Flynn's classification of computers: SISD, SIMD, MIMD.
29 Jul (Mon) {lecture#03}
SDR
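A small, runnable C sketch of the hypercube one-to-all broadcast covered above. It simulates an 8-node (3-dimensional) cube in a single process instead of using real message passing; the node count and the has_data[] bookkeeping are illustrative assumptions, not part of the course material.

    /* Simulation sketch of one-to-all broadcast on a D-dimensional hypercube
     * (source = node 0).  Instead of real message passing, has_data[] records
     * which nodes hold the message after each communication step.            */
    #include <stdio.h>

    enum { D = 3, P = 1 << D };          /* assumption: an 8-node hypercube */

    int main(void)
    {
        int has_data[P] = { 1 };         /* only node 0 starts with the data */
        int mask = P - 1;

        for (int i = D - 1; i >= 0; i--) {
            mask ^= (1 << i);            /* clear bit i of the mask */
            for (int id = 0; id < P; id++) {
                /* a node takes part in this step if its lower i bits are all 0 */
                if ((id & mask) == 0 && (id & (1 << i)) == 0 && has_data[id]) {
                    int partner = id ^ (1 << i);
                    has_data[partner] = 1;   /* "send" the message across dimension i */
                    printf("step %d: node %d -> node %d\n", D - i, id, partner);
                }
            }
        }
        return 0;
    }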
2
Theory of Pipelining
04-06
SDR
[P. Kogge]
Introduction to linear pipelines (data, instruction). Difference between static and dynamic pipelines. (Only static pipelines will be in the course). Linear pipeline metrics. The basic pipeline scheduling problem. Greedy and non-greedy approaches to pipeline scheduling. Examples with reservation tables A and B (as in the book). The depressing worst-case bound: exponential.
01 Aug (Thu) {lecture#04}
SDR
The MAL Lemma (Lemma 3-1), and its (easy) proof. The need for a better (more compact) representation, than a reservation table: Collision Vectors. State diagrams (modified state diagrams, actually). FSM modelling of the pipeline scheduling problem.
05 Aug (Mon) {lecture#05}
SDR
[Early start: 07:30am]
(to make up for the 19 Sep (Thu) scheduled class to be missed)
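A small C sketch of the collision-vector machinery introduced above: it derives the forbidden latencies and the initial collision vector from a reservation table, and shows the shift-and-OR state transition. The 3-stage, 6-cycle reservation table below is made up for illustration; it is not table A or B from the book.

    /* Collision vectors for a static linear pipeline (Kogge-style analysis). */
    #include <stdio.h>

    #define STAGES 3
    #define CYCLES 6

    static const int rt[STAGES][CYCLES] = {   /* 1 = stage busy in that cycle */
        {1, 0, 0, 0, 0, 1},
        {0, 1, 1, 0, 0, 0},
        {0, 0, 0, 1, 1, 0},
    };

    int main(void)
    {
        /* Forbidden latencies: differences between busy cycles in the same row. */
        unsigned cv = 0;                       /* bit (k-1) set => latency k forbidden */
        for (int s = 0; s < STAGES; s++)
            for (int i = 0; i < CYCLES; i++)
                for (int j = i + 1; j < CYCLES; j++)
                    if (rt[s][i] && rt[s][j])
                        cv |= 1u << (j - i - 1);

        printf("initial collision vector (bit k-1 <=> latency k): 0x%X\n", cv);

        /* State transition: an initiation after a permissible latency k shifts
         * the current state right by k and ORs in the initial collision vector. */
        unsigned state = cv;
        int k = 2;                             /* try latency 2 (must not be forbidden) */
        if (!(state & (1u << (k - 1)))) {
            state = (state >> k) | cv;
            printf("after an initiation at latency %d, new state: 0x%X\n", k, state);
        }
        return 0;
    }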
Lemma 3-2: finding simple cycles suffices. Lemma 3-3: the upper bound. Proof of both. The combined lemmas, and the best and worst case scenarios.
08 Aug (Thu) {lecture#06}
SDR
[Early start: 07:30am]
(to make up for the 19 Sep (Thu) scheduled class to be missed)
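The combined bounds, stated compactly (Kogge's result, summarised from memory; check against Lemmas 3-1 to 3-3 in the book):

    \max_{\text{rows } r}(\text{marks in row } r)
        \;\le\; \mathrm{MAL}
        \;\le\; (\text{number of 1s in the initial collision vector}) + 1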
3
RISC Pipelining
06-07
SDR
[Patterson's paper]
RISC Pipelining: history. The trend towards CISC. Factors leading to RISC.
08 Aug (Thu) {lecture#06} (contd.)
SDR
RISC pipeline hazards: structural, data and control. The Delayed Branch and Optimal Delayed Branch.
19 Aug (Mon) {lecture#07}
SDR
[Early start: 07:30am]
(to make up for the 19 Sep (Thu) scheduled class to be missed)
4
RISC Pipelining: Deeper Issues
07-11
SDR
[Hennessy-Patterson]
Introduction to software issues for RISC pipelines. The role of compilers and assemblers, and their optimisations.
22 Aug (Thu) {lecture#08}
SDR
[Early start: 07:30am]
(to make up for the 10 Oct (Thu) scheduled class to be missed)
Compiler and Assembler optimisations (contd). Details on dependences. Introduction to static pipeline scheduling and loop unrolling.
23 Aug (Fri) {lecture#09}
SDR
[Early start: 07:30am]
(to make up for the 10 Oct (Thu) scheduled class to be missed)
---
Minor 1
27 Aug (Tue)
---
---
Loop unrolling and pipeline scheduling: in isolation, and in combination. Example, issues.
Concluding RISC Pipelining: A mere mention of branch predictors.
29 Aug (Thu) {lecture#10}
SDR
[Early start: 07:30am]
(to make up for the 10 Oct (Thu) scheduled class to be missed)
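A small C illustration of loop unrolling as discussed above. The loop (a DAXPY-style update) and the unroll factor of 4 are arbitrary choices for illustration; the point is that the unrolled body exposes independent operations which static pipeline scheduling can then interleave.

    /* Original loop: y[i] = y[i] + a * x[i] for i = 0 .. n-1. */
    void saxpy(int n, float a, const float *x, float *y)
    {
        for (int i = 0; i < n; i++)
            y[i] = y[i] + a * x[i];
    }

    /* The same loop unrolled by 4: fewer branch and loop-overhead instructions
     * per element, and more independent operations for the static scheduler
     * to interleave so that pipeline latencies can be hidden.                  */
    void saxpy_unrolled(int n, float a, const float *x, float *y)
    {
        int i;
        for (i = 0; i + 3 < n; i += 4) {
            y[i]     = y[i]     + a * x[i];
            y[i + 1] = y[i + 1] + a * x[i + 1];
            y[i + 2] = y[i + 2] + a * x[i + 2];
            y[i + 3] = y[i + 3] + a * x[i + 3];
        }
        for (; i < n; i++)               /* clean-up loop for the leftover elements */
            y[i] = y[i] + a * x[i];
    }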
5
Caches
11-15
SDR
[Hennessy-Patterson]
Cache Basics: Appendix B: Review of Memory Hierarchy. Sections B.1 and B.2
Multiprocessor Cache Coherence: Chapter 5: Thread-Level Parallelism. Sections 5.1 and 5.2

Some lecture notes [Internal links]: [a], [b], [c], [d], [e], [f] (IITD only)
---
02 Sep (Mon) {lecture#xx}
---
(No Class)
Caches: reiteration of the von Neumann and Harvard Architectures. Caches for general purpose processors and DSPs. Cache blocks and memory blocks, and address spaces. The four basic cache questions. Cache organisation: direct mapped, fully associative and set associative. Cache block replacement.
05 Sep (Thu) {lecture#11}
SDR
[Early start: 07:30am]
(to make up for the 02 Sep (Mon) scheduled class to be missed)
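A runnable C sketch of how a byte address splits into tag / index / block offset for a direct-mapped cache. The parameters (32 KiB cache, 64-byte blocks, 32-bit addresses) and the example address are assumptions for illustration only.

    #include <stdio.h>
    #include <stdint.h>

    enum {
        BLOCK_SIZE  = 64,                       /* bytes per cache block       */
        CACHE_SIZE  = 32 * 1024,                /* total cache capacity        */
        NUM_BLOCKS  = CACHE_SIZE / BLOCK_SIZE,  /* 512 blocks (direct-mapped)  */
        OFFSET_BITS = 6,                        /* log2(BLOCK_SIZE)            */
        INDEX_BITS  = 9,                        /* log2(NUM_BLOCKS)            */
    };

    int main(void)
    {
        uint32_t addr = 0x1234ABCDu;            /* an arbitrary example address */

        uint32_t offset = addr & (BLOCK_SIZE - 1);
        uint32_t index  = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1);
        uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);

        printf("addr = 0x%08X -> tag = 0x%X, index = %u, offset = %u\n",
               addr, tag, index, offset);
        return 0;
    }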
Write Back and Write Through policies. Write Allocate and No Write Allocate strategies.
09 Sep (Mon) {lecture#12}
SDR
[Early start: 07:30am]
(to make up for the 02 Sep (Mon) scheduled class to be missed)
Two cache numericals.
1. The two-cache representative numerical: calculating the average miss rate and the average memory access time.
12 Sep (Thu) {lecture#13}
SDR
[Early start: 07:30am]
(to make up for the 02 Sep (Mon) scheduled class to be missed)
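A sketch of the kind of calculation involved, as a small C program. All the numbers below are made up, not the ones used in class; the point is the formula AMAT = HitTime_L1 + MissRate_L1 * (HitTime_L2 + MissRate_L2(local) * MissPenalty_L2), and the distinction between local and global L2 miss rates.

    #include <stdio.h>

    int main(void)
    {
        double hit_time_L1  = 1.0;    /* cycles                                 */
        double miss_rate_L1 = 0.05;   /* misses per L1 access                   */
        double hit_time_L2  = 10.0;   /* cycles                                 */
        double miss_rate_L2 = 0.20;   /* local miss rate: misses per L2 access  */
        double miss_penalty = 100.0;  /* cycles to main memory                  */

        /* AMAT = HitTime_L1 + MissRate_L1 * (HitTime_L2 + MissRate_L2 * MissPenalty) */
        double amat = hit_time_L1
                    + miss_rate_L1 * (hit_time_L2 + miss_rate_L2 * miss_penalty);

        /* Global L2 miss rate = MissRate_L1 * MissRate_L2(local).              */
        double global_L2_miss = miss_rate_L1 * miss_rate_L2;

        printf("AMAT = %.2f cycles, global L2 miss rate = %.3f\n",
               amat, global_L2_miss);
        return 0;
    }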
Multiprocessor Cache Coherence. Directory protocols, and Symmetric Multi-Processor system protocols (the second is of practical importance in multi-core processors). The Write Invalidate protocol (the one used in most multi-core processors).
16 Sep (Mon) {lecture#14}
SDR
[Early start: 07:30am] (to make up for the 23 Sep (Mon) scheduled class to be missed)
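A much-simplified sketch of a write-invalidate protocol for a single cache block, written as a three-state (MSI-style) machine in C. The reduction to three states and the event names are illustrative assumptions; they are not the exact protocol or notation used in Hennessy-Patterson or in class.

    /* A simplified MSI-style write-invalidate controller for one cache block
     * in one cache, assuming a snooping bus.                                  */
    #include <stdio.h>

    typedef enum { INVALID, SHARED, MODIFIED } BlockState;

    typedef enum {
        CPU_READ, CPU_WRITE,          /* requests from this cache's processor    */
        BUS_READ, BUS_WRITE           /* requests observed (snooped) on the bus  */
    } Event;

    static BlockState next_state(BlockState s, Event e)
    {
        switch (e) {
        case CPU_READ:  return (s == INVALID) ? SHARED : s;
        case CPU_WRITE: return MODIFIED;                      /* invalidate others  */
        case BUS_READ:  return (s == MODIFIED) ? SHARED : s;  /* supply/write back  */
        case BUS_WRITE: return INVALID;                       /* another cache writes */
        }
        return s;
    }

    int main(void)
    {
        BlockState s = INVALID;
        Event trace[] = { CPU_READ, BUS_WRITE, CPU_WRITE, BUS_READ };
        const char *names[] = { "INVALID", "SHARED", "MODIFIED" };

        for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++) {
            s = next_state(s, trace[i]);
            printf("after event %u: %s\n", i, names[s]);
        }
        return 0;
    }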
---
19 Sep (Thu) {lecture#xx}
---
(No Class)
---
23 Sep (Mon) {lecture#xx}
---
(No Class)
---
Minor 2
27 Sep (Fri)
---
---
The Write Invalidate Protocol (contd).
30 Sep (Mon) {lecture#15}
SDR
[Early start: 07:30am] (to make up for the 23 Sep (Mon) scheduled class to be missed)
---
10 Oct (Thu) {lecture#xx}
---
(No Class)
6
Vector Processors, SIMD Multimedia Extensions, GPU Architectures
16-22
SDR
[Hennessy-Patterson]
Chapter 4: Data-level Parallelism in Vector, SIMD & GPU Architectures.
The three basic types of SIMD architectures:
1. Vector Processors (Cray)
2. SIMD Multimedia Instruction Set Extensions (Intel)
3. GPUs (NVIDIA)
DAXPY as a common basic operation.
Chaining, Lanes, Convoy, Chime.
8 important vector processor/SIMD questions/concepts.
14 Oct (Mon) {lecture#16}
SDR
[Early start: 07:30am] (to make up for the 23 Sep (Mon) scheduled class to be missed)
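DAXPY, the running example mentioned above, in plain C. The comment sketches how a vector machine would execute it; the grouping into convoys and chimes is the usual textbook view, paraphrased from memory.

    /* DAXPY: Y = a*X + Y.  A vector machine would execute each 64-element
     * strip as: vector load X, vector load Y, multiply, add, vector store,
     * grouped into convoys, each convoy taking roughly one chime.            */
    void daxpy(long n, double a, const double *x, double *y)
    {
        for (long i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }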
SIMD Concept 1: Lanes for > 1 element per cycle
SIMD Concept 2: Vector-length registers, handling loops whose length is not a multiple of the (64-element) vector length
SIMD Concept 3: Conditionals (`IF') inside code to be vectorised
17 Oct (Thu) {lecture#17}
SDR
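A C sketch of strip mining for SIMD Concept 2 above: running a loop of arbitrary length on a machine whose vector registers hold at most MVL = 64 elements. daxpy_vector_strip() is a hypothetical stand-in for "set the vector-length register to len and issue one vector iteration"; it is written as scalar code only for illustration.

    enum { MVL = 64 };                /* maximum vector length (Cray/VMIPS-style) */

    static void daxpy_vector_strip(long start, long len, double a,
                                   const double *x, double *y)
    {
        for (long i = start; i < start + len; i++)   /* one "vector instruction" */
            y[i] = a * x[i] + y[i];
    }

    void daxpy_strip_mined(long n, double a, const double *x, double *y)
    {
        long done = 0;
        long len  = n % MVL;          /* do the odd-sized leftover strip first */
        while (done < n) {
            if (len == 0)
                len = MVL;
            daxpy_vector_strip(done, len, a, x, y);  /* VL register = len      */
            done += len;
            len = MVL;                /* all remaining strips are full MVL     */
        }
    }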
SIMD Concept 4: Memory banks, to provide the bandwidth needed by the vector load/store units.
21 Oct (Mon) {lecture#18}
SDR
SIMD Concept 5: Multi-dimensional matrices and matrix multiplication.
A cursory mention of SIMD Concepts 6 and 7: Gather-Scatter (handling sparse matrices) and Compilers for Vector Architectures
24 Oct (Thu) {lecture#19}
SDR
II. SIMD Instruction Set Extensions for Multimedia
3 main omissions compared to Vector Processors (Cray).
History and generations.
Why have they been successful in spite of some obvious limitations?
31 Oct (Thu) {lecture#20}
SDR
III. GPUs (NVIDIA)
An example with a `grid', a thread block, and individual SIMD threads. Memory at different levels.
04 Nov (Mon) {lecture#21}
SDR
III. GPUs (NVIDIA) (contd.)
CUDA coding, closing notes.
07 Nov (Thu) {lecture#22}
SDR
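A minimal CUDA sketch tying together the grid / thread block / SIMD thread picture above, using DAXPY again. The launch configuration (256 threads per block) and the use of unified memory are convenience assumptions, not something prescribed in the lectures.

    // DAXPY in CUDA: a grid of thread blocks, each thread handling one element.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void daxpy(long n, double a, const double *x, double *y)
    {
        long i = blockIdx.x * (long)blockDim.x + threadIdx.x;   // global thread id
        if (i < n)                       // guard: the grid may be larger than n
            y[i] = a * x[i] + y[i];
    }

    int main()
    {
        const long n = 1 << 20;
        double *x, *y;
        cudaMallocManaged(&x, n * sizeof(double));   // unified memory for brevity
        cudaMallocManaged(&y, n * sizeof(double));
        for (long i = 0; i < n; i++) { x[i] = 1.0; y[i] = 2.0; }

        int threads = 256;
        int blocks  = (int)((n + threads - 1) / threads);
        daxpy<<<blocks, threads>>>(n, 3.0, x, y);    // grid = 'blocks' thread blocks
        cudaDeviceSynchronize();

        printf("y[0] = %f (expect 5.0)\n", y[0]);
        cudaFree(x); cudaFree(y);
        return 0;
    }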
7
Warehouse-Scale Computers
23-23
SDR
[Hennessy-Patterson] Chapter 5
Warehouse-Scale Computers: Clouds: Amazon, Google, Microsoft
11 Nov (Mon) {lecture#23}
SDR
8
Towards Basic Dynamic Scheduling: Scoreboarding, Tomasulo's Algorithm
23-23
SDR
[Hennessy-Patterson] Appendix C
Introducing a basic 5-stage pipeline, and the purpose of each stage
11 Nov (Mon) {lecture#24}
SDR
---
14 Nov (Thu) {lecture#xx}
---
(No Class)


Mini Project

... A combination of theoretical work as well as programming work.
Both will be scrutinized in detail for original work and thoroughness.
For the programming part, there will be credit for good coding.
Spaghetti coding will be penalized.
Program correctness or good programming alone will not fetch you full credit ... also required are results of extensive experimentation, varying the various program parameters, and an explanation of the results thus obtained.
The Mini Project will have to be submitted on or before the due date and time.
Late submissions will not be considered at all.
Unfair means will result in assigning, as marks, the number said to have been discovered by the ancient Indians, to both parties (un)concerned.


Examinations and Grading Information

The marks distribution is as follows (out of a total of 100):
Minor I        28
Minor II       28
Mini Project   16
Major          28
Grand Total   100

ELL782 Evaluation: Mini Project Groups, Examination-related Information [Internal Link: IIT Delhi]

Attendance Requirements:

As per Institute rules.
Illness policy: illness to be certified by the IITD Hospital
Attendance in Examinations is Compulsory.

ELL782 Attendance Records (on the moodle page)
[No absences above the permitted maximum of 25% (7 absences)]


Course Feedback

Link to Course Feedback Form

Sumantra Dutta Roy, Department of Electrical Engineering, IIT Delhi, Hauz Khas,
New Delhi - 110 016, INDIA. sumantra@ee.iitd.ac.in