Computer Architecture (ELL782)

General Information

No one shall be permitted to audit the course. People are welcome to sit through it, however. The course is open to all suitably inclined Masters and Doctoral students.

Credits: 3 (LTP: 3-0-0) [Slot A]

Schedule for Classes:

Monday	08:00 - 09:30	LH-613
Thursday	08:00 - 09:30	LH-613

Schedule for Examinations:

Minor 1: 01 Sep (Fri), LH-606, 08:00am-09:00am
Minor 2: 07 Oct (Sat), LH-606, 08:00am-09:00am
Major: 21 Nov (Tue), LH-521, 08:00am-10:00am

Teaching Assistants:

Hemant Goyal
Akash Nayak

Books, Papers and other Documentation

Textbook:

J. L. Hennessy, D. A. Patterson. Computer Architecture: A Quantitative Approach. Fifth Edition, Elsevier, 2012.

Reference Books:

D. A. Patterson, J. L. Hennessy. Computer Organization and Design: The Hardware/Software Interface. Third Edition, Elsevier, 2005.
J. P. Hayes. Computer Architecture and Organization. Third Edition, WCB/McGraw-Hill, 1998.
K. Hwang, F. A. Briggs. Computer Architecture and Parallel Processing. McGraw-Hill, 1985.
M. M. Mano. Computer System Architecture. Third Edition, PHI, 1993.
P. Kogge. Architecture of Pipelined Computers. McGraw-Hill, 1977.
W. Stallings. Computer Organization and Architecture. Macmillan Publishing Company, 1986.
W. Stallings. Computer Organization and Architecture: Designing for Performance. Sixth Edition, Pearson Education, 2003.
K. Hwang. Advanced Computer Architecture: Parallelism, Scalability, Programmability. McGraw-Hill, 1993.
J. D. Carpinelli. Computer Systems Organization & Architecture. Pearson Education Asia, 2001.
W. Stallings. Reduced Instruction Set Computers. Second Edition, IEEE Computer Society Press, 1990.
B. Govindarajalu. Computer Architecture and Organization: Design Principles and Applications. Tata McGraw-Hill, 2004.
H. S. Stone (Ed.). Introduction to Computer Architecture. Second Edition, Galgotia Publications Pvt. Ltd., 1990.
A. Grama, A. Gupta, G. Karypis, V. Kumar. Introduction to Parallel Computing. Second Edition, Addison-Wesley, 2003.
D. P. Bertsekas, J. N. Tsitsiklis. Parallel and Distributed Computation. Prentice-Hall International, Inc., 1989.
C. Hamacher, Z. Vranesic, S. Zaky. Computer Organization. Fifth Edition, McGraw-Hill, 2002.
A. S. Tanenbaum. Structured Computer Organization. Fourth Edition, Pearson Education, 1999.

Papers:

D. A. Patterson. Reduced Instruction Set Computers. Communications of the ACM, vol. 28, no. 1, January 1985, pp. 8-21.

[Internal Link: IIT Delhi]

The above list is (obviously!) not exhaustive. Other reference material will be announced in class. The Web has a vast storehouse of tutorial material on Computer Architecture and related areas.

Lecture Schedule, Links to Material (where applicable)

S.No.	Topics	Lectures	Instructor	References/Notes
1	Introduction: Parallel Computations and Architectures	01-03	SDR	[A. Grama, A. Gupta, G. Karypis, V. Kumar] Chap 2, 4 (the relevant portions, related to what was covered in class)
	Microprocessors, Microcontrollers, Models of Computers: the von Neumann and Harvard models, multi-tasking, time-sharing, multiprogramming. Processes and threads (a bare-bones introduction)	24 Jul (Mon) {lecture#01}	SDR
	Store-and-forward routing, linear rings, wrap-around meshes, hypercubes. (Cut-through routing is not in the course). Ideal time complexity, and pseudo-code for the non-ideal case (processors not equally powerful, links not of the same bandwidth). Basic Communication operations: One-to-all broadcast on a linear ring, and on a wrap-around mesh.	27 Jul (Thu) {lecture#02}	SDR
	One-to-all broadcast on a hypercube. Introduction to linear pipelines (data, instruction). Difference between static and dynamic pipelines. (Only static pipelines will be in the course). Linear pipeline metrics. The basic pipeline scheduling problem. Greedy and non-greedy approaches to pipeline scheduling. Examples with reservation tables A and B (as in the book). The depressing worst-case bound: exponential.	31 Jul (Mon) {lecture#03}	SDR
2	Theory of Pipelining	03-05	SDR	[P. Kogge]
	The MAL Lemma (Lemma 3-1), and its (easy) proof. The need for a better (more compact) representation, than a reservation table: Collision Vectors. State diagrams (modified state diagrams, actually). FSM modelling of the pipeline scheduling problem.	03 Aug (Thu) {lecture#04}	SDR
	Simple cycles and compound cycles. Lemma 3-2, and its proof, and physical significance. Finding all greedy cycles. Lemma 3-3 and its proof. Summary of all the lemmas, and the overall practical significance of the above exercise.	10 Aug (Thu) {lecture#05}	SDR
		14 Aug (Mon)	---	(No class: make-up class on 19 Aug (Sat))
3	RISC Pipelining	06-07	SDR	[Patterson's paper]
	RISC Pipelining: history. The trend towards CISC. Factors leading to RISC.	17 Aug (Thu) {lecture#06}	SDR
	RISC pipeline hazards: structural, data and control. The Delayed Branch and Optimal Delayed Branch. Introduction to software issues for RISC pipelines.	19 Aug (Sat) {lecture#07}	SDR	Make-up class (in lieu of missed 14 Aug (Mon) class): 09:00am-10:30am. Venue: II-241 (EE Committee Room)
4	RISC Pipelining: Deeper Issues	07-11	SDR	[Hennessy-Patterson]
	Software issues for RISC pipelines. Compiler stages. Compiler optimisations.	21 Aug (Mon) {lecture#08}	SDR
	Software issues: Data dependences, Name dependences (Anti-dependences, Output dependences), Control dependences. Pipeline scheduling and loop unrolling: an introduction.	24 Aug (Thu) {lecture#09}	SDR
	Loop unrolling and pipeline scheduling: in isolation, and in combination. Example, issues.	28 Aug (Mon) {lecture#10}	SDR	Early class 07:30am-09:00am
---	Minor 1	01 Sep (Fri)	---	---
5	Caches	11-17	SDR	[Hennessy-Patterson] Cache Basics: Appendix B: Review of Memory Hierarchy, Sections B.1 and B.2. Multiprocessor Cache Coherence: Chapter 5: Thread-Level Parallelism, Sections 5.1 and 5.2. Some lecture notes [Internal links]: [a], [b], [c], [d], [e], [f] (IITD only)
	Concluding RISC Pipelining: A mere mention of branch predictors. Caches: reiteration of the von Neumann and Harvard Architectures. Caches for general purpose processors and DSPs. Cache blocks and memory blocks, and address spaces. The four basic cache questions. Cache organisation: direct mapped, fully associative and set associative. Cache block replacement.	04 Sep (Mon) {lecture#11}	SDR
	Cache block replacement strategies. Write strategies: Write-Back and Write-Through.	07 Sep (Thu) {lecture#12}	SDR
	Write-Allocate and No-Write-Allocate. Two basic Cache formulae, and their physical significance.	11 Sep (Mon) {lecture#13}	SDR
	Split Caches and Unified Caches: the complete loaded example!	14 Sep (Thu) {lecture#14}	SDR
	Multiprocessor Cache Coherence: SMP architecture and its significance, the issue of cache coherence for Write-Back caches, and the popular Write Invalidate Protocol.	18 Sep (Mon) {lecture#15}	SDR
	The Write Invalidate Protocol (contd)	21 Sep (Thu) {lecture#16}	SDR
	The Write Invalidate Protocol (contd)	25 Sep (Mon) {lecture#17}	SDR
6	Vector Processors, SIMD Multimedia Extensions, GPU Architectures	18-26	SDR	[Hennessy-Patterson] Chapter 4
	Vector Processors, SIMD Multimedia Extensions, GPU Architectures: an introduction. History. I. Vector Architectures. VMIPS	28 Sep (Thu) {lecture#18}	SDR
---	Minor 2	06 Oct (Fri)	---	---
	DAXPY as an important operation. Chaining, Lanes, Convoys. Clarifications on the Write Invalidate Protocol for Multiprocessor Cache Coherence.	09 Oct (Mon) {lecture#19}	SDR
	Important Questions for Vector Architectures. 1. Lanes for > 1 element per cycle 2. Vector Length Registers: Handling loops not equal to 64 3. `IF' (conditional) inside code to be vectorised	12 Oct (Thu) {lecture#20}	SDR
	4. Memory Banks: for Bandwidth to Vector Load/Store Units	23 Oct (Mon) {lecture#21}	SDR
	5. Multiple Dimensional Matrices 6. Gather-Scatter 7. Compilers and Vector Computers	26 Oct (Thu) {lecture#22}	SDR
	II. SIMD Instruction Set Extensions. Historical Perspective. Why have they been successful?	30 Oct (Mon) {lecture#23}	SDR
	Discussion on threads (and processes), and virtual memory: instructions can be paused	02 Nov (Thu) {lecture#24}	SDR
	III. Graphics Processing Units (GPUs). `__host__` and `__device__`/`__global__`	06 Nov (Mon) {lecture#25}	SDR
	GPUs: an example of grids, thread blocks and SIMD parallelism with a vector multiplication example.	09 Nov (Thu) {lecture#26}	SDR
7	Warehouse-Scale Computers	27-27	SDR	[Hennessy-Patterson] Chapter 6
	Warehouse-Scale Computers	13 Nov (Mon) {lecture#27}	SDR
	---	16 Nov (Thu) {lecture#28}	SDR
---	Major	21 Nov (Tue)	---	---
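
As a small illustration of item 2 of lecture 20 in the schedule above (Vector Length Registers: handling loops not equal to 64), the following is a minimal Python sketch of strip mining applied to DAXPY. It is not taken from the course material: the maximum vector length of 64, the function name `daxpy_strip_mined`, and the choice to process the odd-sized strip first are assumptions made for illustration, and the inner loop merely stands in for a single vector instruction.

```python
MVL = 64  # assumed maximum vector length (the "64" in the schedule above)


def daxpy_strip_mined(a, x, y, mvl=MVL):
    """Return (a*x + y, strip_lengths), using strips of at most mvl elements."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    result = list(y)
    strip_lengths = []
    low = 0
    vl = n % mvl  # first strip takes the odd-sized remainder (may be 0)
    for _ in range(n // mvl + 1):
        # One "vector instruction": operate on vl elements, as a vector
        # length register set to vl would allow.
        for i in range(low, low + vl):
            result[i] = a * x[i] + result[i]
        strip_lengths.append(vl)
        low += vl
        vl = mvl  # every later strip uses the full vector length
    return result, strip_lengths
```

For example, `daxpy_strip_mined(2.0, [1.0] * 130, [3.0] * 130)` processes strips of 2, 64 and 64 elements, so a loop of length 130 needs three passes rather than failing to fit the 64-element vector registers.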

Mini Project

... A combination of theoretical work as well as programming work.
Both will be scrutinized in detail for original work and thoroughness.
For the programming part, there will be credit for good coding.
Spaghetti coding will be penalized.
Program correctness or good programming alone will not fetch you full credit ... also required are results of extensive experimentation, varying the program parameters, and explanations of the results thus obtained.
The Mini Project will have to be submitted on or before the due date and time.
Late submissions will not be considered at all.
Use of unfair means will result in both parties (un)concerned being assigned, as marks, the number said to have been discovered by the ancient Indians.

Examinations and Grading Information

The marks distribution is as follows (out of a total of 100):

Minor I	28
Minor II	28
Mini Project	16
Major	28
Grand Total	100

ELL782 Evaluation: Mini Project Groups, Examination-related Information [Internal Link: IIT Delhi]

Attendance Requirements:

As per Institute rules.
Illness policy: illness to be certified by the IITD Hospital
Attendance in Examinations is Compulsory.

ELL782 Complete Attendance Records (24.07.2017-16.11.2017)
[No absences above the permitted maximum of 25% (7 absences)]

Course Feedback

Link to Course Feedback Form

Sumantra Dutta Roy, Department of Electrical Engineering, IIT Delhi, Hauz Khas, New Delhi - 110 016, INDIA. sumantra@ee.iitd.ac.in