Distributed Systems

Course: CSL860
Semester II, 2016-17
Credits: 4 (3-0-2)



Instructor: Prof. Smruti R. Sarangi

Lectures: Tuesday and Friday, 5 to 6 PM; Wednesday, 12 to 1 PM; Bharti Building 106

Course Description: This course will give an introduction to some advanced aspects of distributed systems.

Course Load: 1 Mid-term, 1 End-term, Minor 1 (Programming Assignment 1), Minor 2 (Programming Assignment 2), Programming Assignment 3

Evaluation: Attendance and class participation (15%), Midterm (15%), End term (20%), Assignment 1 (15%), Assignment 2 (15%), Assignment 3 (20%)

Teaching Assistant: Ismi Abidi

Reference Books
[Relevant Reference for Most Concepts] Distributed Systems: An Algorithmic Approach (Sukumar Ghosh, CRC)
[For relevant background] Distributed Systems: Principles and Paradigms (Andrew S. Tanenbaum and Maarten van Steen)
[Reference on distributed algorithms] Distributed Algorithms by Nancy Lynch
    OR
    Introduction to Distributed Algorithms by Gerard Tel
[Reference on advanced OS concepts] Advanced Concepts in Operating Systems (Singhal and Shivaratri)


Lectures and Slides:

Date
Lecture
References
Overview and Background
Jan 3rd
Course Overview
(slides adapted from Prof. Maarten van Steen's original slides with consent)

Jan 4th
Design of distributed systems, communication protocols, RPC
1) RPC in C (link)
2) Java RMI (link)
Jan 6th
RPC (demonstration on .Net), Message Queues

Information Storage and Retrieval (Gossiping, P2P Networks, DHTs)
Jan 10th
Epidemic Based Algorithms
Epidemic Algorithms for Replicated Database Maintenance
Jan 11th
Gossip Based Algorithms
A Gossip-Style Failure Detection Service
Jan 13th
Napster and Gnutella
Napster and Gnutella: A Comparison of Two Popular Peer to Peer Protocols
Jan 17th
Pastry DHT
Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
Jan 18th
Chord DHT
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Jan 20th
Chord DHT

Jan 24th
Freenet
Freenet: A Distributed Anonymous Information Storage and Retrieval System
Jan 25th
BitTorrent
Wikipedia page
Dissecting BitTorrent: Five Months in a Torrent's Lifetime
Kademlia: A Peer-to-Peer Information System Based on the XOR Metric
Distributed Algorithms
Jan 27th
Logical Clocks, Physical Clocks, GPS
Time, Clocks, and the Ordering of Events in a Distributed System
Feb 7th
Lamport's Mutual Exclusion Algorithm

Feb 8th
Ricart-Agrawala Algorithm
Maekawa's Algorithm

Maekawa's Algorithm (paper)
Feb 10th
Token based Mutual Exclusion

Feb 14th
Leader Election

Feb 15th
Leader Election

Feb 17th
Minimum Spanning Tree (GHS Algorithm)
Notes can be found in Scopes
Mar 7th
Midterm Solutions

Mar 8th
Consistency
Sequential consistency (tutorial), video (after 44th minute)
Mar 10th
Consistency
Impossibility of Distributed Consensus with One Faulty Process
Mar 14th
Paxos
The Part-Time Parliament
Paxos Made Simple
Mar 15th
Paxos Proof and 2-Phase Commit
Fault Tolerance

Mar 17th
3-Phase Commit

Mar 25th
Fault Tolerance
3-Phase Commit, Basic Concepts in Fault Tolerance
The Byzantine Generals Problem
Impossibility of Distributed Consensus with One Faulty Process
Mar 25th
Consistent Checkpoints, Message Logging

Mar 29th
Guest Lecture: Dr. Rishi Sinha (Microsoft, Redmond)
1) Azure Service Cloud
2) RAFT Consensus Algorithm
Azure Service Fabric
RAFT Consensus Protocol
Mar 31st
Guest Lecture: Dr. Subodh Sharma
Slides
Communication Deadlocks in MPI Programs
Practical Applications
Apr 5th
Google Percolator, Slides
Large Scale Incremental Processing Using Distributed Transactions and Notifications
Apr 8th
Google Percolator

Apr 7th
Amazon Dynamo
Dynamo: Amazon’s Highly Available Key-value Store
Apr 11th
Filesystems: basic concepts
AFS and NFS: Slides
Slides used at IIT Delhi with permission of David A. Eckhardt. The original course web page can be found here.


Apr 12th
AFS and NFS

Apr 14th
Coda
1. Coda: A Highly Available File System for a Distributed Workstation Environment
2. Fallacies of Distributed Systems
Apr 14th
Corona
A High Performance Publish-Subscribe System for the World Wide Web
Apr 18th
Corona

Apr 19th
Condor
Distributed Computing in Practice: The Condor Experience
Apr 21st
DryadLINQ
DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language
Apr 25th
Facebook: Photo Storage
Finding a Needle in a Haystack: Facebook's Photo Storage
Apr 26th
Facebook: Cassandra
Cassandra: A Decentralized Structured Storage System
Apr 28th
Voldemort (LinkedIn)
Project Voldemort -- Distributed Key-Value Storage