COL863: Special Topics in Theoretical Computer Science
Topic: Concentration Inequalities and their Applications in Computer Science
Also listed as COL7188 for 2025 Entry students
I semester: 2025-26
Amitabha Bagchi
Class Timings: M Th 3:30PM-5PM.
Venue: LH 421
Course objectives
The purpose of this course is to discuss various inequalities that help bound the probability that a random variable, or sum of random variables, is far from its expected value. We will show how these inequalities are used in various area of computer science like algorithms, networks, and machine learning.
Topics
Background required: Elementary probability, some linear algebra.
- Some examples of finding tail bounds using elementary methods.
- Markov's inequality and the first moment method; Chebyshev's inequality and the second moment method.
- The Cramer-Chernoff Method; a direct application to approximating Personalized Page Rank.
- The independent Bernuolli variable case; some applications, e.g., randomised rounding etc.
- Hoeffding's inequality; application to random Fourier approximation in Machine Learning.
- Martingales and the Azuma-Hoeffding inequality; application to reachability algorithms for graphs using random walks.
- McDiarmid's inequality; application to proving generalization error bounds in Machine Learning.
- Bernstein's inequality; application to proving the Johnson-Lindenstrauss Lemma.
- The Matrix Bernstein inequality; random Fourier approximation revisited.
If we have time left, and based on interest, we may also cover other topics. A dscussion of other possibilities will be held around the midpoint of the class.
Text
- [BLM16] S. Boucheron, G. Lugosi, P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press, 2016.
Other references
- [Bagchi04] Amitabha Bagchi, Lecture notes for a simple proof (due to Satish Rao) for the "power of two choices" result, 2004.
- [Bagchi20] Amitabha Bagchi, Efficient approximation of kernel functions: Lecture notes for COV883, II Semester 2019-20, arXiv:2005.01566, 2020.
- [Bagchi25] Amitabha Bagchi, Concentration bounds for the running time of randomized quicksort, July 2025.
- [BE02] Olivier Bousquet, Andre Elisseeff, Stability and Generalization. J. Mach. Learn. Res. 2: 499-526 (2002).
- [Bhatia07] Rajendra Bhatia, Positive Definite Matrices, Princeton University Press, 2007.
- [FRCS05] D. Fogaras, B. Racz, K. Csalogany, and T. Sarlos, Towards Scaling Fully Personalized PageRank: Algorithms, Lower Bounds, and Experiments, Internet Mathematics 2(3):333-358, 2005.
- [Harvey14] Nicholas Harvey, Lecture Notes for CPSC 536N: Randomized algorithms, University of British Columbia, 2014-15.
- [Harvey15] Nicholas Harvey, Lecture 3: Sparsifiers, Notes for lecture given at the Sixth Cargese Workshop on Combinatorial Optimization, Corsica, France, 2015.
- [MR04] Rajeev Motwani and Prabhakar Raghavan, Randomized Algorithms, Cambridge University Press, 2004.
- [MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing, Cambridge University Press, 2005.
- [Roch22] Sebastien Roch, Modern Discrete Probability: An Essential Toolkit (Lecture Notes), University of Wisconsin, 2022.
- [Tropp2015] Joel A. Tropp, An Introduction to Matrix Concentration Inequalities, Foundations and Trends in Machine Learning, Vol 8, No. 1-2, pp 1-230, 2015. doi:10.1561/2200000048.
Also available at: arXiv:1501.01571 [math.PR].
- [Tu16] Stephen Tu, Class notes on Random Fourier Features (locally mirrored since the original has gone offline), Univesity of California, Berkeley, 2016.
Course calendar
Evaluation
Evaluation will be on the basis of 4 take-home exams, each of equal weight. No in-room exams will be held for this class during the scheduled Midterm and Major exam period given in the institute calendar.
- Minor 1. Due on gradescope by 11:59PM on Saturday, 23 August 2025.
- Minor 2. Due on gradescope by 11:59PM on Saturday, 20 September 2025.
Exam policies
- You are allowed to discuss solutions but must write them out yourself without using an AI tools to write your solution. Please give credit to your classmates where due.
- Latexed solutions are preferred but handwritten is acceptable. The page limit is 2 pages per solution. This has to be respected.
- Late submission is not allowed under any circumstances. No emailed solution will be accepted. These are mutliple day exams so there is no excuse for not submitting on time.
Attendance policy
If your attendance is below 75% at the end of the class (i.e 20 lectures or lower) your 4th take-home exam will not be graded. If your attendance in the first 14 lectures < 50% (6 lectures or lower) you will be withdrawn from the class and none of your remaining papers will be graded.
Note 1: For the purposes of the above policy, any class that has not been held or cancelled by the instructor will be counted as a "Present" for all registered students.
Note 2: Attendance will be taken at 3:35PM, i.e., 5 minutes after the scheduled beginning time for the class. If you reach after your name is called you will be marked absent. No exceptions will be made. To make this policy fair, the roll order will be randomised in each class.
Note 3: Assume that there are n students in the class and that student A's arrival time to class is a random variable X with mean 3:30pm + Y where E[Y] = k (minutes) and Var[Y] = \sigma. Assuming that calling out one student's name takes 3 seconds and that m classes are held, give a tail bound (in terms k, n, \sigma and m) that student A will fall below 75% attendance. Assume that the arrival time for each day is independent of all other arrival times. Given an arbitrary \epsilon > 0, what is the largest value of k for which the probability of A's falling below 75% attendance is at most \epsilon? A complete analysis of this will earn you 1 free attendance :-)
Plagiarism policy
If you copy even a single line from anyone else's paper or any book, website, or any other source (including AI assistants of any kind) your score for the entire course will be set to 0.
Audit policy
Audit pass will be given if
- you have 75% attendance and
- you score marks that would get you to B- or above if you were crediting the course.
Amitabha
Bagchi