Stochastic Control and Decision Theory
Course Outline
General Information (Winter 2026)
- Instructor
- Aditya Mahajan
- Office Hours: TBD
- Lectures
- 4:05pm–5:35pm Wednesday, Friday (Wong 1030)
- Prerequisites
- ECSE 509 (Graduate Probability)
- Communication
- Use the discussion board on myCourses for all questions related to the course. Only personal emails related to medical exceptions for missing a deliverable will be answered.
Course Content
The objective of the course is to provide the background necessary to read and understand research papers in stochastic control and reinforcement learning. Therefore, the emphasis is on understanding both the intuition behind the results and the details of their proofs. We will go through all the proofs in class.
We will also study examples from different application domains: communications, operations research, control systems, and power systems, with an emphasis on modeling and establishing qualitative properties of the optimal solution.
Course Material
The material for the lecture notes is taken from various sources, including the reference books listed below. If you find any typos/mistakes in the notes, please let me know. Pull requests are welcome.
Reference books
Kumar and Varaiya, Stochastic Systems: Estimation, Identification, and Adaptive Control, Prentice Hall, 1986. Reprinted by SIAM, 2015.
A gentle introduction which emphasizes the key conceptual ideas.
Bertsekas, Dynamic Programming and Optimal Control, vol. 1 and 2, Athena Scientific, 2005.
Perhaps the most comprehensive book on different topics in dynamic programming.
Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, 1994.
Excellent source of algorithms for perfectly observed systems, in particular, infinite horizon dynamic programs.
Ross, Introduction to Stochastic Dynamic Programming, Academic Press, 1983.
Excellent introduction to dynamic programming, from the point of view of applied mathematics.
Denardo, Dynamic Programming: Models and Applications, Prentice Hall, 1982.
Excellent introduction to dynamic programming, from the point of view of operations research.
Powell, Approximate Dynamic Programming, John Wiley and Sons, 2011.
Comprehensive overview of approximate dynamic programming.
Krishnamurthy, Partially Observed Markov Decision Processes, Cambridge University Press, 2016.
Comprehensive overview of POMDPs.
Sargent and Stachurski, Dynamic Programming, 2023.
Nice summary of DP ideas applied to economic models. Good mix of theory and numerical examples.
Kochenderfer, Wheeler, and Wray, Algorithms for Decision Making, MIT Press, 2022.
Broad introduction to decision making under uncertainty. Lots of nice examples.
Evaluation
Assignments (30%) Weekly homework assignments. Typically, each assignment will consist of four or five questions, out of which one or two randomly selected questions will be graded.
Mid Term (25%) Closed book in-class exam. 13 March (during class time)
Term Project (20%) A month-long term project to be done in groups of two. Present one paper or book chapter on a topic of your choice related to the material covered in class.
Final Exam (25%) Closed book, in-person exam. Will be scheduled by the exam office and the dates will be announced later.
The final exam will cover all the material seen in the class during the term.
Marking policy
Assignments must be submitted electronically on myCourses as a PDF. You may write the assignments on paper and then scan them as a PDF (there are several such apps available for all phone platforms), or write on a tablet and convert to PDF, or type using a word processor.
There will be no make-up examination for students who miss the mid-term.
Students who miss the exam for a valid reason (see Faculty of Engineering policy) should notify the instructor within a week of the exam and provide the necessary documentation.
If, and only if, proper documentation for a missed exam is presented, the marks for the missed exam will be shifted to the final exam.
Students who miss the mid-term exam for any other reason (e.g., no medical note, going to the exam at the wrong time, or on the wrong day, etc.) will get zero marks on the exam.
Any request for re-evaluation of a mid-term or an assignment must be made in writing within a week of its return. Note that requesting a re-grade means that your WHOLE assignment or exam will be re-graded.
- Right to submit in English or French written work that is to be graded.
- In accord with McGill University’s Charter of Students’ Rights, students in this course have the right to submit in English or in French any written work that is to be graded.
- Academic Integrity
McGill University values academic integrity. Therefore all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Disciplinary Procedures (see McGill’s guide to academic honesty for more information).
L’université McGill attache une haute importance à l’honnêteté académique. Il incombe par conséquent à tous les étudiants de comprendre ce que l’on entend par tricherie, plagiat et autres infractions académiques, ainsi que les conséquences que peuvent avoir de telles actions, selon le Code de conduite de l’étudiant et des procédures disciplinaires (pour de plus amples renseignements, veuillez consulter le guide pour l’honnêteté académique de McGill.)
Course delivery
The course is taught in a “chalk and board” style; there will be no PowerPoint presentations. All students are expected to attend lectures and take notes. Partial notes on some of the material will be provided, but they are not a substitute for the material covered in class.
© Instructor-generated course materials (e.g., handouts, notes, summaries, exam questions) are protected by law and may not be copied or distributed in any form or in any medium without explicit permission of the instructor. Note that infringements of copyright can be subject to follow up by the University under the Code of Student Conduct and Disciplinary Procedures.
Additional Notes
As the instructor of this course I endeavor to provide an inclusive learning environment. However, if you experience barriers to learning in this course, do not hesitate to discuss them with me or contact the office of Student Accessibility and Achievement.
End-of-course evaluations are one of the ways that McGill works towards maintaining and improving the quality of courses and the student’s learning experience. You will be notified by e-mail when the evaluations are available. Please note that a minimum number of responses must be received for results to be available to students.
How to cite these notes
To cite these lecture notes, please use:
@misc{506notes,
author = {Aditya Mahajan},
title = {Lecture notes on Stochastic Control and Decision Theory},
year = {2026},
howpublished = "\url{https://adityam.github.io/stochastic-control/}",
}