Stochastic Control
Assignment 8
Author: Aditya Mahajan
Affiliation: McGill University
Updated: March 10, 2026
Exercise 16.2 from the notes on MDP algorithms.