Stochastic Control
Assignment 8
Author: Aditya Mahajan
Affiliation: McGill University
Updated: March 10, 2026
Exercise 16.2 from the notes on MDP algorithms.