16-745: Optimal Control & Reinforcement Learning
Welcome to 16-745 Optimal Control and Reinforcement Learning at Carnegie Mellon University!
Course Description
This is a course about how to make robots move through and interact with their environment with speed, efficiency, and robustness. We will survey a broad range of topics from nonlinear dynamics, linear systems theory, classical optimal control, numerical optimization, state estimation, system identification, and reinforcement learning. The goal is to provide students with hands-on experience applying each of these ideas to a variety of robotic systems so that they can use them in their own research.
Prerequisites: Strong linear algebra skills, experience with a high-level programming language like Python, MATLAB, or Julia, and basic familiarity with ordinary differential equations.
Teaching Staff
Zachary Manchester
Instructor
zacm@cmu.edu
Kevin Tracy
Head Teaching Assistant
ktracy@cmu.edu
JJ Lee
Teaching Assistant
jeonghunlee@cmu.edu
Fausto Vega
Teaching Assistant
fvega@andrew.cmu.edu
Arun Bishop
Teaching Assistant
arunleob@cmu.edu
Sam Schoedel
Teaching Assistant
sschoede@andrew.cmu.edu
Logistics
- Lectures will be held Tuesdays and Thursdays 5:00-6:20 PM Eastern time in TEP 1403. Lectures will also be livestreamed on Zoom and recorded for later viewing. The Zoom link for lectures is available on Canvas.
- Recitation will be held Fridays 11:00 AM-12:00 PM on Zoom.
- Office hours are here
- Homework assignments will be due by 11:59 PM Eastern time, two weeks after they are assigned.
- Quizzes are released every Friday, due the following Tuesday at 11:59 PM Eastern time.
- GitHub will be used to distribute assignments and Gradescope will be used for submissions.
- Piazza will be used for general discussion and Q&A outside of class and office hours.
- There will be no exams. Instead, students will form groups of up to five to complete a project on a topic of their choice.
Learning Objectives
By the end of this course, students should be able to do the following:
- Analyze the stability of dynamical systems
- Design LQR controllers that stabilize equilibria and trajectories
- Use offline trajectory optimization to design trajectories for nonlinear systems
- Use online convex optimization to implement model-predictive control
- Understand the effects of stochasticity and model uncertainty
- Directly optimize feedback policies when good models are unavailable
Learning Resources
There is no required textbook for this course. Video recordings of lectures and lecture notes will be posted online. Additional references for further reading will be provided with each lecture. Relevant (free) background material is available in the background section of this website.
Homework
Four homeworks will be assigned during the semester. Students will have at least two weeks to complete each assignment. All homework will be distributed and collected using GitHub. Solutions and grades will be returned within one week of homework due dates.
Grading
Grading will be based on:
Weight | Criteria |
---|---|
50% | Project |
40% | Homework |
10% | Quizzes/participation |
Attendance during lectures is not required to earn a full participation grade. Students can also participate through any combination of office hours, Piazza discussions, project presentations, and by offering constructive feedback about the course to the instructors.
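As a rough illustration of how the weights in the grading table combine, the sketch below computes a weighted course score. The function name `course_score`, the 0-100 score scale, and the example scores are assumptions made for illustration only; they are not part of the official grading procedure.

```python
# Weights are taken from the grading table above.
# The function name and the 0-100 scale are hypothetical, for illustration.
WEIGHTS = {"project": 0.50, "homework": 0.40, "participation": 0.10}


def course_score(scores):
    """Return the weighted course score, assuming each component is on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores: 0.5 * 90 + 0.4 * 85 + 0.1 * 100
example = course_score({"project": 90.0, "homework": 85.0, "participation": 100.0})
print(round(example, 2))
```

Because the project carries half of the weight, a change of ten points in the project score moves the overall score by five points under this sketch.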
Project Guidelines
Students should work in groups of 1--5 to complete a substantial final project. The goal is for students to apply the course content to their own research. Project proposals will be solicited on the first homework and topics will be selected in consultation with the instructors.
Project grades will be based on a short presentation given during the last week of class and a final report submitted via Google Drive by May 10, Anywhere on Earth. Reports should be written in the form of a 6-page (plus references) ICRA or IROS conference paper using the standard two-column IEEE format. Sections should include an abstract, introduction and/or background to motivate your problem, 2--3 main technical sections on your contributions, conclusions, and references. Grading will be based on the following criteria:
Weight | Criteria |
---|---|
10% | Class presentation |
10% | Adherence to IEEE formatting and length requirements |
10% | Innovation & Creativity: Is what you did new/cool/interesting? Convince me. |
30% | Clarity of presentation: Can I understand what you did from your writing + plots? |
40% | Technical correctness: Are your results reasonable? Is your code correct? |
Course Policies
Late Homework: Students are allowed a budget of 6 late days for turning in homework with no penalty throughout the semester. They may be used together on one assignment, or separately on multiple assignments. Beyond these six days, no other late homework will be accepted.
Accommodations for Students with Disabilities: If you have a disability and are registered with the Office of Disability Resources, I encourage you to use their online system to notify me of your accommodations and discuss your needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.
Statement of Support for Students' Health & Well-Being: Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with stress.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit http://www.cmu.edu/counseling. Consider reaching out to a friend, faculty, or family member you trust for help getting connected to the support that can help.
If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:
CaPS: 412-268-2922
Re:solve Crisis Network: 888-796-8226
If the situation is life threatening, call the police:
On campus: CMU Police: 412-268-2323
Off campus: 911
If you have questions about this or your coursework, please let me know. Thank you, and have a great semester.
Schedule
Week | Dates | Topics | Assignments |
---|---|---|---|
1 | Jan 16, Jan 18 | Course Overview & Dynamics Intro; Stability, Discrete-Time Dynamics | Survey, HW0 Out |
2 | Jan 23, Jan 25 | Optimization Intro; Numerical Optimization Pt. 1 | HW0 Due, HW1 Out |
3 | Jan 30, Feb 1 | Numerical Optimization Pt. 2 & Optimal Control Intro; Pontryagin, Shooting Methods, & LQR Intro | |
4 | Feb 6, Feb 8 | LQR as a QP & Riccati Equation; Dynamic Programming & Intro to Convexity | HW1 Due, HW2 Out |
5 | Feb 13, Feb 15 | Convex Model-Predictive Control; Intro to Trajectory Optimization, Iterative LQR, & DDP | |
6 | Feb 20, Feb 22 | DDP with Constraints and Free Final Time; Direct Trajectory Optimization, Collocation, & SQP | HW2 Due, HW3 Out |
7 | Feb 27, Feb 29 | Attitude Intro: SO(3) & Quaternions; Optimizing with Attitude | |
8 | Mar 5, Mar 7 | No Class; No Class | |
9 | Mar 12, Mar 14 | LQR with Attitude, Quadrotors, & Contact Intro; Trajectory Optimization for Hybrid Systems | HW3 Due, HW4 Out |
10 | Mar 19, Mar 21 | Data-Driven Methods & Iterative Learning Control; Stochastic Optimal Control & LQG | |
11 | Mar 26, Mar 28 | Robust Control & Minimax DDP; RL from an Optimal Control Perspective | HW4 Due |
12 | Apr 2, Apr 4 | Practical Tips & Tricks, Control History; Case Study: How to Land a Rocket | |
13 | Apr 9, Apr 11 | Case Study: How to Drive a Car; No Class | |
14 | Apr 16, Apr 18 | Case Study: How to Walk; TBD | |
15 | Apr 23, Apr 25 | Project Presentations; Project Presentations | |