16-745: Optimal Control & Reinforcement Learning

Welcome to 16-745 Optimal Control and Reinforcement Learning at Carnegie Mellon University!

Course Description

This is a course about how to make robots move through and interact with their environment with speed, efficiency, and robustness. We will survey a broad range of topics from nonlinear dynamics, linear systems theory, classical optimal control, numerical optimization, state estimation, system identification, and reinforcement learning. The goal is to provide students with hands-on experience applying each of these ideas to a variety of robotic systems so that they can use them in their own research.

Prerequisites: Strong linear algebra skills, experience with a high-level programming language like Python, MATLAB, or Julia, and basic familiarity with ordinary differential equations.

Teaching Staff

Zachary Manchester

Instructor
zacm@cmu.edu

Kevin Tracy

Head Teaching Assistant
ktracy@cmu.edu

JJ Lee

Teaching Assistant
jeonghunlee@cmu.edu

Fausto Vega

Teaching Assistant
fvega@andrew.cmu.edu

Arun Bishop

Teaching Assistant
arunleob@cmu.edu

Sam Schoedel

Teaching Assistant
sschoede@andrew.cmu.edu

Logistics

  • Lectures will be held Tuesdays and Thursdays 5:00-6:20 PM Eastern time in TEP 1403. Lectures will also be livestreamed on Zoom and recorded for later viewing. The Zoom link for lectures is available on Canvas.
  • Recitation will be held Fridays 11:00 AM-12:00 PM on Zoom.
  • Office hours are here
  • Homework assignments will be due by 11:59 PM Eastern time, two weeks after they are assigned.
  • Quizzes are released every Friday, due the following Tuesday at 11:59 PM Eastern time.
  • GitHub will be used to distribute assignments and Gradescope will be used for submissions.
  • Piazza will be used for general discussion and Q&A outside of class and office hours.
  • There will be no exams. Instead, students will form groups of up to five to complete a project on a topic of their choice.

Learning Objectives

By the end of this course, students should be able to do the following:

  1. Analyze the stability of dynamical systems
  2. Design LQR controllers that stabilize equilibria and trajectories
  3. Use offline trajectory optimization to design trajectories for nonlinear systems
  4. Use online convex optimization to implement model-predictive control
  5. Understand the effects of stochasticity and model uncertainty
  6. Directly optimize feedback policies when good models are unavailable

Learning Resources

There is no required textbook for this course. Video recordings of lectures and lecture notes will be posted online. Additional references for further reading will be provided with each lecture. Relevant (free) background material is available in the background section of this website.

Homework

Four homeworks will be assigned during the semester. Students will have at least two weeks to complete each assignment. All homework will be distributed and collected using GitHub. Solutions and grades will be returned within one week of homework due dates.

Grading

Grading will be based on:

Weight Criteria
50% Project
40% Homework
10% Quizzes/participation
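
As a rough illustration of how the weights above combine, the sketch below computes a weighted course grade. The weights come from the table; the 0-100 score scale, the component names, and the function name `course_grade` are assumptions made for this example and are not part of the official grading procedure.

```python
# Weights copied from the grading table; everything else is illustrative.
WEIGHTS = {
    "project": 0.50,
    "homework": 0.40,
    "quizzes_participation": 0.10,
}


def course_grade(scores):
    """Return the weighted grade from per-component scores (assumed 0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, chosen only to show the arithmetic.
example = course_grade(
    {"project": 90.0, "homework": 80.0, "quizzes_participation": 100.0}
)
print(round(example, 2))
```

With these hypothetical scores the contributions are 45 from the project, 32 from homework, and 10 from quizzes/participation.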

Attendance during lectures is not required to earn a full participation grade. Students can also participate through any combination of office hours, Piazza discussions, project presentations, and constructive feedback about the course to the instructors.

Project Guidelines

Students should work in groups of 1-5 to complete a substantial final project. The goal is for students to apply the course content to their own research. Project proposals will be solicited on the first homework and topics will be selected in consultation with the instructors.

Project grades will be based on a short presentation given during the last week of class and a final report submitted via Google Drive by May 10, Anywhere on Earth. Reports should be written in the form of a 6-page (plus references) ICRA or IROS conference paper using the standard two-column IEEE format. Sections should include an abstract, an introduction and/or background to motivate your problem, 2-3 main technical sections on your contributions, conclusions, and references. Grading will be based on the following criteria:

Weight Criteria
10% Class presentation
10% Adherence to IEEE formatting and length requirements
10% Innovation & Creativity: Is what you did new/cool/interesting? Convince me.
30% Clarity of presentation: Can I understand what you did from your writing + plots?
40% Technical correctness: Are your results reasonable? Is your code correct?

Course Policies

Late Homework: Students are allowed a budget of six late days for turning in homework with no penalty throughout the semester. They may be used together on one assignment or separately on multiple assignments. Beyond these six days, no other late homework will be accepted.
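
The late-day budget amounts to simple bookkeeping, which the sketch below illustrates. The budget of six days comes from the policy above; counting lateness in whole days and the function name `remaining_late_days` are assumptions made for this example, not an official tool or rule.

```python
LATE_DAY_BUDGET = 6  # from the late homework policy


def remaining_late_days(days_late_per_assignment):
    """Return unused late days, given whole days late for each assignment.

    Raises ValueError if an entry is negative or the budget is exceeded,
    since work beyond the budget would not be accepted.
    """
    if any(days < 0 for days in days_late_per_assignment):
        raise ValueError("days late cannot be negative")
    used = sum(days_late_per_assignment)
    if used > LATE_DAY_BUDGET:
        raise ValueError(f"{used} late days exceeds the budget of {LATE_DAY_BUDGET}")
    return LATE_DAY_BUDGET - used


# Hypothetical usage: two days late on one homework, three on another.
print(remaining_late_days([2, 0, 3, 0]))
```

The same function covers both cases in the policy: all six days spent on one assignment (`[6]`) or spread across several.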

Accommodations for Students with Disabilities: If you have a disability and are registered with the Office of Disability Resources, I encourage you to use their online system to notify me of your accommodations and discuss your needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.

Statement of Support for Students' Health & Well-Being: Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with stress.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit http://www.cmu.edu/counseling. Consider reaching out to a friend, faculty member, or family member you trust for help getting connected to support.

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:

CaPS: 412-268-2922

Re:solve Crisis Network: 888-796-8226

If the situation is life threatening, call the police:

On campus: CMU Police: 412-268-2323

Off campus: 911

If you have questions about this or your coursework, please let me know. Thank you, and have a great semester.

Schedule

Week | Dates | Topics | Assignments
1 | Jan 16 / Jan 18 | Course Overview & Dynamics Intro / Stability, Discrete-Time Dynamics | Survey, HW0 Out
2 | Jan 23 / Jan 25 | Optimization Intro / Numerical Optimization Pt. 1 | HW0 Due, HW1 Out
3 | Jan 30 / Feb 1 | Numerical Optimization Pt. 2 & Optimal Control Intro / Pontryagin, Shooting Methods, & LQR Intro |
4 | Feb 6 / Feb 8 | LQR as a QP & Riccati Equation / Dynamic Programming & Intro to Convexity | HW1 Due, HW2 Out
5 | Feb 13 / Feb 15 | Convex Model-Predictive Control / Intro to Trajectory Optimization, Iterative LQR, & DDP |
6 | Feb 20 / Feb 22 | DDP with Constraints and Free Final Time / Direct Trajectory Optimization, Collocation, & SQP | HW2 Due, HW3 Out
7 | Feb 27 / Feb 29 | Attitude Intro: SO(3) & Quaternions / Optimizing with Attitude |
8 | Mar 5 / Mar 7 | No Class / No Class |
9 | Mar 12 / Mar 14 | LQR with Attitude, Quadrotors, & Contact Intro / Trajectory Optimization for Hybrid Systems | HW3 Due, HW4 Out
10 | Mar 19 / Mar 21 | Data-Driven Methods & Iterative Learning Control / Stochastic Optimal Control & LQG |
11 | Mar 26 / Mar 28 | Robust Control & Minimax DDP / RL from an Optimal Control Perspective | HW4 Due
12 | Apr 2 / Apr 4 | Practical Tips & Tricks, Control History / Case Study: How to Land a Rocket |
13 | Apr 9 / Apr 11 | Case Study: How to Drive a Car / No Class |
14 | Apr 16 / Apr 18 | Case Study: How to Walk / TBD |
15 | Apr 23 / Apr 25 | Project Presentations / Project Presentations |