Welcome to 16-745: Optimal Control and Reinforcement Learning at Carnegie Mellon University!

Course Description

This is a course about how to make robots move through and interact with their environment with speed, efficiency, and robustness. We will survey a broad range of topics from nonlinear dynamics, linear systems theory, classical optimal control, numerical optimization, state estimation, system identification, and reinforcement learning. The goal is to provide students with hands-on experience applying each of these ideas to a variety of robotic systems so that they can use them in their own research.

Prerequisites: Strong linear algebra skills, experience with a high-level programming language like Python, MATLAB, or Julia, and basic familiarity with ordinary differential equations.

Teaching Staff

Zachary Manchester

Instructor
zacm@cmu.edu

JJ Lee

Head Teaching Assistant
jeonghunlee@cmu.edu

Arun Bishop

Teaching Assistant
arunleob@cmu.edu

Fausto Vega

Teaching Assistant
fvega@andrew.cmu.edu

John Zhang

Teaching Assistant
johnzhang@cmu.edu

Ashley Kline

Teaching Assistant
ankline@cmu.edu

Logistics

Lectures will be held Tuesdays and Thursdays 12:30-1:50 PM EST in GHC 4401. Lectures will also be live-streamed on Zoom and recorded for later viewing. The Zoom links for lectures and office hours are available on Piazza and Canvas.

Office hours, lecture schedule, and deadlines can be found on the course calendar.
Homework assignments will be due on Thursdays at 11:59 PM EST, two weeks after they are assigned.
Quizzes are released every Friday and are due the following Tuesday at 11:59 PM EST.
GitHub will be used to distribute assignments and Gradescope will be used for submissions.
Piazza will be used for general discussion and Q&A outside of class and office hours.
There will be no exams. Instead, students will form groups of up to five to complete a project on a topic of their choice.

Learning Objectives

By the end of this course, students should be able to do the following:

Analyze the stability of dynamical systems
Design LQR controllers that stabilize equilibria and trajectories
Use offline trajectory optimization to design trajectories for nonlinear systems
Use online convex optimization to implement model-predictive control
Understand the effects of stochasticity and model uncertainty
Directly optimize feedback policies when good models are unavailable

Learning Resources

There is no textbook required for this course. Video recordings of lectures and lecture notes will be posted online. Additional references for further reading will be provided with each lecture. Relevant (free) background material is available in the background section of this website.

Homework

Four homeworks will be assigned during the semester. Students will have at least two weeks to complete each assignment. All homework will be distributed and collected using GitHub. Solutions and grades will be returned within one week of homework due dates.

Grading

Grading will be based on:

Weight	Criteria
50%	Project
40%	Homework
5%	Quizzes
5%	Participation
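
As a rough illustration of how the weights above combine, the following minimal sketch computes a weighted course grade. The function name, the category keys, and the 0-100 score scale are assumptions made for this example; they are not an official grading tool, and the example scores are made up.

```python
# Weights (in percent) taken from the grading table above.
# The dictionary keys and the 0-100 score scale are illustrative assumptions.
WEIGHTS_PERCENT = {
    "project": 50,
    "homework": 40,
    "quizzes": 5,
    "participation": 5,
}


def course_grade(scores):
    """Return the weighted course grade for per-category scores in [0, 100]."""
    missing = set(WEIGHTS_PERCENT) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS_PERCENT[name] * scores[name] for name in WEIGHTS_PERCENT)
    return weighted_sum / 100


# Hypothetical scores, for illustration only.
example_scores = {"project": 90, "homework": 85, "quizzes": 100, "participation": 100}
total = course_grade(example_scores)
print(total)
```

Because the project and homework together account for 90% of the grade, quiz and participation scores shift the total by at most a few points.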

Attendance during lectures is not required to earn a full participation grade. Students can also participate through any combination of office hours, Piazza discussions, project presentations, and by offering constructive feedback about the course to the instructors.

Project Guidelines

Students should work in groups of 1 to 5 to complete a substantial final project. The goal is for students to apply the course content to their own research. Project proposals will be solicited on the first homework and topics will be selected in consultation with the instructors.

Project grades will be based on a short presentation given during the last week of class and a final report submitted via Google Drive by May 10 (Anywhere on Earth). Reports should be written in the form of a 6-page (plus references) ICRA or IROS conference paper using the standard two-column IEEE format. Sections should include an abstract, an introduction and/or background to motivate your problem, 2 to 3 main technical sections on your contributions, conclusions, and references. Grading will be based on the following criteria:

Weight	Criteria
10%	Class presentation
10%	Adherence to IEEE formatting and length requirements
10%	Innovation & Creativity: Is what you did new/cool/interesting? Convince me.
30%	Clarity of presentation: Can I understand what you did from your writing + plots?
40%	Technical correctness: Are your results reasonable? Is your code correct?

Course Policies

Late Homework: Students are allowed a budget of 6 late days for turning in homework with no penalty throughout the semester. They may be used together on one assignment, or separately on multiple assignments. Beyond these six days, no other late homework will be accepted.
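
The late-day budget above can be tracked with a few lines of arithmetic. This is a minimal sketch that assumes late days are counted in whole days per assignment; the function name and the example numbers are hypothetical, and the course staff's own accounting takes precedence.

```python
LATE_DAY_BUDGET = 6  # total late days per semester, from the policy above


def remaining_late_days(days_late_per_homework):
    """Return the late days left after the given per-assignment usage.

    Assumes whole-day counting (an assumption of this sketch). Raises
    ValueError if the usage exceeds the budget, since no further late
    homework is accepted beyond it.
    """
    if any(days < 0 for days in days_late_per_homework):
        raise ValueError("late days cannot be negative")
    used = sum(days_late_per_homework)
    if used > LATE_DAY_BUDGET:
        raise ValueError(f"used {used} late days, budget is {LATE_DAY_BUDGET}")
    return LATE_DAY_BUDGET - used


# Hypothetical usage: 2 late days on one homework and 3 on another.
left = remaining_late_days([2, 0, 3, 0])
print(left)
```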

Accommodations for Students with Disabilities: If you have a disability and are registered with the Office of Disability Resources, I encourage you to use their online system to notify me of your accommodations and discuss your needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.

Statement of Support for Students' Health & Well-Being: Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with stress.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit http://www.cmu.edu/counseling. Consider reaching out to a friend, faculty, or family member you trust for help getting connected to the support that can help.

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:

CaPS: 412-268-2922

Re:solve Crisis Network: 888-796-8226

If the situation is life threatening, call the police:

On campus: CMU Police: 412-268-2323

Off campus: 911

If you have questions about this or your coursework, please let me know. Thank you, and have a great semester.

Schedule

This is subject to change. HW deadlines will be updated as the semester progresses.

Week	Dates (Tue / Thu)	Topics (Tue / Thu)	Assignments
1	Jan 14 / Jan 16	Course Overview & Dynamics Intro / Stability, Discrete-Time Dynamics	Survey; HW0 Out
2	Jan 21 / Jan 23	Optimization Intro / Numerical Optimization Pt. 1	Quiz 1 Due; HW0 Due, HW1 Out
3	Jan 28 / Jan 30	Numerical Optimization Pt. 2 & Optimal Control Intro / Regularization & Merit Functions	Quiz 2 Due
4	Feb 4 / Feb 6	Pontryagin & Shooting Methods / LQR in 3 Ways	Quiz 3 Due; HW1 Due, HW2 Out
5	Feb 11 / Feb 13	Dynamic Programming & Intro to Convexity / Convex Model-Predictive Control	Quiz 4 Due
6	Feb 18 / Feb 20	Intro to Nonlinear Trajectory Optimization / Differential Dynamic Programming & iLQR	Quiz 5 Due; HW2 Due, HW3 Out
7	Feb 25 / Feb 27	No Class / Direct Trajectory Optimization, Collocation, & SQP	Quiz 6 Due
8	Mar 4 / Mar 6	No Class / No Class	
9	Mar 11 / Mar 13	Attitude Intro: SO(3) & Quaternions / Optimizing with Attitude	
10	Mar 18 / Mar 20	LQR with Attitude, Quadrotors, & Contact Intro / Trajectory Optimization for Hybrid Systems	Quiz 7 Due; HW3 Due, HW4 Out
11	Mar 25 / Mar 27	Data-Driven Methods & Iterative Learning Control / Stochastic Optimal Control & LQG	Quiz 8 Due
12	Apr 1 / Apr 3	Robust Control & Minimax DDP / No Class	Quiz 9 Due
13	Apr 8 / Apr 10	Practical Tips & Tricks, RL from an Optimal Control Perspective / Case Study: How to Drive a Car	Quiz 10 Due; HW4 Due
14	Apr 15 / Apr 17	Case Study: How to Land a Rocket / Case Study: How to Walk	
15	Apr 22 / Apr 24	Project Presentations / Project Presentations	