Course Information


This advanced seminar course takes an in-depth look at the latest research in computer vision and related fields. The goal of the course is to expose students to a wide range of topics and trends, including multimodal learning, language and vision, deep reinforcement learning, embodied vision, image synthesis, graph networks and intuitive physics. Students will read, present and critique a curated set of research papers, and complete a semester-long project in a topic area that interests them. The course is taught by Aykut Erdem.

Instruction style: During the semester, students are responsible for studying and keeping up with the course material outside of class time. This may involve reading particular book chapters, papers or blogs and watching video lectures. After the first three lectures, a student will present a paper related to that week's topics each week.

Time and Location

Lectures: Wednesday at 09:00-11:50 (Room D5)


The course webpage will be updated regularly throughout the semester with lecture notes, presentations, and important deadlines. All other course-related communications will be carried out through Piazza. Please enroll by following the link


This course is designed to familiarize students with the current state of the art, so a solid background in computer vision and deep learning is strongly recommended. The course is open to all graduate students in the CENG department. Non-CENG graduate students, however, should ask the course instructor for approval before the add/drop period. Prospective senior undergraduate students may sit in on the class. If you are unsure whether you have the background, consider the following list of prerequisites:

  • Programming (you should be proficient enough to implement your course project)
  • Calculus (differentiation, chain rule) and Linear Algebra (vectors, matrices), Basic Probability and Statistics (random variables, expectations, Bayes rule, conditional probabilities)
  • Optimization (cost functions, taking gradients, regularization)
  • Machine Learning (you can still get through this course without having taken a deep learning course, but one is highly recommended. Some introductory ML courses are BBM406 Fundamentals of Machine Learning, CMP712 Machine Learning, CMP684 Neural Networks, and CMP784 Deep Learning.)
  • Computer Vision (you should have some familiarity with problems in computer vision. Some introductory CV and related courses are BBM416 Fundamentals of Computer Vision, CMP719 Computer Vision, BBM413 Fundamentals of Image Processing, and CMP717 Image Processing.)

Course Requirements and Grading

Grading for CMP722 will be based on

  • Paper critiques (21%) (3% per paper)
  • Paper presentations (28%) (12% overview, 8% pros, and 8% cons)
  • Project presentations (23%) (5% proposal, 8% update, and 10% final presentation)
  • Project reports and code (28%) (8% progress report, 10% final report, 10% demo)
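
To see how the four components combine, here is a minimal sketch of the weighted-sum arithmetic. The weights are copied from the list above; the function name `course_grade`, the dictionary keys, and the assumption that each component is scored on a 0-100 scale are illustrative only and are not part of any official grading tool.

```python
# Component weights in percent, copied from the grading breakdown above.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "paper_critiques": 21,        # 3% per paper, which implies 7 critiques
    "paper_presentations": 28,    # 12% overview + 8% pros + 8% cons
    "project_presentations": 23,  # 5% proposal + 8% update + 10% final
    "project_reports_code": 28,   # 8% progress + 10% final report + 10% demo
}


def course_grade(scores):
    """Return the weighted course grade on a 0-100 scale.

    `scores` maps each component name to a score between 0 and 100
    (an assumption made for this sketch).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {
        "paper_critiques": 90,
        "paper_presentations": 80,
        "project_presentations": 85,
        "project_reports_code": 75,
    }
    print(course_grade(example))
```

The weights add up to 100, so a student who scores 100 on every component receives a course grade of 100.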


Date   | Topic                                                | Assignments
Feb 27 | Introduction to the course                           |
Mar 6  | Neural Networks Basics, Spatial Processing with CNNs |
Mar 13 | Sequential Processing with NNs, Attention            | Paper selections
Mar 20 | Discussions on project proposals                     |
Mar 27 | Multimodality                                        |
Apr 3  | Language and vision                                  |
Apr 10 | Deep reinforcement learning                          |
Apr 17 | Embodied vision                                      |
Apr 24 | Project progress presentations                       |
May 1  | No class                                             | Project progress reports due
May 8  | Image synthesis                                      |
May 15 | Graph networks                                       |
May 22 | Modeling the Physical World                          |
May 29 | Final project presentations                          | Final project report due

Detailed Syllabus and Lectures


Reference Books

  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016 (draft available online)

Similar Courses

Deep Learning Frameworks