BIL 722: Advanced Topics in Computer Vision
(Deep Learning for Computer Vision)

Spring 2016

By Sylwia Bartyzel from unsplash.com
Google's Deep Dream re-interprets Georges Seurat's A Sunday Afternoon on the Island of La Grande Jatte (1884). The generated image courtesy of Alex Korbonits.

Course Information

Description

This is a graduate seminar course exploring recent advances in computer vision, with a special focus on deep learning. In particular, the class will take an in-depth look at common deep architectures and their applications to various problems in computer vision. The topics include image/scene/video classification, object detection, segmentation, action/activity recognition, image captioning and visual question answering.

Time and Location

Lectures: Tuesday at 13:30-16:30 (Room D9)

Course Instructor

Aykut Erdem

Email: aykut@cs.hacettepe.edu.tr
Homepage: http://web.cs.hacettepe.edu.tr/~aykut
Office Hours: By appointment (send email)

Communication

The course webpage will be updated regularly throughout the semester with lecture notes, presentations, assignments and important deadlines. All other course-related communication will be carried out through Piazza. Please enroll by following the link https://piazza.com/hacettepe.edu.tr/spring2016/bil722.

Pre-requisites

Courses in computer vision and/or machine learning (e.g. BBM 406, BBM 416, BIL 712, BIL 719). Good programming skills for the assignment(s) and the course project.

Course Requirements and Grading

Grading for BIL 722 will be based on:

  • homework (20%),
  • course project, done in pairs (presentation and reports) (40%),
  • paper presentations (25%),
  • class participation (attendance, participation in discussions, response papers) (15%).

Schedule

Date | Topic | Reading | Presenters
Feb 9 | Background and Basics [slides]: course information, what is deep learning, linear classification, nearest neighbor classifiers, hyperparameter search, cross-validation, loss functions, stochastic gradient descent | | Aykut Erdem
Feb 16 | Training Neural Networks [slides]: feedforward neural networks, activation functions, backpropagation, hyperparameter optimization, weight initialization, batch normalization, dropout | | Erkut Erdem
Feb 23 | Convolutional Neural Networks (ConvNets) [slides]; Caffe Tutorial [slides] | | Aykut Erdem, Cagdas Bak
Mar 1 | Course Project [slides]; ConvNets In Practice: Image Classification | | Hilal Ergun Akyuz, Mehmet Gunel, Semih Yagcioglu
Mar 8 | ConvNets In Practice: Scene Classification and Object Detection; TensorFlow Tutorial [slides] | | Bora Celikkale, Kemal Cizmeciler, Goksu Erdogan, M. Kerim Yucel
Mar 15 | ConvNets In Practice: Segmentation | | Cagdas Bak, Mehmet Gunel, Goksu Erdogan
Mar 22 | ConvNets In Practice: Video Classification; Theano Tutorial [slides] | | Cemil Zalluhoglu, Iman Rezazadeh, Cagdas Bak, Semih Yagcioglu
Mar 29 | ConvNets In Practice: Misc | | Aysun Kocak, Okay Arik, Berkan Demirel
Apr 5 | Recurrent Neural Networks (RNNs) [slides]: backpropagation through time (BPTT), memory units, LSTMs | | Nazli Ikizler Cinbis
Apr 12 | RNNs In Practice: Language and Vision; Keras Tutorial [slides] | | Berkan Demirel, Mert Kilickaya, Muhammet Ali Asan
Apr 19 | Progress Presentations | |
Apr 26 | RNNs In Practice: Video Classification | | Ozge Yalcinkaya, Mehmet Kerim Yucel, Ezgi Peksen Soysal
May 3 | RNNs In Practice: Object Recognition and Segmentation | | Ceren Guzel Turhan, Semih Yagcioglu, Okay Arik
May 10 | Unsupervised Deep Learning: Boltzmann machines and log-bilinear models, autoencoders | | Hilal Ergun Akyuz, Levent Karacan

Presentations

Depending on the class enrollment, each student is required to present one or two papers over the course of the semester. Each presentation should be clear, well organized and very technical, and roughly 30 minutes long. The presenter should read the assigned paper in detail and be prepared to effectively lead the class discussion on the paper.

To prepare your presentation, you can use any presentation tool (e.g., PowerPoint, Keynote, LaTeX) provided that it can export the slides to PDF. You are allowed to reuse material that already exists on the web as long as you clearly cite the source of any media used in your presentation. Extra credit will be awarded to students who also conduct experiments demonstrating how the method works in practice.

Deadline: You should meet with the instructor 3-4 days before the presentation date to discuss your slides, and the presentation should be submitted by the night before the class.

Suggested Outline:

  • High-level overview of the paper (main contributions)
  • Problem statement and motivation (clear definition of the problem, why it is interesting and important)
  • Key technical ideas (overview of the approach)
  • Experimental set-up (datasets, evaluation metrics, applications)
  • Strengths and weaknesses (discussion of the results obtained)
  • Connections with other work (how it relates to other approaches, its similarities and differences)
  • Future direction (open research questions)

The presentations will be graded according to this rubric.

Homework

Due: March 15, 2016 (12:30pm)

In this homework, you will learn first-hand how to fine-tune a pre-trained model to classify cultural events, using the image data from the ChaLearn Looking at People 2015 Challenge (CVPR 2015).

In particular, the purpose of this homework is to familiarize you with the fundamentals of training and understanding convolutional networks, namely

  • applying dropout, batch normalization and data augmentation to reduce overfitting,
  • combining models into ensembles to improve the performance,
  • using transfer learning to adapt a pre-trained model to a new dataset,
  • using data gradients to visualize saliency maps.
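
As a rough illustration of the ensembling item above, one common approach is to average the class probabilities predicted by several models and pick the class with the highest average. This is only a sketch under that assumption, not necessarily the procedure the homework page prescribes; the function names and the logits are hypothetical, and plain Python lists stand in for real network outputs.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Average the softmax outputs of several models for one image.

    per_model_logits: one list of class scores per model.
    Returns (predicted_class_index, averaged_probabilities).
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    averaged = [sum(p[c] for p in probs) / n_models for c in range(n_classes)]
    return averaged.index(max(averaged)), averaged


# Hypothetical scores from three models for a 3-class problem.
# Two models favor class 0 and one favors class 1.
example_logits = [
    [3.0, 0.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
label, avg_probs = ensemble_predict(example_logits)
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh uncertain ones; either choice is a design decision to justify in your report.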

You can use the deep learning framework of your choice (e.g. Caffe, Torch, Theano, Keras) as long as your implementation meets the requirements stated above.
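
Whichever framework you pick, the saliency-map requirement reduces to the same idea: take the gradient of a class score with respect to the input pixels and look at its magnitude. The sketch below is a framework-free illustration that approximates this gradient with central finite differences; in practice you would obtain it with a single backward pass through the network. The names `saliency_map` and `toy_class_score` and the toy "network" itself are hypothetical stand-ins, not part of the assignment.

```python
def saliency_map(score_fn, pixels, eps=1e-4):
    """Approximate |d score / d pixel| for every pixel of a flattened image.

    score_fn: maps a list of pixel values to a scalar class score.
    pixels:   flattened image as a list of floats.
    """
    saliency = []
    for i in range(len(pixels)):
        plus = list(pixels)
        minus = list(pixels)
        plus[i] += eps
        minus[i] -= eps
        # Central finite difference stands in for backpropagation here.
        grad_i = (score_fn(plus) - score_fn(minus)) / (2.0 * eps)
        saliency.append(abs(grad_i))
    return saliency


def toy_class_score(x):
    """Hypothetical two-unit ReLU 'network' producing one class score."""
    h1 = max(0.0, 2.0 * x[0] - 1.0 * x[1])
    h2 = max(0.0, 0.5 * x[2])
    return h1 + 3.0 * h2  # x[3] never influences the score


image = [1.0, 0.5, 2.0, 4.0]
saliency = saliency_map(toy_class_score, image)
# Approximately [2.0, 1.0, 1.5, 0.0]: the last pixel is irrelevant to the score.
```

Reshaping the resulting values back to the image dimensions (and taking the maximum over color channels, if any) gives a map that can be displayed next to the input.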

For more details on the homework, see this page.

Course Project

Students taking the course are required to complete a research-oriented project. Students can work individually or in pairs. The course project may involve

  • Design of a novel approach and its experimental analysis, or
  • An extension to a recent study of non-trivial complexity and its experimental analysis.

For a detailed description of the course project and the related schedule, see this page. In preparing your progress and final project reports, you should use the provided LaTeX template and submit them electronically in PDF format.

Resources

Reference Books

  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press (in preparation) (draft available online)

Related Classes

Software

Datasets


© 2016 Hacettepe University