*what is self-supervised learning, self-supervised learning in NLP, self-supervised learning in vision*

Please study the following material in preparation for the class:

- Self-Supervised Representation Learning, Lilian Weng.

- Yann LeCun's PAISS 2019 talk on Self-Supervised Learning

- Distributed Representations of Words and Phrases and their Compositionality, Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean, NIPS 2013.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, NAACL 2019.
- Context Encoders: Feature Learning by Inpainting, Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros, CVPR 2016.
- Unsupervised Visual Representation Learning by Context Prediction, Carl Doersch, Abhinav Gupta, Alexei A. Efros, ICCV 2015.
- Unsupervised Representation Learning by Predicting Image Rotations, Spyros Gidaris, Praveer Singh, Nikos Komodakis, ICLR 2018.
- Representation Learning with Contrastive Predictive Coding, Aaron van den Oord, Yazhe Li, Oriol Vinyals, arXiv preprint arXiv:1807.03748, 2018.
- A Simple Framework for Contrastive Learning of Visual Representations, Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton, arXiv preprint arXiv:2002.05709, 2020.
- Revisiting Self-Supervised Visual Representation Learning, Alexander Kolesnikov, Xiaohua Zhai, Lucas Beyer, CVPR 2019.
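
Several of the vision papers above build the training signal directly from the data; Gidaris et al.'s pretext task, for instance, rotates each image and asks the network to predict the rotation. A minimal NumPy sketch of that batch construction (the toy image shapes are arbitrary):

```python
import numpy as np

def make_rotation_batch(images):
    """Build a self-supervised batch: each image is rotated by
    0/90/180/270 degrees and labeled with the rotation index (0-3)."""
    rotated, labels = [], []
    for img in images:
        for k in range(4):
            rotated.append(np.rot90(img, k))
            labels.append(k)
    return np.stack(rotated), np.array(labels)

imgs = np.random.rand(2, 32, 32)   # two toy grayscale images
x, y = make_rotation_batch(imgs)
print(x.shape, y.tolist())         # (8, 32, 32) [0, 1, 2, 3, 0, 1, 2, 3]
```

A classifier trained on these free labels learns features that transfer to downstream tasks, which is the point of the pretext task.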

*motivation for variational autoencoders (VAEs), mechanics of VAEs, separability of VAEs, training of VAEs, evaluating representations, vector quantized variational autoencoders (VQ-VAEs)*

Please study the following material in preparation for the class:

- Tutorial on Variational Autoencoders, Carl Doersch.
- An Introduction to Variational Autoencoders, Diederik P. Kingma, Max Welling.

- Generative networks (variational autoencoders and GANs), Pascal Poupart

- [Blog post] Intuitively Understanding Variational Autoencoders, Irhum Shafkat.
- [Blog post] A Beginner's Guide to Variational Methods: Mean-Field Approximation, Eric Jang.
- [Blog post] Tutorial - What is a variational autoencoder?, Jaan Altosaar
- beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner, ICLR 2017.
- Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations, Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem, ICML 2019.
- Generating Diverse High-Fidelity Images with VQ-VAE-2, Ali Razavi, Aaron van den Oord, Oriol Vinyals, NeurIPS 2019.
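
The Doersch and Kingma-Welling tutorials above both center on two pieces of the VAE training objective: the reparameterization trick, which keeps sampling differentiable, and the KL term that pulls the posterior toward the prior. A minimal NumPy sketch (the latent shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) as z = mu + sigma * eps, eps ~ N(0, I),
    so the sample is a differentiable function of mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) per sample, summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
z = reparameterize(mu, log_var)
print(z.shape, kl_to_standard_normal(mu, log_var))  # (4, 2) [0. 0. 0. 0.]
```

With mu = 0 and log_var = 0 the posterior equals the prior, so the KL term is exactly zero, which is a handy sanity check when implementing the ELBO.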

*generative adversarial networks (GANs), conditional GANs, applications of GANs, normalizing flows*

Please study the following material in preparation for the class:

- NIPS 2016 Tutorial: Generative Adversarial Networks, Ian Goodfellow
- Generative Adversarial Networks: An Overview, Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A Bharath
- How to Train a GAN? Tips and tricks to make GANs work, Soumith Chintala, Emily Denton, Martin Arjovsky, Michael Mathieu
- [Blog post] Normalizing Flows Tutorial, Part 1: Distributions and Determinants, Eric Jang
- [Blog post] Normalizing Flows Tutorial, Part 2: Modern Normalizing Flows, Eric Jang
- [Blog post] Flow-based Deep Generative Models, Lilian Weng

- A primer on normalizing flows, Laurent Dinh

- [Blog post] The GAN Zoo, Avinash Hindupur
- [Blog post] GAN Playground, Reiichiro Nakano
- [Blog post] GANs comparison without cherry-picking, Junbum Cha
- [Twitter thread] Thread on how to review papers about generic improvements to GANs, Ian Goodfellow
- Normalizing Flows: An Introduction and Review of Current Methods, Ivan Kobyzev, Simon J.D. Prince, and Marcus A. Brubaker, arXiv preprint, arXiv:1908.09257, 2020.
- Normalizing Flows for Probabilistic Modeling and Inference, George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan, arXiv preprint, arXiv:1912.02762, 2019.
- [Blog post] Glow: Better Reversible Generative Models, OpenAI
- Density estimation using Real NVP, Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio, ICLR 2017.
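
The normalizing-flow readings all rest on the change-of-variables formula: an invertible map y = f(x) contributes log|det df/dx| to the log-density. A sketch with a single elementwise affine map, the scale-and-shift building block that Real NVP composes (the parameter values here are arbitrary):

```python
import numpy as np

def affine_flow(x, log_s, t):
    """Invertible elementwise map y = exp(log_s) * x + t.
    Its log |det dy/dx| is simply sum(log_s)."""
    return np.exp(log_s) * x + t, np.sum(log_s)

def affine_flow_inverse(y, log_s, t):
    """Exact inverse: x = (y - t) * exp(-log_s)."""
    return (y - t) * np.exp(-log_s)

x = np.array([1.0, -2.0])
log_s, t = np.array([0.5, 0.0]), np.array([1.0, 1.0])
y, log_det = affine_flow(x, log_s, t)
assert np.allclose(affine_flow_inverse(y, log_s, t), x)  # invertibility
print(log_det)  # 0.5
```

Stacking many such maps (with coupling, as in Real NVP) gives an expressive density whose log-likelihood stays tractable because the log-determinants just add up.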

*unsupervised representation learning, sparse coding, autoencoders, autoregressive models*

Please study the following material in preparation for the class:

- Chapter #13 of the Deep Learning text book.
- Chapter #14 of the Deep Learning text book.

- Foundations of Unsupervised Deep Learning, Ruslan Salakhutdinov
- Autoregressive Generative Models with Deep Learning, Hugo Larochelle

- Pixel Recurrent Neural Networks, Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, ICML 2016.
- Conditional Image Generation with PixelCNN Decoders, Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu, NIPS 2016.
- Unsupervised Feature Learning and Deep Learning, Andrew Ng.
- [Blog post] Unsupervised Sentiment Neuron, Alec Radford, Ilya Sutskever, Rafal Jozefowicz, Jack Clark and Greg Brockman.
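
PixelRNN and PixelCNN both rely on the autoregressive factorization p(x) = ∏ᵢ p(xᵢ | x₁..xᵢ₋₁), generating one element at a time conditioned on everything generated so far. A toy sketch of that sampling loop for a binary sequence (the conditional distribution here is made up purely for illustration; in the papers it is a deep network):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_autoregressive(n, cond_prob):
    """Sample x_1..x_n one element at a time from
    p(x) = prod_i p(x_i | x_1..x_{i-1})."""
    x = []
    for _ in range(n):
        p = cond_prob(x)              # conditional P(x_i = 1 | history)
        x.append(int(rng.random() < p))
    return x

# toy conditional: the more 1s seen so far, the likelier the next 1
seq = sample_autoregressive(8, lambda h: 0.2 + 0.6 * (sum(h) / (len(h) + 1)))
print(len(seq))  # 8
```

The sequential dependence is what makes sampling slow for these models, while training stays parallel because every conditional can be evaluated at once on observed data.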

*content-based attention, location-based attention, soft vs. hard attention, self-attention, attention for image captioning, transformer networks*

Please study the following material in preparation for the class:

- Attention and Augmented Recurrent Neural Networks, Chris Olah and Shan Carter. Distill, 2016
- [Blog post] The Illustrated Transformer, Jay Alammar

- Recurrent Neural Networks and Language Models, Richard Socher
- Attention and Memory in Deep Learning, Alex Graves
- Attention Is All You Need: Attentional Neural Network Models, Łukasz Kaiser

- Neural Machine Translation by Jointly Learning to Align and Translate, D. Bahdanau, K. Cho, Y. Bengio, ICLR 2015
- Sequence Modeling with CTC, Awni Hannun, Distill, 2017
- Recurrent Models of Visual Attention, V. Mnih, N. Heess, A. Graves, K. Kavukcuoglu, NIPS 2014
- DRAW: a Recurrent Neural Network for Image Generation, K. Gregor, I. Danihelka, A. Graves, DJ Rezende, D. Wierstra, ICML 2015
- Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, NIPS 2017
- [Blog post] What is DRAW (Deep Recurrent Attentive Writer)?, Kevin Frans
- [Blog post] The Transformer Family, Lilian Weng
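
The core operation in Vaswani et al.'s Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. A single-head NumPy sketch (the matrix sizes are arbitrary):

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends to all keys,
    and the outputs are weighted averages of the values."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

Q = np.random.rand(3, 4)   # 3 queries of dimension 4
K = np.random.rand(5, 4)   # 5 keys of dimension 4
V = np.random.rand(5, 2)   # 5 values of dimension 2
out, w = attention(Q, K, V)
print(out.shape, np.allclose(w.sum(axis=-1), 1.0))  # (3, 2) True
```

The 1/√d_k scaling keeps the dot products from saturating the softmax as the key dimension grows, a point made explicitly in the paper.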

*sequence modeling, recurrent neural networks (RNNs), RNN applications, vanilla RNN, training RNNs, long short-term memory (LSTM), LSTM variants, gated recurrent unit (GRU)*

Please study the following material in preparation for the class:

- Chapter #10 of the Deep Learning text book.
- Section 5 of Generating Sequences with Recurrent Neural Networks, A. Graves, arXiv

- Efstratios Gavves and Max Welling's Lecture 8

- [Blog post] Understanding LSTM Networks, Chris Olah.
- [Blog post] The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy.
- Learning Long-Term Dependencies with Gradient Descent is Difficult, Yoshua Bengio, Patrice Simard, and Paolo Frasconi.
- Long Short-Term Memory, Sepp Hochreiter and Jürgen Schmidhuber.
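
The Graves paper and Olah's LSTM post both start from the vanilla recurrence h_t = tanh(W_xh x_t + W_hh h_{t-1} + b), applied with the same weights at every time step. A minimal sketch of unrolling it over a sequence (the dimensions and random weights are arbitrary):

```python
import numpy as np

def rnn_step(x, h, W_xh, W_hh, b):
    """One vanilla-RNN step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    return np.tanh(W_xh @ x + W_hh @ h + b)

rng = np.random.default_rng(0)
d_in, d_h = 3, 5
W_xh = rng.standard_normal((d_h, d_in))
W_hh = rng.standard_normal((d_h, d_h))
b = np.zeros(d_h)

h = np.zeros(d_h)                          # initial hidden state
for x in rng.standard_normal((7, d_in)):   # unroll over a length-7 sequence
    h = rnn_step(x, h, W_xh, W_hh, b)
print(h.shape)  # (5,)
```

Backpropagating through this unrolled chain multiplies by W_hh at every step, which is exactly where the vanishing/exploding-gradient problem analyzed by Bengio et al. comes from, and what LSTM gating mitigates.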

*transfer learning, interpretability, visualizing neuron activations, visualizing class activations, pre-images, adversarial examples, adversarial training*

Please study the following material in preparation for the class:

- Matthew D Zeiler and Rob Fergus, Visualizing and Understanding Convolutional Networks, ECCV 2014.
- Christian Szegedy et al. Intriguing properties of neural networks, arXiv preprint arXiv:1312.6199v4

- Andrej Karpathy's Stanford CS231n Lecture 9

- [Blog post] Understanding Neural Networks Through Deep Visualization, Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson.
- [Blog post] The Building Blocks of Interpretability, Chris Olah, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye and Alexander Mordvintsev.
- [Blog post] Feature Visualization, Chris Olah, Alexander Mordvintsev and Ludwig Schubert.
- [Blog post] An Overview of Early Vision in InceptionV1, Chris Olah, Nick Cammarata, Ludwig Schubert, Gabriel Goh, Michael Petrov, Shan Carter.
- [Blog post] OpenAI Microscope.
- [Blog post] Breaking Linear Classifiers on ImageNet, Andrej Karpathy.
- [Blog post] Attacking machine learning with adversarial examples, OpenAI.
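
Karpathy's post on breaking linear classifiers illustrates the fast-gradient-sign idea from the adversarial-examples literature: perturb the input by ε times the sign of the loss gradient. For a linear score s(x) = w·x the gradient with respect to x is just w, so the attack needs no backprop at all. A sketch (the numbers are made up):

```python
import numpy as np

def fgsm(x, grad, eps):
    """Fast gradient sign method: move every input dimension a step of
    size eps in the direction that increases the loss."""
    return x + eps * np.sign(grad)

w = np.array([0.5, -1.0, 2.0])   # weights of the true class
x = np.array([1.0, 1.0, 1.0])
# increasing the loss means decreasing the true-class score, so the
# relevant gradient direction is -w
x_adv = fgsm(x, grad=-w, eps=0.1)
print(w @ x, w @ x_adv)          # the true-class score drops
```

Each coordinate moves by only ε, so the perturbation can be visually imperceptible while the score change accumulates across all dimensions, which is why high-dimensional inputs are so vulnerable.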

*convolution layer, pooling layer, evolution of depth, design guidelines, residual connections, semantic segmentation networks, object detection networks, backpropagation in CNNs*

Please study the following material in preparation for the class:

- Chapter #9 of the Deep Learning text book.

- Andrej Karpathy's Stanford CS231n Lecture 7
- Justin Johnson's Stanford CS231n Lecture 8
- Kaiming He's tutorial on Deep Residual Networks

- Andrej Karpathy's CS231n notes on Convolutional Networks.
- Hiroshi Kuwajima’s Memo on Backpropagation in Convolutional Neural Networks.
- A guide to convolution arithmetic for deep learning, Vincent Dumoulin and Francesco Visin.
- Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review, Waseem Rawat and Zenghui Wang.
- [Blog post] Understanding Convolutions, Christopher Olah.
- [Blog post] Deconvolution and Checkerboard Artifacts, Augustus Odena, Vincent Dumoulin, Chris Olah.
- [Blog post] Deep Learning for Object Detection: A Comprehensive Review, Joyce Xu.
- [Blog post] A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN, Dhruv Parthasarathy
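
Dumoulin and Visin's convolution-arithmetic guide boils the spatial bookkeeping down to one formula: output size = ⌊(n + 2p − k)/s⌋ + 1 for input size n, kernel k, stride s, and padding p. A quick sketch:

```python
def conv_output_size(n, k, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer:
    floor((n + 2*pad - k) / stride) + 1."""
    return (n + 2 * pad - k) // stride + 1

# AlexNet-style first layer: 227x227 input, 11x11 kernel, stride 4, no padding
print(conv_output_size(227, 11, stride=4))       # 55
# 'same' 3x3 convolution: padding 1, stride 1 preserves the size
print(conv_output_size(32, 3, stride=1, pad=1))  # 32
```

The same formula applied layer by layer also gives each unit's receptive field, which is useful when reading the depth-evolution material above.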

*data preprocessing, weight initialization, normalization, regularization, model ensembles, dropout, optimization methods*

Please study the following material in preparation for the class:

- Chapter #7 and Chapter #8 of the Deep Learning text book.

- Efstratios Gavves' Lecture 3.

- Stochastic Gradient Descent Tricks, Leon Bottou.
- Section 3 of Practical Recommendations for Gradient-Based Training of Deep Architectures, Yoshua Bengio.
- Troubleshooting Deep Neural Networks: A Field Guide to Fixing Your Model, Josh Tobin.
- [Blog post] Initializing neural networks, Katanforoosh & Kunin, deeplearning.ai.
- [Blog post] Parameter optimization in neural networks, Katanforoosh et al., deeplearning.ai.
- [Blog post] The Black Magic of Deep Learning - Tips and Tricks for the practitioner, Nikolas Markou.
- [Blog post] An overview of gradient descent optimization algorithms, Sebastian Ruder.
- [Blog post] Why Momentum Really Works, Gabriel Goh
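
Goh's "Why Momentum Really Works" and Ruder's overview both discuss the classical momentum update v ← βv − η∇f(w), w ← w + v. A sketch on the toy objective f(w) = w², whose gradient is 2w (the hyperparameters and iteration count are chosen arbitrarily):

```python
def sgd_momentum_step(w, v, grad, lr=0.1, beta=0.9):
    """Classical momentum: v <- beta*v - lr*grad; w <- w + v."""
    v = beta * v - lr * grad
    return w + v, v

# minimize f(w) = w^2 starting from w = 1.0
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, v, grad=2 * w)
print(abs(w) < 1e-2)  # True
```

The velocity term accumulates gradients along persistent directions and damps oscillations, which is why momentum tolerates larger effective step sizes than plain gradient descent in ill-conditioned problems.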

*feed-forward neural networks, activation functions, chain rule, backpropagation, computational graph, automatic differentiation, distributed word representations*

Please study the following material in preparation for the class:

- Chapter 6 of the Deep Learning text book.
- Yoav Goldberg's A Primer on Neural Network Models for Natural Language Processing, Sections 3 to 6

- Hugo Larochelle’s video lectures, 1.1 to 1.6, 2.1 to 2.7

- Hinton's Coursera class on Neural Networks, Lecture 1 to 3.
- [Blog post] Neural Networks, Manifolds, and Topology, Christopher Olah.
- [Blog post] Calculus on Computational Graphs: Backpropagation, Christopher Olah.
- Chapter 16 of Jurafsky and Martin's Speech and Language Processing book (3rd Edition draft)
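
Olah's "Calculus on Computational Graphs" post works backpropagation by hand on a small graph; the same chain-rule bookkeeping can be written directly in code. The function (ab + c)² here is an arbitrary example:

```python
# forward pass: y = (a*b + c)^2, built as a chain of simple nodes
a, b, c = 2.0, 3.0, 1.0
u = a * b        # u = 6
v = u + c        # v = 7
y = v ** 2       # y = 49

# backward pass: apply the chain rule node by node, from output to inputs
dy_dv = 2 * v        # d(v^2)/dv
dy_du = dy_dv * 1    # dv/du = 1
dy_da = dy_du * b    # du/da = b
dy_db = dy_du * a    # du/db = a
dy_dc = dy_dv * 1    # dv/dc = 1
print(y, dy_da, dy_db, dy_dc)  # 49.0 42.0 28.0 14.0
```

Reverse-mode automatic differentiation generalizes exactly this pattern: one forward sweep stores intermediate values, one backward sweep multiplies local derivatives along every path.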

*types of machine learning problems, linear models, loss functions, linear regression, gradient descent, overfitting and generalization, regularization, cross-validation, bias-variance tradeoff, maximum likelihood estimation*

Please study the following material in preparation for the class:

- Chapter 5 of the Deep Learning text book.

- Machine Learning, Doina Precup (Deep Learning Summer School, Montreal 2016)

- A few useful things to know about machine learning, P. Domingos. Communications of the ACM, 55 (10), 78-87, 2012.
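
Chapter 5's running themes of linear regression, loss functions, and gradient descent can be combined in a few lines of NumPy. This sketch fits a line by gradient descent on the mean squared error; the data and hyperparameters are made up, and the data is noiseless so the exact line is recoverable:

```python
import numpy as np

rng = np.random.default_rng(0)

# fit y = w*x + b by gradient descent on the mean squared error
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5                       # noiseless data from a known line

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    err = (w * x + b) - y               # residuals
    w -= lr * 2 * np.mean(err * x)      # dMSE/dw
    b -= lr * 2 * np.mean(err)          # dMSE/db
print(round(w, 3), round(b, 3))  # 3.0 0.5
```

With noisy data the recovered parameters would only approximate the generating line, and a regularization term added to the loss would trade a little bias for lower variance, the tradeoff named in the topic list above.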

*course information, what is deep learning, a brief history of deep learning, compositionality, end-to-end learning, distributed representations*

Please study the following material in preparation for the class:

- Chapter 1 of the Deep Learning text book.
- [Blog post] AI Winter. How Canadians contributed to end it?, Pavan Mirla.
- The Bandwagon, Claude E. Shannon. IRE Transactions on Information Theory, Vol. 2, Issue 3, 1956
- Chapter 1, "The Philosophy and the Approach," of David Marr's Vision, 1982.

- The unreasonable effectiveness of deep learning in artificial intelligence, Terrence J. Sejnowski, PNAS, 2020.
- Deep Learning, Yann LeCun, Yoshua Bengio, Geoffrey Hinton. Nature, Vol. 521, 2015.
- Deep Learning in Neural Networks: An Overview, Juergen Schmidhuber. Neural Networks, Vol. 61, pp. 85–117, 2015.
- On the Origin of Deep Learning, Haohan Wang and Bhiksha Raj, arXiv preprint arXiv:1702.07800v4, 2017