Schedule

  • Event
    Date
    Description
  • Lecture
    09/02/2025
    Tuesday
    Lecture 0: Course Overview and Logistics

    Lecture Notes:

  • Lecture
    09/02/2025
    Tuesday
    Lecture 1: Why Deep Learning

    Lecture Notes:

    Further Reads:

  • Lecture
    09/02/2025
    Tuesday
    Lecture 2: Machine Learning vs Analysis

    Lecture Notes:

  • Lecture
    09/02/2025
    Tuesday
    Lecture 3: ML Component 1 - Data

    Lecture Notes:

    Further Reads:

  • Session
    09/02/2025 15:00
    Tuesday
    First Lecture
  • Lecture
    09/05/2025
    Friday
    Lecture 4: Supervised, Unsupervised and Semi-supervised

    Lecture Notes:

  • Lecture
    09/05/2025
    Friday
    Lecture 5: Components 2 and 3: Model and Loss

    Lecture Notes:

    Further Reads:

  • Lecture
    09/05/2025
    Friday
    Lecture 6: First Example -- Classification by Perceptron

    Lecture Notes:

    Further Reads:

    • Binary Classification: Chapter 5 - Sections 5.1 and 5.2 of [BB]
    • McCulloch-Pitts Model: Paper A logical calculus of the ideas immanent in nervous activity published in the Bulletin of Mathematical Biophysics by Warren McCulloch and Walter Pitts in 1943, proposing a computational model for the neuron. This paper is regarded as the pioneering study that led to the idea of the artificial neuron
  • Lecture
    09/05/2025
    Friday
    Lecture 7: Recap -- Law of Large Numbers

    Lecture Notes:

    Further Reads:

  • Lecture
    09/09/2025
    Tuesday
    Lecture 8: Training via Empirical Risk Minimization

    Lecture Notes:

    Further Reads:

    • Overview on Risk Minimization: Paper An overview of statistical learning theory published in the IEEE Transactions on Neural Networks by Vladimir N. Vapnik in 1999, surveying his lifelong developments in ML
  • Lecture
    09/09/2025
    Tuesday
    Lecture 9: Training Perceptron Machine

    Lecture Notes:

    Further Reads:

  • Lecture
    09/09/2025
    Tuesday
    Lecture 10: From Perceptron to NNs -- Universal Approximation

    Lecture Notes:

    Further Reads:

    • Universal Approximation: Paper Approximation by superpositions of a sigmoidal function published in Mathematics of Control, Signals and Systems by George V. Cybenko in 1989
  • Assignment
    09/12/2025
    Friday
    Assignment #1 - Fundamentals of Machine Learning released!
  • Lecture
    09/12/2025
    Friday
    Lecture 11: Deep Neural Networks

    Lecture Notes:

    Further Reads:

    • DNNs: Chapter 6 - Sections 6.2 and 6.3 of [BB]
  • Lecture
    09/12/2025
    Friday
    Lecture 12: Iterative Optimization by Gradient Descent

    Lecture Notes:

    Further Reads:

  • Lecture
    09/16/2025
    Tuesday
    Lecture 13: More on Gradient Descent

    Lecture Notes:

    Further Reads:

  • Lecture
    09/16/2025
    Tuesday
    Lecture 14: Forward Propagation in MLPs

    Lecture Notes:

    Further Reads:

  • Lecture
    09/19/2025
    Friday
    Lecture 15: Training Neural Networks via GD

    Lecture Notes:

    Further Reads:

  • Lecture
    09/19/2025
    Friday
    Lecture 16: Chain Rule on Computation Graph

    Lecture Notes:

    Further Reads:

  • Lecture
    09/19/2025
    Friday
    Lecture 17: Backward Pass on Computation Graph

    Lecture Notes:

    Further Reads:

  • Assignment
    09/21/2025
    Sunday
    Project Proposal released!
  • Lecture
    09/23/2025
    Tuesday
    Lecture 18: Backpropagation over MLP

    Lecture Notes:

    Further Reads:

    • Backpropagation: Chapter 8 of [BB]
    • Backpropagation of Error Paper Learning representations by back-propagating errors published in Nature by D. Rumelhart, G. Hinton and R. Williams in 1986 advocating the idea of systematic gradient computation on a computation graph
  • Lecture
    09/23/2025
    Tuesday
    Lecture 19: First Neural Classifier

    Lecture Notes:

    Further Reads:

  • Lecture
    09/26/2025
    Friday
    Lecture 20: Multiclass Classification

    Lecture Notes:

    Further Reads:

  • Lecture
    09/26/2025
    Friday
    Lecture 21: Stochastic Gradient Descent

    Lecture Notes:

    Further Reads:

  • Due
    09/26/2025 23:59
    Friday
    Assignment #1 due
  • Lecture
    09/30/2025
    Tuesday
    Lecture 22: Mini-batch SGD and Complexity-Variance Tradeoff

    Lecture Notes:

    Further Reads:

  • Lecture
    09/30/2025
    Tuesday
    Lecture 23: Evaluation and Generalization Measures

    Lecture Notes:

    Further Reads:

    • Generalization: Chapter 6 of the Book Patterns, predictions, and actions: A story about machine learning by Moritz Hardt and B. Recht published in 2021
  • Lecture
    09/30/2025
    Tuesday
    Lecture 24: Linear and Sub-linear Convergence Speed

    Lecture Notes:

    Further Reads:

    • Notes on Optimizers Lecture notes of the course Optimization for Machine Learning by Ashok Cutkosky at Boston University: A good resource for optimizers
  • Assignment
    10/01/2025
    Wednesday
    Assignment #2 - Feedforward Neural Networks released!
  • Lecture
    10/03/2025
    Friday
    Lecture 25: Optimizer Boosting -- Scheduling, Momentum and Rprop Ideas

    Lecture Notes:

    Further Reads:

    • Learning Rate Scheduling Paper Cyclical Learning Rates for Training Neural Networks published in Winter Conference on Applications of Computer Vision (WACV) by Leslie N. Smith in 2017 discussing learning rate scheduling
    • Rprop Paper A direct adaptive method for faster backpropagation learning: the RPROP algorithm published in IEEE International Conference on Neural Networks by M. Riedmiller and H. Braun in 1993 proposing the Rprop algorithm
  • Lecture
    10/03/2025
    Friday
    Lecture 26: RMSprop and Adam

    Lecture Notes:

    Further Reads:

    • RMSprop Lecture note by Geoffrey Hinton proposing RMSprop
    • RMSprop Analysis Paper RMSProp and equilibrated adaptive learning rates for non-convex optimization by Y. Dauphin et al. published in 2015 talking about RMSprop and citing Hinton’s lecture notes
    • Adam Paper Adam: A Method for Stochastic Optimization published in 2014 by D. Kingma and J. Ba proposing Adam
  • Lecture
    10/03/2025
    Friday
    Lecture 27: Overfitting

    Lecture Notes:

    Further Reads:

  • Due
    10/06/2025 23:59
    Monday
    Proposal due
  • Lecture
    10/07/2025
    Tuesday
    Lecture 28: Sources of Overfitting

    Lecture Notes:

    Further Reads:

  • Lecture
    10/07/2025
    Tuesday
    Lecture 29: Regularization

    Lecture Notes:

    Further Reads:

    • Overfitting and Regularization: Chapter 9 - Sections 9.1 to 9.3 of [BB]
    • Tikhonov Paper Tikhonov Regularization and Total Least Squares published in 1999 by G. Golub et al. illustrating the Tikhonov Regularization work
    • Lasso Paper Regression Shrinkage and Selection Via the Lasso published in 1996 by R. Tibshirani proposing the legendary Lasso
  • Lecture
    10/07/2025
    Tuesday
    Lecture 30: Dropout

    Lecture Notes:

    Further Reads:

    • Dropout 1 Paper Improving neural networks by preventing co-adaptation of feature detectors published in 2012 by G. Hinton et al. proposing Dropout
    • Dropout 2 Paper Dropout: A Simple Way to Prevent Neural Networks from Overfitting published in 2014 by N. Srivastava et al. providing some analysis and illustrations on Dropout
  • Lecture
    10/10/2025
    Friday
    Lecture 31: Statistical Viewpoint on Data

    Lecture Notes:

    Further Reads:

    • Data: Chapter 8 of the Book Patterns, predictions, and actions: A story about machine learning by Moritz Hardt and B. Recht published in 2021
    • Data Processing in Python Open Book Minimalist Data Wrangling with Python by Marek Gagolewski going through data processing in Python
  • Lecture
    10/10/2025
    Friday
    Lecture 32: Normalization

    Lecture Notes:

    Further Reads:

    • Normalization Paper Is normalization indispensable for training deep neural network? published in 2020 by J. Shao et al. discussing the meaning and effects of normalization
  • Lecture
    10/14/2025
    Tuesday
    Lecture 33: Batch Normalization

    Lecture Notes:

    Further Reads:

    • Batch-Norm Paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift published in 2015 by S. Ioffe and C. Szegedy proposing Batch Normalization
    • Batch-Norm Meaning Paper How Does Batch Normalization Help Optimization? published in 2018 by S. Santurkar et al. discussing why Batch Normalization works: they claim that the main reason is that the loss landscape becomes much smoother
  • Lecture
    10/14/2025
    Tuesday
    Lecture 34: Why Convolution?

    Lecture Notes:

    Further Reads:

    • Hubel and Wiesel Study Paper Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex published in 1962 by D. Hubel and T. Wiesel elaborating their findings on visual understanding
    • Neocognitron Paper Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position published in 1980 by K. Fukushima proposing the Neocognitron as a computational model for visual learning
    • Backpropagating on LeNet Paper Backpropagation Applied to Handwritten Zip Code Recognition published in 1989 by Y. LeCun et al. developing backpropagation for LeNet
    • LeNet Paper Gradient-Based Learning Applied to Document Recognition published in 1998 by Y. LeCun et al. discussing LeNet
  • Due
    10/15/2025 23:59
    Wednesday
    Assignment #2 due
  • Lecture
    10/17/2025
    Friday
    Lecture 35: Quick Preview on CNN

    Lecture Notes:

    Further Reads:

  • Lecture
    10/17/2025
    Friday
    Lecture 36: Convolution Operation and Resampling

    Lecture Notes:

    Further Reads:

  • Lecture
    10/17/2025
    Friday
    Lecture 37: Padding and Multichannel Convolution

    Lecture Notes:

    Further Reads:

  • Lecture
    10/21/2025
    Tuesday
    Lecture 38: Pooling and Flattening

    Lecture Notes:

    Further Reads:

  • Lecture
    10/21/2025
    Tuesday
    Lecture 39: Deep CNNs

    Lecture Notes:

    Further Reads:

  • Lecture
    10/21/2025
    Tuesday
    Lecture 40: Example of VGG-16

    Lecture Notes:

    Further Reads:

    • VGG Paper Very Deep Convolutional Networks for Large-Scale Image Recognition published in 2014 by K. Simonyan and A. Zisserman proposing VGG Architectures
  • Exam
    10/24/2025 11:00
    Friday
    Midterm

    Topics:

    • The exam is 3 hours long
    • No programming questions
    • Starts at 11:00 AM
  • Lecture
    11/04/2025
    Tuesday
    Lecture 41: Backpropagation Through CNNs

    Lecture Notes:

    Further Reads:

    • LeCun’s Paper Paper Gradient-based learning applied to document recognition published in 1998 by Y. LeCun et al. summarizing the learning process in CNNs
    • Efficient Backpropagation on CNN Paper High Performance Convolutional Neural Networks for Document Processing published in 2006 by K. Chellapilla et al. discussing efficient backpropagation on CNNs.
  • Lecture
    11/07/2025
    Friday
    Lecture 42: Vanishing Gradient in Deep Networks

    Lecture Notes:

    Further Reads:

    • ResNet Paper Deep Residual Learning for Image Recognition published in 2015 by K. He et al. proposing ResNet
  • Lecture
    11/07/2025
    Friday
    Lecture 43: Skip Connection and ResNet

    Lecture Notes:

    Further Reads:

    • ResNet Paper Deep Residual Learning for Image Recognition published in 2015 by K. He et al. proposing ResNet
    • ResNet-1001 Paper Identity Mappings in Deep Residual Networks published in 2016 by K. He et al. demonstrating how deep ResNet can go
    • U-Net Paper U-Net: Convolutional Networks for Biomedical Image Segmentation published in 2015 by O. Ronneberger et al. proposing U-Net
    • DenseNet Paper Densely Connected Convolutional Networks published in 2017 by G. Huang et al. proposing DenseNet
  • Lecture
    11/11/2025
    Tuesday
    Lecture 44: Processing Sequence Data

    Lecture Notes:

    Further Reads:

    • Jordan Network Paper Attractor dynamics and parallelism in a connectionist sequential machine published in 1986 by M. Jordan proposing his RNN
    • Elman Network Paper Finding structure in time published in 1990 by J. Elman proposing a revision to Jordan Network
  • Lecture
    11/11/2025
    Tuesday
    Lecture 45: Sequence Processing by Recursion

    Lecture Notes:

    Further Reads:

    • BPTT Paper Backpropagation through time: What it does and how to do it published in 1990 by P. Werbos explaining BPTT
  • Lecture
    11/14/2025
    Friday
    Lecture 46: Different Sequence Problems

    Lecture Notes:

    Further Reads:

    • Seq Models Article The Unreasonable Effectiveness of Recurrent Neural Networks written in May 2015 by A. Karpathy discussing different types of sequence problems
  • Lecture
    11/14/2025
    Friday
    Lecture 47: Backpropagation Through Time

    Lecture Notes:

    Further Reads:

    • Vanishing Gradient with BPTT Paper On the difficulty of training recurrent neural networks published in 2013 by R. Pascanu et al. discussing challenges in training with BPTT
    • Truncated BPTT Paper An efficient gradient-based algorithm for on-line training of recurrent network trajectories published in 1990 by R. Williams and J. Peng explaining truncated BPTT
  • Lecture
    11/14/2025
    Friday
    Lecture 48: Gating Principle

    Lecture Notes:

    Further Reads:

    • Gating Principle Chapter Long Short-Term Memory published in 2012 in the book Supervised Sequence Labelling with Recurrent Neural Networks by A. Graves explaining the gating idea

Tutorial Schedule