Schedule
Course Calendar by Week
| Week # | Date | Notes | Posted | Deadline |
| 1 | Jan 05 - Jan 09 | | | |
| 2 | Jan 12 - Jan 16 | | Assignment 1: Basics | |
| 3 | Jan 19 - Jan 23 | | | |
| 4 | Jan 26 - Jan 30 | | Assignment 2: FNNs | Assignment 1: Basics |
| 5 | Feb 02 - Feb 06 | | | Project: Proposal |
| 6 | Feb 09 - Feb 13 | | | Assignment 2: FNNs |
| 7 | Feb 16 - Feb 20 | Reading Week - No Lectures | | |
| 8 | Feb 23 - Feb 27 | Midterm Exam on Feb 26 | Assignment 3: CNNs | |
| 9 | Mar 02 - Mar 06 | | | Assignment 3: CNNs |
| 10 | Mar 09 - Mar 13 | | Assignment 4: Sequence Models | Project: Progress Briefing |
| 11 | Mar 16 - Mar 20 | | | |
| 12 | Mar 23 - Mar 27 | | | Assignment 4: Sequence Models |
| 13 | Mar 30 - Apr 03 | | | |
| 14 | Apr 06 - Apr 10 | Examination Time - No Lectures | Project: Presentation | Project: Final Report and Source Code |
Deliverables with Deadlines
| Item | Date Posted | Deadline |
| Assignment 1 | Jan 15, 2026 | Jan 29, 2026 |
| Assignment 2 | Jan 29, 2026 | Feb 12, 2026 |
| Proposal | | Feb 06, 2026 |
| Assignment 3 | Feb 12, 2026 | Mar 05, 2026 |
| Midterm Exam | | Feb 26, 2026 |
| Project Briefing | | Mar 12, 2026 |
| Assignment 4 | Mar 12, 2026 | Mar 26, 2026 |
| Project Presentation | | Apr 07, 2026 |
| Report Submission | | Apr 10, 2026 |
Tutorial Schedule
| Week # | Date | Topic |
| 1 | Jan 08, 2026 | First Week - No Tutorial |
| 2 | Jan 15, 2026 | Basics of Python, e.g., NumPy, scikit-learn, Matplotlib, with Key Implementation Tricks |
| 3 | Jan 22, 2026 | Autograd in PyTorch and Its Implementation |
| 4 | Jan 29, 2026 | MLP Implementation |
| 5 | Feb 05, 2026 | Regularization, Dropout, and Batch Normalization |
| 6 | Feb 12, 2026 | Midterm Review |
| 7 | Feb 19, 2026 | Reading Week - No Tutorial |
| 8 | Feb 26, 2026 | CNN Implementation |
| 9 | Mar 05, 2026 | Skip Connection and ResNet |
| 10 | Mar 12, 2026 | RNNs and Gating Architectures, i.e., GRU and LSTM |
| 11 | Mar 19, 2026 | Attention and Transformer |
| 12 | Mar 26, 2026 | Autoencoding and Variational Autoencoders |
| 13 | Apr 02, 2026 | Last Week - No Tutorial (Reserved for Makeup) |
Detailed Calendar by Session
-
Event | Date | Description
-
Session | 01/06/2026 13:00 (Tuesday) | First Lecture
-
Lecture | 01/06/2026 (Tuesday) | Lecture 0: Course Overview and Logistics
Lecture Notes:
-
Lecture | 01/06/2026 (Tuesday) | Lecture 1: Introduction and DL Components
Lecture Notes:
Further Reads:
- Motivation: Chapter 1 - Section 1.1 of [BB]
- Review on Linear Algebra: Chapter 2 of [GYC]
- ML Components: Chapter 1 - Sections 1.2.1 to 1.2.4 of [BB]
-
Lecture | 01/08/2026 (Thursday) | Lecture 2: Classification via Perceptron
Lecture Notes:
Further Reads:
- Binary Classification: Chapter 5 - Sections 5.1 and 5.2 of [BB]
- McCulloch-Pitts Model: Paper A logical calculus of the ideas immanent in nervous activity published in the Bulletin of Mathematical Biophysics by Warren McCulloch and Walter Pitts in 1943, proposing a computational model of the neuron. This paper is regarded as the pioneering study that led to the idea of the artificial neuron
-
Lecture | 01/08/2026 (Thursday) | Lecture 3: Training via Empirical Risk Minimization
Lecture Notes:
Further Reads:
- Overview on Risk Minimization: Paper An overview of statistical learning theory published in the IEEE Transactions on Neural Networks by Vladimir N. Vapnik in 1999, surveying his lifelong developments in ML
-
Lecture | 01/13/2026 (Tuesday) | Lecture 4: Multiple Layers of Perceptrons
Lecture Notes:
Further Reads:
- Perceptron Simulation Experiments: Paper Perceptron Simulation Experiments presented by Frank Rosenblatt in the Proceedings of the IRE in 1960
- Perceptron: Chapter 1 - Section 1.2.1 of [Ag]
-
Lecture | 01/13/2026 (Tuesday) | Lecture 5: Universal Approximation Theorem and Deep NNs
Lecture Notes:
Further Reads:
- Universal Approximation: Paper Approximation by superpositions of a sigmoidal function published in Mathematics of Control, Signals and Systems by George V. Cybenko in 1989
- DNNs: Chapter 6 - Sections 6.2 and 6.3 of [BB]
-
Assignment | 01/15/2026 (Thursday) | Assignment #1 - Fundamentals of Computational Learning released!
-
Lecture | 01/20/2026 (Tuesday) | Lecture 6: Iterative Optimization by Gradient Descent
Lecture Notes:
Further Reads:
- Gradient-based Optimization: Chapter 4 - Sections 4.3 and 4.4 of [GYC]
- Gradient Descent: Chapter 7 - Sections 7.1 and 7.2 of [BB]
-
Lecture | 01/20/2026 (Tuesday) | Lecture 7: More on Gradient Descent
-
Lecture | 01/22/2026 (Thursday) | Lecture 8: Forward Propagation in MLPs
Lecture Notes:
Further Reads:
-
Lecture | 01/22/2026 (Thursday) | Lecture 9: Computing Gradient on Graph
-
Lecture | 01/27/2026 (Tuesday) | Lecture 10: Backpropagation
Lecture Notes:
Further Reads:
- Backpropagation: Chapter 6 - Section 6.5 of [GYC]
- Backpropagation: Chapter 8 of [BB]
-
Lecture | 01/27/2026 (Tuesday) | Lecture 11: Backpropagation over MLP
Lecture Notes:
Further Reads:
- Backpropagation: Chapter 8 of [BB]
- Backpropagation of Error: Paper Learning representations by back-propagating errors published in Nature by D. Rumelhart, G. Hinton and R. Williams in 1986, advocating systematic gradient computation on a computation graph
-
Assignment | 01/29/2026 (Thursday) | Assignment 2: MLPs released!
-
Lecture | 01/29/2026 (Thursday) | Lecture 12: Neural Classifier
-
Lecture | 01/29/2026 (Thursday) | Lecture 13: Multiclass Classification
-
Due | 01/29/2026 23:59 (Thursday) | Assignment #1 Due
-
Assignment | 01/30/2026 (Friday) | Project Proposal released!
-
Lecture | 02/03/2026 (Tuesday) | Lecture 14: Stochastic Gradient Descent and Learning Curves
-
Lecture | 02/03/2026 (Tuesday) | Lecture 15: Linear and Sub-linear Convergence Speed
Lecture Notes:
Further Reads:
- Notes on Optimizers: Lecture notes of the course Optimization for Machine Learning by Ashok Cutkosky at Boston University, a good resource on optimizers
-
Lecture | 02/05/2026 (Thursday) | Lecture 16: Practical Optimizers
Lecture Notes:
Further Reads:
- Learning Rate Scheduling: Paper Cyclical Learning Rates for Training Neural Networks published in the Winter Conference on Applications of Computer Vision (WACV) by Leslie N. Smith in 2017, discussing learning rate scheduling
- Rprop: Paper A direct adaptive method for faster backpropagation learning: the RPROP algorithm published in the IEEE International Conference on Neural Networks by M. Riedmiller and H. Braun in 1993, proposing the Rprop algorithm
-
Lecture | 02/05/2026 (Thursday) | Lecture 17: Overfitting and Regularization
-
Due | 02/06/2026 23:59 (Friday) | Proposal Due
-
Lecture | 02/10/2026 (Tuesday) | Lecture 18: Dropout and Data Augmentation
Lecture Notes:
Further Reads:
- Dropout 1: Paper Improving neural networks by preventing co-adaptation of feature detectors published in 2012 by G. Hinton et al., proposing Dropout
- Dropout 2: Paper Dropout: A Simple Way to Prevent Neural Networks from Overfitting published in 2014 by N. Srivastava et al., providing analysis and illustrations of Dropout
- Data: Chapter 8 of the Book Patterns, predictions, and actions: A story about machine learning by Moritz Hardt and Benjamin Recht published in 2021
- Data Processing in Python: Open Book Minimalist Data Wrangling with Python by Marek Gagolewski, covering data processing in Python
-
Lecture | 02/10/2026 (Tuesday) | Lecture 19: Normalization
Lecture Notes:
Further Reads:
- Data Processing in Python: Open Book Minimalist Data Wrangling with Python by Marek Gagolewski, covering data processing in Python
-
Lecture | 02/12/2026 (Thursday) | Lecture 20: Batch Normalization
Lecture Notes:
Further Reads:
- Batch-Norm: Paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift published in 2015 by S. Ioffe and C. Szegedy, proposing Batch Normalization
- Batch-Norm Meaning: Paper How Does Batch Normalization Help Optimization? published in 2018 by S. Santurkar et al., discussing why Batch Normalization works; they claim the main reason is that the loss landscape becomes much smoother
-
Lecture | 02/12/2026 (Thursday) | Lecture 21: Convolutional Layers
Lecture Notes:
Further Reads:
- Hubel and Wiesel Study: Paper Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex published in 1962 by D. Hubel and T. Wiesel, elaborating their findings on visual understanding
- Neocognitron: Paper Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position published in 1980 by K. Fukushima, proposing the Neocognitron as a computational model for visual learning
- Backpropagating on LeNet: Paper Backpropagation Applied to Handwritten Zip Code Recognition published in 1989 by Y. LeCun et al., developing backpropagation for LeNet
- LeNet: Paper Gradient-Based Learning Applied to Document Recognition published in 1998 by Y. LeCun et al., discussing LeNet
- Convolution: Chapter 9 - Sections 9.1 and 9.2 of [GYC]
-
Due | 02/12/2026 23:59 (Thursday) | Assignment #2 Due
-
Lecture | 02/24/2026 (Tuesday) | Lecture 22: Multi-channel Convolution and Pooling
Lecture Notes:
Further Reads:
- Convolution: Chapter 9 - Sections 9.1 and 9.2 of [GYC]
- Multi-channel Convolution: Chapter 10 - Sections 10.2.3 to 10.2.5 of [BB]
- Pooling: Chapter 10 - Section 10.2.6 of [BB]
- Flattening: Chapter 10 - Sections 10.2.7 and 10.2.8 of [BB]
-
Lecture | 02/24/2026 (Tuesday) | Lecture 23: Deep CNNs
Lecture Notes:
Further Reads:
- Convolution: Chapter 9 - Sections 9.4 and 9.6 of [GYC]
- VGG: Paper Very Deep Convolutional Networks for Large-Scale Image Recognition published in 2014 by K. Simonyan and A. Zisserman, proposing the VGG architectures
-
Exam | 02/26/2026 13:00 (Thursday) | Midterm
Topics:
- The exam is 3 hours long
- No programming questions
- Starts at 1:00 PM
-
Lecture | 03/03/2026 (Tuesday) | Lecture 24: Backpropagation Through CNNs
Lecture Notes:
Further Reads:
- LeCun’s Paper: Paper Gradient-based learning applied to document recognition published in 1998 by Y. LeCun et al., summarizing the learning process in CNNs
- Efficient Backpropagation on CNN: Paper High Performance Convolutional Neural Networks for Document Processing published in 2006 by K. Chellapilla et al., discussing efficient backpropagation on CNNs
-
Lecture | 03/05/2026 (Thursday) | Lecture 25: Vanishing Gradient in Deep Networks
Lecture Notes:
Further Reads:
- ResNet: Paper Deep Residual Learning for Image Recognition published in 2015 by K. He et al., proposing ResNet
-
Lecture | 03/05/2026 (Thursday) | Lecture 26: Skip Connection and ResNet
Lecture Notes:
Further Reads:
- ResNet: Paper Deep Residual Learning for Image Recognition published in 2015 by K. He et al., proposing ResNet
- ResNet-1001: Paper Identity Mappings in Deep Residual Networks published in 2016 by K. He et al., demonstrating how deep ResNet can go
- U-Net: Paper U-Net: Convolutional Networks for Biomedical Image Segmentation published in 2015 by O. Ronneberger et al., proposing U-Net
- DenseNet: Paper Densely Connected Convolutional Networks published in 2017 by G. Huang et al., proposing DenseNet
-
Lecture | 03/10/2026 (Tuesday) | Lecture 27: RNNs
Lecture Notes:
Further Reads:
- Jordan Network: Paper Attractor dynamics and parallelism in a connectionist sequential machine published in 1986 by M. Jordan, proposing his RNN
- Elman Network: Paper Finding structure in time published in 1990 by J. Elman, proposing a revision to the Jordan Network
-
Lecture | 03/10/2026 (Tuesday) | Lecture 28: Learning through Time
Lecture Notes:
Further Reads:
- BPTT: Paper Backpropagation through time: What it does and how to do it published in 1990 by P. Werbos, explaining BPTT
- Seq Models: Article The Unreasonable Effectiveness of Recurrent Neural Networks written in May 2015 by A. Karpathy, discussing different types of sequence problems
-
Lecture | 03/12/2026 (Thursday) | Lecture 29: Training RNNs
Lecture Notes:
Further Reads:
- Vanishing Gradient with BPTT: Paper On the difficulty of training recurrent neural networks published in 2013 by R. Pascanu et al., discussing challenges in training with BPTT
- Truncated BPTT: Paper An efficient gradient-based algorithm for on-line training of recurrent network trajectories published in 1990 by R. Williams and J. Peng, explaining truncated BPTT
-
Lecture | 03/12/2026 (Thursday) | Lecture 30: Gated Architectures
Lecture Notes:
Further Reads:
- Gating Principle: Chapter Long Short-Term Memory published in 2012 in the book Supervised Sequence Labelling with Recurrent Neural Networks by A. Graves, explaining the gating idea
- LSTM: Paper Long short-term memory published in 1997 by S. Hochreiter and J. Schmidhuber, proposing LSTM
- GRU: Paper On the Properties of Neural Machine Translation: Encoder-Decoder Approaches published in 2014 by K. Cho et al., proposing GRU
-
Lecture | 03/17/2026 (Tuesday) | Lecture 31: Correspondence Problem and CTC
Lecture Notes:
Details:
- CTC Algorithm: a recorded lecture on CTC
Further Reads:
- CTC: Paper Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks published in 2006 by A. Graves et al., proposing the CTC algorithm
-
Lecture | 03/17/2026 (Tuesday) | Lecture 32: Seq2Seq - Part I: Language Model
Lecture Notes:
Further Reads:
