Schedule
Event | Date | Description

Session | Tuesday 05/06/2025, 22:00 | First Lecture

Lecture | Tuesday 05/06/2025 | Lecture 0: Course Overview and Logistics

Lecture | Tuesday 05/06/2025 | Lecture 1: Tokenization and Embedding
Lecture Notes:
- Chapter 1, Section 1, pp. 1–18
Further Reads:
- Tokenization: Chapter 2 of [JM]
- Embedding: Chapter 6 of [JM]
- Original BPE Algorithm: The original byte pair encoding compression algorithm proposed by Philip Gage in 1994 (a sketch of the merge loop follows this list)
- BPE for Tokenization: Paper Neural Machine Translation of Rare Words with Subword Units by Rico Sennrich, Barry Haddow, and Alexandra Birch presented at ACL 2016 that adapted BPE for NLP tokenization
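For a concrete picture of BPE before the readings, here is a minimal sketch of the greedy merge loop in Python. It is an illustration under simplifying assumptions: the function `bpe_merges`, the character-level starting point, and the toy string are ours, and practical tokenizers (including the Sennrich et al. version) operate on word-frequency tables rather than raw text.

```python
# A minimal sketch of the BPE merge loop (illustration only).
from collections import Counter

def bpe_merges(text: str, num_merges: int):
    """Greedily merge the most frequent adjacent symbol pair."""
    tokens = list(text)  # start from individual characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append((a, b))
        # replace every occurrence of the pair with the merged symbol
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = bpe_merges("low lower lowest", num_merges=5)
print(tokens, merges)
```

Applying the recorded merges in the same order tokenizes new text consistently with the training corpus.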

Lecture | Thursday 05/08/2025 | Lecture 2: Language Distribution and Bi-Gram Model
Lecture Notes:
- Chapter 1, Section 1, pp. 18–32
Further Reads:
- LMs: Chapter 12 of [BB] Section 12.2
- N-Gram LMs: Chapter 3 of Speech and Language Processing [JM], Section 3.1 on N-gram LMs (a sketch of a bi-gram model follows this list)
- Maximum Likelihood: Chapter 2 of [BB] Sections 2.1–2.3
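As a companion to the N-gram reading, here is a minimal sketch of a maximum-likelihood bi-gram model. The toy corpus and the helper `p` are illustrative assumptions; real models add smoothing so unseen pairs do not get zero probability.

```python
# A minimal sketch of a maximum-likelihood bi-gram model.
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p(next_word: str, prev_word: str) -> float:
    """MLE estimate P(next | prev) = count(prev, next) / count(prev)."""
    return bigrams[(prev_word, next_word)] / unigrams[prev_word]

print(p("cat", "the"))  # 2/3: "the" is followed by "cat" in 2 of its 3 uses
```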

Lecture | Thursday 05/08/2025 | Lecture 3: Recurrent LMs
Lecture Notes:
- Chapter 1, Section 1, pp. 32–42
Further Reads:
- Recurrent LMs: Chapter 8 of [JM]
- LSTM LMs: Paper Regularizing and Optimizing LSTM Language Models by Stephen Merity, Nitish Shirish Keskar, and Richard Socher presented at ICLR 2018 that enabled LSTMs to perform strongly on word-level language modeling
- High-Rank Recurrent LMs: Paper Breaking the Softmax Bottleneck: A High-Rank RNN Language Model by Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, and William W. Cohen presented at ICLR 2018 proposing Mixture of Softmaxes (MoS) and achieving state-of-the-art results at the time

Lecture | Tuesday 05/13/2025 | Lecture 4: Context Extraction via Self-Attention
Further Reads:
- Transformer Paper: Paper Attention Is All You Need published in 2017 that marked a turning point in sequence processing (a sketch of scaled dot-product attention follows this list)
- Transformers: Chapter 9 of [JM]
- Transformers: Chapter 12 of [BB] Section 12.1
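Since these readings center on self-attention, a compact sketch of single-head, unmasked scaled dot-product attention may help. The PyTorch tensor shapes and weight names below are illustrative, not the lecture's notation.

```python
# A minimal sketch of scaled dot-product self-attention.
import math
import torch

def self_attention(x: torch.Tensor, Wq, Wk, Wv) -> torch.Tensor:
    """x: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / math.sqrt(K.shape[-1])  # (seq_len, seq_len)
    weights = torch.softmax(scores, dim=-1)    # each row sums to 1
    return weights @ V                         # context-mixed values

d_model, d_k, seq_len = 8, 4, 5
x = torch.randn(seq_len, d_model)
Wq, Wk, Wv = (torch.randn(d_model, d_k) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)  # torch.Size([5, 4])
```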

Lecture | Tuesday 05/13/2025 | Lecture 5: Transformer LM
Further Reads:
- Transformer LMs: Chapter 12 of [BB] Section 12.3
- LLMs via Transformers: Chapter 10 of [JM]

Lecture | Thursday 05/15/2025 | Lecture 6: LLM Examples
Further Reads:
- GPT-1: Paper Improving Language Understanding by Generative Pre-Training by Alec Radford et al. (OpenAI, 2018) that introduced GPT-1 and revived the idea of pretraining transformers as LMs followed by supervised fine-tuning
- GPT-2: Paper Language Models are Unsupervised Multitask Learners by Alec Radford et al. (OpenAI, 2019) that introduced GPT-2, a 1.5B-parameter model trained on web text
- GPT-3: Paper Language Models are Few-Shot Learners by Tom B. Brown et al. (OpenAI, 2020) that introduced GPT-3, a 175B-parameter transformer LM
- GPT-4: The GPT-4 Technical Report by OpenAI (2023) that provides an overview of GPT-4’s capabilities
- The Pile: Paper The Pile: An 800GB Dataset of Diverse Text for Language Modeling by Leo Gao et al. published in 2020 introducing The Pile dataset
- Documentation Debt: Paper Addressing “Documentation Debt” in Machine Learning Research: A Retrospective Datasheet for BookCorpus by Jack Bandy and Nicholas Vincent published in 2021 discussing data documentation and the legality of data collection by examining BookCorpus

Lecture | Thursday 05/15/2025 | Lecture 7: Pre-training vs Fine-tuning
Further Reads:
- SSL: Paper Semi-supervised Sequence Learning by Andrew M. Dai et al. published in 2015 that explores unsupervised pretraining followed by supervised fine-tuning; an early influential work advocating the pre-training idea for LMs
- GPT-1: Paper Improving Language Understanding by Generative Pre-Training by Alec Radford et al. (OpenAI, 2018) that introduced GPT-1 and revived the idea of pretraining transformers as LMs followed by supervised fine-tuning

Lecture | Thursday 05/15/2025 | Lecture 8: Statistical View and LoRA

Assignment | Tuesday 05/20/2025 | Assignment #1 - Language Modeling released!

Lecture | Tuesday 05/20/2025 | Lecture 9: Prompt Design
Further Reads:
- Chain-of-Thought: Paper Chain-of-Thought Prompting Elicits Reasoning in Large Language Models by Jason Wei et al. presented at NeurIPS in 2022 introducing chain-of-thought prompting
- Prefix-Tuning: Paper Prefix-Tuning: Optimizing Continuous Prompts for Generation by Xiang Lisa Li et al. presented at ACL in 2021 proposing prefix-tuning approach for prompting
- Prompt-Tuning: Paper The Power of Scale for Parameter-Efficient Prompt Tuning by B. Lester et al. presented at EMNLP in 2021 proposing the prompt tuning idea, i.e., learning to prompt
- Zero-Shot LLMs: Paper Large Language Models are Zero-Shot Reasoners by T. Kojima et al. presented at NeurIPS in 2022 studying zero-shot learning with LLMs

Lecture | Tuesday 05/20/2025 | Lecture 10: Data Generation Problem - Basic Definitions
Further Reads:
- Probabilistic Model: Chapter 2 of [BB] Sections 2.4 to 2.6
- Statistics: Chapter 3 of [M] Sections 3.1 to 3.3

Lecture | Thursday 05/22/2025 | Lecture 11: Discriminative vs Generative Learning

Lecture | Thursday 05/22/2025 | Lecture 12: Naive Bayes - Most Basic Generative Model
Further Reads:
- Naive Bayes: Paper Idiot’s Bayes—Not So Stupid After All? by D. Hand and K. Yu published in International Statistical Review in 2001 discussing the efficiency of Naive Bayes for classification
- Naive Bayes vs Logistic Regression: Paper On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes by A. Ng and M. Jordan presented at NeurIPS in 2001 elaborating the data efficiency of Naive Bayes and the asymptotic superiority of Logistic Regression
- Generative Models – Overview: Chapter 20 of [M] Sections 20.1 to 20.3

Lecture | Tuesday 05/27/2025 | Lecture 13: Explicit Distribution Learning - Sampling
Further Reads:
- Sampling Overview: Chapter 14 of [BB]
- Sampling: The book Pattern Recognition and Machine Learning by Christopher Bishop; read Chapter 11 to see how challenging sampling from a distribution can be (a sketch of rejection sampling follows this list)
- Sampling Methods: Chapter 17 of [GYC] Sections 17.1 and 17.2
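As a taste of why sampling is its own topic, here is a minimal sketch of rejection sampling, one of the classical methods covered in the chapters above. The bimodal target and the envelope constant `M` are illustrative assumptions: with a uniform proposal, accepting x with probability p(x)/M (where M bounds the unnormalized density) yields samples distributed according to p.

```python
# A minimal sketch of rejection sampling with a uniform proposal.
import math
import random

def target_pdf(x: float) -> float:
    """Unnormalized target: a bimodal mixture of two Gaussians."""
    return math.exp(-0.5 * (x - 2) ** 2) + math.exp(-0.5 * (x + 2) ** 2)

def rejection_sample(n: int, lo=-6.0, hi=6.0, M=2.0):
    """Propose uniformly on [lo, hi], accept with prob target(x) / M."""
    samples = []
    while len(samples) < n:
        x = random.uniform(lo, hi)
        if random.random() < target_pdf(x) / M:
            samples.append(x)
    return samples

print(sum(rejection_sample(10_000)) / 10_000)  # roughly 0 by symmetry
```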

Lecture | Tuesday 05/27/2025 | Lecture 14: Maximum Likelihood Learning
Further Reads:
- KL Divergence and MLE: Chapter 5 of [M] Sections 5.1 to 5.2
- MLE: Chapter 5 of [GYC] Section 5.5
- Maximum Likelihood Learning: The book Information Theory, Inference, and Learning Algorithms by David MacKay, which discusses MLE for clustering in Chapter 22

Lecture | Tuesday 05/27/2025 | Lecture 15: Autoregressive Modeling

Lecture | Thursday 05/29/2025 | Lecture 16: Computational AR Models

Lecture | Thursday 05/29/2025 | Lecture 17: PixelRNN
Further Reads:
- PixelRNN and PixelCNN: Paper Pixel Recurrent Neural Networks by A. Oord et al. presented at ICML in 2016 proposing PixelRNN and PixelCNN

Lecture | Tuesday 06/03/2025 | Lecture 18: Masked AR Models - PixelCNN and ImageGPT
Further Reads:
- PixelRNN and PixelCNN: Paper Pixel Recurrent Neural Networks by A. Oord et al. presented at ICML in 2016 proposing PixelRNN and PixelCNN (a sketch of the masked convolution follows this list)
- ImageGPT: Paper Generative Pretraining from Pixels by M. Chen et al. presented at ICML in 2020 proposing ImageGPT
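To make the masking idea concrete, here is a minimal sketch of the masked convolution underlying PixelCNN-style models; the class name and layer sizes are illustrative, not the paper's architecture. A type-A mask hides the current pixel and everything after it in raster order, so each output depends only on previously generated pixels.

```python
# A minimal sketch of a PixelCNN-style masked convolution.
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    def __init__(self, mask_type: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        _, _, h, w = self.weight.shape
        mask = torch.ones(h, w)
        # zero out the center pixel (type "A") or just what follows it (type "B")
        mask[h // 2, w // 2 + (mask_type == "B"):] = 0
        mask[h // 2 + 1:, :] = 0  # zero out all rows below the center
        self.register_buffer("mask", mask)

    def forward(self, x):
        return nn.functional.conv2d(
            x, self.weight * self.mask, self.bias, self.stride, self.padding
        )

conv = MaskedConv2d("A", in_channels=1, out_channels=8, kernel_size=5, padding=2)
print(conv(torch.zeros(1, 1, 28, 28)).shape)  # torch.Size([1, 8, 28, 28])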

Lecture | Tuesday 06/03/2025 | Lecture 19: Energy Based Models - Boltzmann Distribution
Further Reads:
- EBMs: Chapter 24 of [M]
- Partition Function and Normalizing: Chapter 16 of [GYC] Section 16.2
- Universality of EBMs: Paper Representational Power of Restricted Boltzmann Machines and Deep Belief Networks by N. Le Roux and Y. Bengio published in Neural Computation in 2008 elaborating the representational power of EBMs
- Tutorial on EBMs: Survey A Tutorial on Energy-Based Learning by Y. LeCun et al. published in 2006

Lecture | Thursday 06/05/2025 | Lecture 20: Computational EBMs - Training and Sampling
Further Reads:
- EBMs: Chapter 24 of [M]
- Partition Function and Normalizing: Chapter 16 of [GYC] Section 16.2
- Tutorial on EBMs: Survey A Tutorial on Energy-Based Learning by Y. LeCun et al. published in 2006

Lecture | Thursday 06/05/2025 | Lecture 21: MCMC Algorithms - Gibbs Sampling
Further Reads:
- MCMC Algorithms: Chapter 12 of [M] Sections 12.3, 12.6 and 12.7
- Gibbs Sampling and Langevin: Chapter 14 of [BB]
- Anatomy of MCMC: Paper On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models published by E. Nijkamp et al. at AAAI 2020 looking at the stability of training with MCMC algorithms (a sketch of Gibbs sampling follows this list)
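For intuition, here is a minimal sketch of Gibbs sampling on a bivariate Gaussian with correlation rho, where both conditionals are known 1-D Gaussians; the target and parameters are illustrative assumptions.

```python
# A minimal sketch of Gibbs sampling for a correlated bivariate Gaussian.
import math
import random

def gibbs_bivariate_gaussian(n_steps: int, rho: float = 0.8):
    """Alternately sample x | y and y | x, each a 1-D Gaussian."""
    x, y = 0.0, 0.0
    cond_std = math.sqrt(1.0 - rho ** 2)
    chain = []
    for _ in range(n_steps):
        x = random.gauss(rho * y, cond_std)  # x | y ~ N(rho*y, 1 - rho^2)
        y = random.gauss(rho * x, cond_std)  # y | x ~ N(rho*x, 1 - rho^2)
        chain.append((x, y))
    return chain

chain = gibbs_bivariate_gaussian(10_000)
xs = [x for x, _ in chain[1000:]]  # drop burn-in samples
print(sum(xs) / len(xs))           # roughly 0, the true marginal mean
```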

Due | Thursday 06/05/2025, 23:59 | Assignment #1 due

Lecture | Tuesday 06/10/2025 | Lecture 22: MCMC - Langevin and Contrastive Divergence
Further Reads:
- Gibbs Sampling and Langevin: Chapter 14 of [BB]
- Contrastive Divergence: Paper Training Products of Experts by Minimizing Contrastive Divergence by G. Hinton published in Neural Computation in 2002 proposing the idea of Contrastive Divergence
- Training by MCMC: Paper Implicit Generation and Generalization in Energy-Based Models published by Y. Du and I. Mordatch at NeurIPS 2019 discussing the efficiency of MCMC algorithms for EBM training (a sketch of a Langevin update follows this list)
- Improved CD: Paper Improved Contrastive Divergence Training of Energy-Based Models published by Y. Du et al. at ICML 2021 proposing an efficient training scheme based on Hinton’s CD idea
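The Langevin update itself is short enough to sketch. The quadratic energy, step size, and step count below are illustrative assumptions; EBM training pairs this kind of sampler with a learned energy rather than a fixed one.

```python
# A minimal sketch of unadjusted Langevin dynamics for p(x) ∝ exp(-E(x)).
import math
import random

def energy_grad(x: float) -> float:
    """Gradient of E(x) = 0.5 * (x - 1)^2, i.e. a Gaussian centered at 1."""
    return x - 1.0

def langevin_sample(n_steps: int, step: float = 0.01) -> float:
    """x_{t+1} = x_t - step * dE/dx + sqrt(2 * step) * noise."""
    x = 0.0
    for _ in range(n_steps):
        x += -step * energy_grad(x) + math.sqrt(2 * step) * random.gauss(0, 1)
    return x

samples = [langevin_sample(2000) for _ in range(500)]
print(sum(samples) / len(samples))  # roughly 1, the minimizer of E
```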

Lecture | Tuesday 06/10/2025 | Lecture 23: Latent Space

Lecture | Tuesday 06/10/2025 | Lecture 24: Normalizing Flow

Lecture | Thursday 06/12/2025 | Lecture 25: Learning Flow
Further Reads:
- Flow-based Models: Chapter 23 of [M]
- Tutorial on Normalizing Flows: Paper Normalizing Flows for Probabilistic Modeling and Inference published by G. Papamakarios et al. in JMLR in 2021 discussing training and inference in flow-based models

Lecture | Thursday 06/12/2025 | Lecture 26: NICE, RealNVP and Glow
Further Reads:
- NICE: Paper NICE: Non-linear Independent Components Estimation published by L. Dinh et al. at ICLR in 2015 proposing the NICE model
- Real NVP: Paper Density Estimation Using Real NVP published by L. Dinh et al. at ICLR in 2017 proposing the Real NVP model (a sketch of an affine coupling layer follows this list)
- Glow: Paper Glow: Generative Flow with Invertible 1x1 Convolutions published by D. Kingma and P. Dhariwal at NeurIPS in 2018 proposing the Glow model
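To see why these models are invertible by construction, here is a minimal sketch of a RealNVP-style affine coupling layer; the small MLPs and dimensions are illustrative, not the paper's architecture. The log-determinant needed for the change-of-variables likelihood is just the sum of the predicted log-scales.

```python
# A minimal sketch of a RealNVP-style affine coupling layer.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Split x into (x1, x2); transform x2 conditioned on x1."""
    def __init__(self, dim: int):
        super().__init__()
        half = dim // 2
        self.scale = nn.Sequential(nn.Linear(half, 64), nn.Tanh(), nn.Linear(64, half))
        self.shift = nn.Sequential(nn.Linear(half, 64), nn.ReLU(), nn.Linear(64, half))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.scale(x1), self.shift(x1)
        y2 = x2 * torch.exp(s) + t  # invertible given x1
        log_det = s.sum(dim=-1)     # log |det Jacobian| of the transform
        return torch.cat([x1, y2], dim=-1), log_det

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        s, t = self.scale(y1), self.shift(y1)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)

layer = AffineCoupling(dim=4)
x = torch.randn(3, 4)
y, log_det = layer(x)
print(torch.allclose(layer.inverse(y), x, atol=1e-5))  # True: exact inverse
```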

Lecture | Thursday 06/12/2025 | Lecture 27: Introduction to GAN
Further Reads:
- Tutorial on GANs: Tutorial Generative Adversarial Networks given by I. Goodfellow at NeurIPS in 2016

Assignment | Tuesday 06/17/2025 | Assignment #2 - Explicit Methods for Generation released!

Lecture | Tuesday 06/17/2025 | Lecture 28: Vanilla GAN
Further Reads:
- GANs: Paper Generative Adversarial Nets published by I. Goodfellow et al. at NeurIPS in 2014 proposing GANs (a training-loop sketch follows this list)
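Here is a minimal training-loop sketch of the vanilla GAN objective on 1-D toy data; the networks, optimizers, data distribution, and step count are illustrative assumptions, and the generator loss uses the common non-saturating variant rather than the minimax form.

```python
# A minimal sketch of vanilla GAN training on 1-D toy data.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # noise -> sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(64, 1) * 0.5 + 2.0  # toy "data": N(2, 0.25)
    fake = G(torch.randn(64, 2))

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step (non-saturating): push D(fake) toward 1
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(fake.mean().item())  # drifts toward 2.0 as G matches the data mean
```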

Lecture | Tuesday 06/17/2025 | Lecture 29: Implicit MLE via GAN
Further Reads:
- GANs: Paper Generative Adversarial Nets published by I. Goodfellow et al. at NeurIPS in 2014 proposing GANs
- Tutorial on GANs: Tutorial Generative Adversarial Networks given by I. Goodfellow at NeurIPS in 2016

Lecture | Thursday 06/19/2025 | Lecture 30: Wasserstein Distance
Further Reads:
- W-GANs: Paper Wasserstein GAN published by M. Arjovsky et al. at ICML in 2017 proposing Wasserstein GANs
- Tutorial on GANs: Tutorial Generative Adversarial Networks given by I. Goodfellow at NeurIPS in 2016

Lecture | Thursday 06/19/2025 | Lecture 31: Wasserstein GAN
Further Reads:
- W-GANs: Paper Wasserstein GAN published by M. Arjovsky et al. at ICML in 2017 proposing Wasserstein GANs

Lecture | Thursday 06/19/2025 | Lecture 32: GAN Samples
Further Reads:
- DCGAN: Paper Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks published by A. Radford et al. at ICLR in 2016 proposing DCGAN
- StyleGAN: Paper A Style-Based Generator Architecture for Generative Adversarial Networks published by T. Karras et al. at IEEE/CVF CVPR in 2019 proposing StyleGAN
- BigGAN: Paper Large Scale GAN Training for High Fidelity Natural Image Synthesis published by A. Brock et al. at ICLR in 2019 proposing BigGAN
- SAGAN: Paper Self-Attention Generative Adversarial Networks published by H. Zhang et al. at ICML in 2019 proposing Self-Attention GAN

Exam | Tuesday 06/24/2025, 18:00 | Midterm
Topics:
- The exam covers Chapters 1 to 3
- The exam is 3 hours long
- No programming questions
- Starts at 6:00 PM in EX-320

Lecture | Thursday 07/03/2025 | Lecture 33: Probabilistic Latent-Space Generation
Further Reads:
- Probabilistic Latent: Chapter 16 of [BB] Sections 16.1 and 16.2
- Mixture Models: Paper On the Number of Components in a Gaussian Mixture Model published by G. McLachlan and S. Rathnayake in 2014 reviewing some key properties of Gaussian mixtures and their approximation power

Lecture | Thursday 07/03/2025 | Lecture 34: Variational Inference
Further Reads:
- ELBO: Chapter 16 of [BB] Section 16.3
- VI for Likelihood: The early paper Computing Upper and Lower Bounds on Likelihoods in Intractable Networks published by T. Jaakkola and M. Jordan at UAI in 1996
- Tutorials on VI: Review paper Variational Inference: A Review for Statisticians published by D. Blei, A. Kucukelbir, and J. McAuliffe in 2016 giving a good overview of the VI framework (a sketch of a one-sample ELBO estimate follows this list)
- Introduction to VI: Book An Introduction to Variational Autoencoders written by D. Kingma and M. Welling and published by NOW in 2019
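To connect the ELBO to code, here is a minimal sketch of a single-sample ELBO estimate with a Gaussian encoder and a standard-normal prior. The decoder, the shapes, and the squared-error reconstruction term (a Gaussian log-likelihood up to constants) are illustrative assumptions.

```python
# A minimal sketch of a one-sample ELBO estimate for a Gaussian q(z|x).
import torch

def elbo(x, mu, log_var, decoder):
    """ELBO = E_q[log p(x|z)] - KL(q(z|x) || N(0, I))."""
    # reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
    z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
    recon = decoder(z)
    log_px_z = -((x - recon) ** 2).sum(dim=-1)  # Gaussian log-lik up to consts
    kl = 0.5 * (torch.exp(log_var) + mu ** 2 - 1.0 - log_var).sum(dim=-1)
    return (log_px_z - kl).mean()

decoder = torch.nn.Linear(2, 4)
x, mu, log_var = torch.randn(8, 4), torch.zeros(8, 2), torch.zeros(8, 2)
print(elbo(x, mu, log_var, decoder))  # scalar to be maximized during training
```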

Assignment | Saturday 07/05/2025 | Project Briefing released!

Due | Monday 07/07/2025, 23:59 | Assignment #2 due

Due | Wednesday 07/16/2025, 23:59 | Project Briefing due
Tutorial Schedule
Session | Topics | Tutor |
Tutorial 1 | PyTorch Overview -- Tokenization and Embedding | A. Mobasheri |
Tutorial 2 | Transformers and Large Language Models | A. Mobasheri |
Tutorial 3 | Auto-regressive Models | M. Safavi |
Tutorial 4 | Energy-based Models | A. Mobasheri |
Tutorial 5 | Generative Adversarial Networks -- Exam Overview | M. Safavi |
Reading Week & Exam | No Lecture | N/A |
Tutorial 6 | Variational Inference and VAEs | A. Mobasheri |
Tutorial 7 | Diffusion Models I | M. Safavi |
Tutorial 8 | Sample Project Demo | A. Mobasheri |
Tutorial 9 | Diffusion Models II | M. Safavi |
Tutorial 10 | Advances and Practical Considerations | M. Safavi |