| Week | Date | Topic | Readings | Presenter |
| --- | --- | --- | --- | --- |
| W1 | 8/24 | Course Overview | | Instructor |
| | 8/26 | NLP Basics | Machine Learning Basics, Word Representations | Instructor |
| W2 | 8/31 | NLP Basics | Language Modeling, Convolutional Neural Network | Instructor |
| | 9/2 | NLP Basics | Recurrent Neural Network, Attention | Instructor |
| W3 | 9/7 | Labor Day (No Class) | | |
| | 9/9 | NLP Basics | Transformers, Pre-Training | Instructor |
| | 9/10 | LaTeX Assignment Due | | |
| W4 | 9/14 | NLP Basics | Large Language Models, Text Similarity | Instructor |
| | 9/16 | NLP Basics | Retrieval-Augmented Generation, Vision-Language Models | Instructor |
| W5 | 9/21 | Alignment, Post-Training | Training language models to follow instructions with human feedback, NeurIPS 2022; Direct Preference Optimization: Your Language Model is Secretly a Reward Model, NeurIPS 2023; SimPO: Simple Preference Optimization with a Reference-Free Reward, NeurIPS 2024 | Instructor |
| | 9/23 | Test-Time Scaling, Reasoning Models | s1: Simple test-time scaling, EMNLP 2025; DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning, arXiv 2025; Understanding R1-Zero-Like Training: A Critical Perspective, COLM 2025 | Instructor |
| W6 | 9/28 | Model Efficiency | LoRA: Low-Rank Adaptation of Large Language Models, ICLR 2022; Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity, JMLR 2022; QLoRA: Efficient Finetuning of Quantized LLMs, NeurIPS 2023; CaM: Cache Merging for Memory-efficient LLMs Inference, ICML 2024 | |
| | 9/30 | Bias Detection and Mitigation | Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, NeurIPS 2016; Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, EMNLP 2017; BLIND: Bias Removal With No Demographics, ACL 2023; On Second Thought, Let’s Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning, ACL 2023 | |
| | 10/1 | Literature Review Due | | |
| W7 | 10/5 | Invited Talk (Remote) | | |
| | 10/7 | Kuan-Hao is traveling (No Class) | | |
| | 10/8 | Project Proposal Due | | |
| W8 | 10/12 | Project Highlight Presentation (Remote) | | |
| | 10/14 | Adversarial Attacks and Jailbreaking | Universal Adversarial Triggers for Attacking and Analyzing NLP, EMNLP 2019; BERT-ATTACK: Adversarial Attack Against BERT Using BERT, EMNLP 2020; Towards Robustness Against Natural Language Word Substitutions, ICLR 2021; Universal and Transferable Adversarial Attacks on Aligned Language Models, arXiv 2023 [Present] | |
| W9 | 10/19 | AI-Generated Text Detection | Defending Against Neural Fake News, NeurIPS 2019; DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature, ICML 2023; Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature, ICLR 2024; A Watermark for Large Language Models, ICML 2023 [Present] | |
| | 10/21 | Hallucinations | SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models, EMNLP 2023; How Language Model Hallucinations Can Snowball, ICML 2024; Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps, EMNLP 2024; Chain-of-Verification Reduces Hallucination in Large Language Models, ACL-Findings 2024 [Present] | |
| W10 | 10/26 | Model Interpretability, Model Steering | Scaling and evaluating sparse autoencoders, ICLR 2025; Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation, ICLR 2025; Steering Llama 2 via Contrastive Activation Addition, ACL 2024; Steering Vector Fields for Context-Aware Inference-Time Control in Large Language Models, arXiv 2026 [Present] | |
| | 10/28 | Model Editing, Model Unlearning | Locating and Editing Factual Associations in GPT, NeurIPS 2022; AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025; Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning, COLM 2024; LLM Unlearning via Neural Activation Redirection, NeurIPS 2025 [Present] | |
| W11 | 11/2 | Vision-Language Alignment | When and why vision-language models behave like bags-of-words, and what to do about it?, ICLR 2023; What's "up" with vision-language models? Investigating their struggle with spatial reasoning, EMNLP 2023; Hidden in plain sight: VLMs overlook their visual representations, COLM 2025; Seeing but Not Believing: Probing the Disconnect Between Visual Attention and Answer Correctness in VLMs, ICLR 2026 [Present] | |
| | 11/4 | Multimodal Reasoning | V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs, CVPR 2024; Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG, ICML 2025; CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning, ICLR 2025; Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space, arXiv 2025 [Present] | |
| | 11/5 | Midterm Report Due | | |
| W12 | 11/9 | Agents, Model Memory | Toolformer: Language Models Can Teach Themselves to Use Tools, NeurIPS 2023; ReAct: Synergizing Reasoning and Acting in Language Models, ICLR 2023; MemoryBank: Enhancing Large Language Models with Long-Term Memory, AAAI 2024; A-MEM: Agentic Memory for LLM Agents, NeurIPS 2025 [Present] | |
| | 11/11 | Long-Context Understanding | Lost in the Middle: How Language Models Use Long Contexts, TACL 2023; Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization, ACL-Findings 2024; LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models, NAACL 2024; Efficient Streaming Language Models with Attention Sinks, ICLR 2024 [Present] | |
| W13 | 11/16 | Multilingual Models | Cross-Lingual Ability of Multilingual BERT: An Empirical Study, ICLR 2020; Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models, ACL 2024; Blessing of Multilinguality: A Systematic Analysis of Multilingual In-Context Learning, ACL-Findings 2025; How do Large Language Models Handle Multilingualism?, NeurIPS 2024 [Present] | |
| | 11/18 | Non-Autoregressive Generation | Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads, ICML 2024; Syntactically Supervised Transformers for Faster Neural Machine Translation, ACL 2019; Large Language Diffusion Models, NeurIPS 2025; Insertion Transformer: Flexible Sequence Generation via Insertion Operations, ICML 2019 [Present] | |
| W14 | 11/23 | Invited Talk (Remote) | | |
| | 11/25 | Reading Day (No Class) | | |
| W15 | 11/30 | Final Presentations (Remote) | | |
| | 12/2 | Final Presentations (Remote) | | |
| | 12/6 | Final Report Due | | |