| Week | Date | Topic | Details | Note |
| --- | --- | --- | --- | --- |
| W1 | 8/19 | Course Overview [slides] | | |
| | 8/21 | Natural Language Processing Basics [slides] | Common NLP Tasks, Training Pipelines, Word Representations | |
| | 8/23 | Natural Language Processing Basics [slides] | Word Representations, Tokenization | |
| W2 | 8/26 | Natural Language Processing Basics [slides] | Tokenization, Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory | |
| | 8/28 | Natural Language Processing Basics [slides] | Long Short-Term Memory, Attention, Transformers | |
| | 8/30 | Natural Language Processing Basics [slides] | Transformers, Contextualized Representations, Pre-Training | |
| W3 | 9/2 | Labor Day (No Class) | | |
| | 9/4 | Natural Language Processing Basics [slides] | Pre-Training, Language Models | |
| | 9/6 | Natural Language Processing Basics [slides] | Large Language Models, Prompting, In-Context Learning, Instruction Tuning | |
| W4 | 9/9 | Adversarial Attacks and Defenses [slides] | [Instructor] Generating Natural Language Adversarial Examples, EMNLP 2018<br>[Instructor] BERT-ATTACK: Adversarial Attack Against BERT Using BERT, EMNLP 2020<br>[Instructor] Universal Adversarial Triggers for Attacking and Analyzing NLP, EMNLP 2019 | Summary Due |
| | 9/11 | Adversarial Attacks and Defenses [slides] | [Instructor] Certified Robustness to Adversarial Word Substitutions, EMNLP 2019<br>[Instructor] Towards Robustness Against Natural Language Word Substitutions, ICLR 2021<br>[Instructor] Universal and Transferable Adversarial Attacks on Aligned Language Models, arXiv 2023 | |
| | 9/13 | Adversarial Attacks and Defenses | [Student] Adversarial Example Generation with Syntactically Controlled Paraphrase Networks, NAACL 2018<br>[Student] Jailbreaking Black Box Large Language Models in Twenty Queries, arXiv 2023 | |
| W5 | 9/16 | Backdoor Attacks and Data Poisoning [slides] | [Instructor] Weight Poisoning Attacks on Pre-trained Models, ACL 2020<br>[Instructor] Concealed Data Poisoning Attacks on NLP Models, NAACL 2021<br>[Instructor] Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer, EMNLP 2021 | Summary Due |
| | 9/18 | Backdoor Attacks and Data Poisoning [slides] | [Instructor] Poisoning Language Models During Instruction Tuning, ICML 2023<br>[Instructor] Rethinking Stealthiness of Backdoor Attack against NLP Models, EMNLP 2021<br>[Instructor] ONION: A Simple and Effective Defense Against Textual Backdoor Attacks, EMNLP 2021 | |
| | 9/20 | Backdoor Attacks and Data Poisoning | [Student] Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder, EMNLP-Findings 2020<br>[Student] RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models, EMNLP 2021 | |
| W6 | 9/23 | AI-Generated Text Detection [slides] | [Instructor] Defending Against Neural Fake News, NeurIPS 2019<br>[Instructor] DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature, ICML 2023<br>[Instructor] Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature, ICLR 2024 | Summary Due |
| | 9/25 | AI-Generated Text Detection [slides] | [Instructor] A Watermark for Large Language Models, ICML 2023<br>[Instructor] SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation, NAACL 2024 | |
| | 9/27 | AI-Generated Text Detection | [Student] RADAR: Robust AI-Text Detection via Adversarial Learning, NeurIPS 2023<br>[Student] Paraphrasing Evades Detectors of AI-Generated Text, But Retrieval is an Effective Defense, NeurIPS 2023 | |
| W7 | 9/30 | Model Uncertainty [slides] | [Instructor] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016<br>[Instructor] Calibration of Pre-trained Transformers, EMNLP 2020<br>[Instructor] Uncertainty Estimation in Autoregressive Structured Prediction, ICLR 2021 | Summary Due |
| | 10/2 | Model Uncertainty [slides] | [Instructor] Teaching Models to Express Their Uncertainty in Words, TMLR 2022<br>[Instructor] Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback, EMNLP 2023<br>[Instructor] R-Tuning: Instructing Large Language Models to Say "I Don't Know", NAACL 2024 | |
| | 10/4 | Model Uncertainty | [Student] Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation, ICLR 2023<br>[Student] Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling, ICML 2024 | |
| W8 | 10/7 | Fall Break (No Class) | | |
| | 10/9 | Invited Talk (Remote) | Machine Unlearning: the general theory and LLM practice for privacy (Speaker: Eli Chien) | |
| | 10/11 | Team Project Highlights | | |
| W9 | 10/14 | Model Explainability and Interpretability [slides] | [Instructor] Rationalizing Neural Predictions, EMNLP 2016<br>[Instructor] “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, KDD 2016<br>[Instructor] Towards Explainable NLP: A Generative Explanation Framework for Text Classification, ACL 2019 | Summary Due |
| | 10/16 | Model Explainability and Interpretability [slides] | [Instructor] A Unified Approach to Interpreting Model Predictions, NeurIPS 2017<br>[Instructor] Understanding Black-box Predictions via Influence Functions, ICML 2017<br>[Instructor] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022 | |
| | 10/18 | Model Explainability and Interpretability | [Student] Reframing Human-AI Collaboration for Generating Free-Text Explanations, NAACL 2022<br>[Student] Self-Consistency Improves Chain of Thought Reasoning in Language Models, ICLR 2023 | |
| W10 | 10/21 | Bias Detection and Mitigation [slides] | [Instructor] Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, NeurIPS 2016<br>[Instructor] The Woman Worked as a Babysitter: On Biases in Language Generation, EMNLP 2019<br>[Instructor] On Second Thought, Let’s Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning, ACL 2023 | Summary Due |
| | 10/23 | Bias Detection and Mitigation [slides] | [Instructor] Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, EMNLP 2017<br>[Instructor] Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal, ACL 2022<br>[Instructor] BLIND: Bias Removal With No Demographics, ACL 2023 | |
| | 10/25 | Bias Detection and Mitigation | [Student] Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection, ACL 2020<br>[Student] From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models, ACL 2023 | |
| W11 | 10/28 | Human Preference Alignment [slides] | [Instructor] Fine-Tuning Language Models from Human Preferences, arXiv 2019<br>[Instructor] Training Language Models to Follow Instructions with Human Feedback, NeurIPS 2022 | Summary Due |
| | 10/30 | Human Preference Alignment [slides] | [Instructor] Direct Preference Optimization: Your Language Model is Secretly a Reward Model, NeurIPS 2023<br>[Instructor] mDPO: Conditional Preference Optimization for Multimodal Large Language Models, arXiv 2024<br>[Instructor] KTO: Model Alignment as Prospect Theoretic Optimization, ICML 2024 | |
| | 11/1 | Human Preference Alignment | [Student] SimPO: Simple Preference Optimization with a Reference-Free Reward, arXiv 2024<br>[Student] Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models, ICML 2024 | |
| W12 | 11/4 | Hallucinations and Misinformation Control [slides] | [Instructor] Do Language Models Know When They're Hallucinating References?, EACL 2024<br>[Instructor] How Language Model Hallucinations Can Snowball, ICML 2024<br>[Instructor] SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models, EMNLP 2023 | Summary Due |
| | 11/6 | Hallucinations and Misinformation Control [slides] | [Instructor] Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024<br>[Instructor] FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation, EMNLP 2023 | |
| | 11/8 | Hallucinations and Misinformation Control | [Student] SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency, EMNLP 2023<br>[Student] Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension, ICML 2024 | |
| W13 | 11/11 | Robustness of Multimodal Models (Remote) [slides] | [Instructor] Learning Transferable Visual Models From Natural Language Supervision, ICML 2021<br>[Instructor] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation, ICML 2022<br>[Instructor] Visual Instruction Tuning, NeurIPS 2023 | Summary Due |
| | 11/13 | Robustness of Multimodal Models (Remote) [slides] | [Instructor] When and why vision-language models behave like bags-of-words, and what to do about it?, ICLR 2023<br>[Instructor] Text encoders bottleneck compositionality in contrastive vision-language models, EMNLP 2023<br>[Instructor] Paxion: Patching Action Knowledge in Video-Language Foundation Models, NeurIPS 2023 | |
| | 11/15 | Robustness of Multimodal Models | [Student] Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models, ICML 2024<br>[Student] On the Robustness of Large Multimodal Models Against Image Adversarial Attacks, CVPR 2024 | |
| W14 | 11/18 | Robustness of Multimodal Models | [Student] CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning, ICCV 2023<br>[Student] Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs, CVPR 2024 | |
| | 11/20 | Project Presentations | | |
| | 11/22 | Project Presentations | | |
| W15 | 11/25 | Project Presentations | | |
| | 11/27 | Reading Day (No Class) | | |
| | 11/29 | Thanksgiving (No Class) | | |