CSCE 638 - Natural Language Processing: Foundation and Techniques (Spring 2026)
Course Information
Lectures
- Time: Monday/Wednesday 4:10pm – 5:25pm
- Location: HRBB 124
Instructor
- Kuan-Hao Huang
- Email: khhuang [at] tamu [dot] edu
- Office: PETR 219
- Office Hour: Wednesday 2pm – 3pm
Teaching Assistant
- Rusali Saha
- Email: rs0921 [at] tamu [dot] edu
- Office: PETR 330
- Office Hour: Tuesday 11am – 12pm
Question Handling
- Please send emails to csce638-ta-26s [at] list [dot] tamu [dot] edu
Readings (Optional)
- Speech and Language Processing (3rd ed. draft), Dan Jurafsky and James H. Martin.
Grading
Grade
- Assignments (31%)
  - Assignment 0 (1%) [1/29]
  - Assignment 1 (10%) [2/10]
  - Assignment 2 (10%) [3/3]
  - Assignment 3 (10%) [3/24]
- Quizzes (40%)
  - Quiz 1 (10%) [2/4]
  - Quiz 2 (10%) [2/25]
  - Quiz 3 (10%) [3/18]
  - Quiz 4 (10%) [4/13]
- Course Project (26%)
  - Project Proposal (3%) [3/6]
  - Project Presentation (10%) [4/20, 4/22, 4/27]
  - Project Report (13%) [4/30]
- Participation (3%)
Late Policy
- Assignments 1-3, Project Proposal, Project Report
  - 1 day late: 10% penalty
  - 2 days late: 20% penalty
  - 3 days late: 30% penalty
  - 4 days late: 50% penalty
  - 5 or more days late: 100% penalty
- Assignment 0, Project Presentation
  - No late submissions allowed
Schedule
| Week | Date | Lecture | Topic | Readings | Note |
|---|---|---|---|---|---|
| W1 | 1/12 | L1 | Course Overview [slides] | | |
| | 1/14 | L2 | Machine Learning Basics, Text Classification [slides] | Logistic Regression Neural Networks | |
| W2 | 1/19 | | Martin Luther King, Jr. Day (No Class) | | |
| | 1/21 | L3 | Word Representations [slides] | Word2Vec GloVe fastText | Assignment 0 |
| W3 | 1/26 | | Class Cancelled Due to Weather | | |
| | 1/28 | L4 | Tokenization, Language Modeling, Decoding Methods [slides] | Byte-Pair Encoding N-Gram LM Smoothing LM Decoding Neural Language Models | Assignment 1 |
| | 1/29 | | Assignment 0 Due | | |
| W4 | 2/2 | L5 | Convolutional Neural Network, Recurrent Neural Network [slides] | TextCNN LSTM | |
| | 2/4 | L6 | Sequential Labeling, Sequence-to-Sequence, Attention | Sequence-to-Sequence Attention-Based RNN | Quiz 1 (L1-L5) |
| W5 | 2/9 | L7 | Transformers | Attention Is All You Need The Annotated Transformer The Illustrated Transformer Positional Encoding | |
| | 2/10 | | Assignment 1 Due | | |
| | 2/11 | L8 | Transformers | Longformer Relative Positional Encoding RoFormer | |
| W6 | 2/16 | L9 | Contextualized Representations, Pre-Training | ELMo BERT RoBERTa BART T5 | |
| | 2/18 | L10 | Pre-Training, Large Language Models | GPT-3 In-Context Learning Chain-of-Thought | |
| W7 | 2/23 | L11 | Parameter-Efficient Fine-Tuning, Model Distillation | Prompt Tuning Prefix Tuning Adapter MoE LoRA Distilling Neural Networks | |
| | 2/25 | L12 | Evaluation | MMLU Humanity's Last Exam LLM-as-a-Judge | Quiz 2 (L6-L10) |
| W8 | 3/2 | L13 | Instruction Tuning, Alignment, Post-Training | Flan-T5 RLHF/PPO DPO | |
| | 3/3 | | Assignment 2 Due | | |
| | 3/4 | L14 | Large Reasoning Models, Test-Time Scaling, Decoding | DeepSeek-R1 Test-Time Scaling | |
| | 3/6 | | Project Proposal Due | | |
| W9 | 3/9 | | Spring Break (No Class) | | |
| | 3/11 | | Spring Break (No Class) | | |
| W10 | 3/16 | L15 | Text Similarity, Retrieval-Augmented Generation | Sentence-BERT SimCSE DPR RAG | |
| | 3/18 | L16 | Tool-Augmented Language Models, Agents | ToolLLM AutoGen ReAct | Quiz 3 (L11-L15) |
| W11 | 3/23 | L17 | Multilingual Language Models | NLLB XLM-R XTREME Multilingual LLMs Thinking | |
| | 3/24 | | Assignment 3 Due | | |
| | 3/25 | L18 | Vision-Language Models | CNN-RNN VisualBERT ViT CLIP BLIP-2 LLaVA | |
| W12 | 3/30 | L19 | Adversarial Attack and Defense | Word Replacement Attack Paraphrase Attack Jailbreaking LLMs Data Poisoning Attack | |
| | 4/1 | L20 | AI-Generated Text Detection | Grover DetectGPT Fast-DetectGPT Watermarking | |
| W13 | 4/6 | | Invited Talk (Remote). Title: TBD. Speaker: Jindong Wang, Assistant Professor, The College of William & Mary | | |
| | 4/8 | L21 | Non-Autoregressive Generation | Medusa NAT SynST Insertion Transformer LLaDA | |
| W14 | 4/13 | L22 | Bias Mitigation, Hallucinations | Bias in Word Embeddings WinoBias Geo-Bias Hallucination Snowball Context-Aware Decoding | Quiz 4 (L16-L20) |
| | 4/15 | | Invited Talk (Remote). Title: TBD. Speaker: Ben Zhou, Assistant Professor, Arizona State University | | |
| W15 | 4/20 | | Project Presentations (Remote) | | |
| | 4/22 | | Project Presentations (Remote) | | |
| W16 | 4/27 | | Project Presentations (Remote) | | |
| | 4/29 | | Reading Day (No Class) | | |
| | 4/30 | | Project Report Due | | |