d-noe/NLP_DH_PSL_Fall2025

Introduction to Natural Language Processing (NLP) — DH PSL, Fall 2025

This repository hosts material for four 3-hour lectures given as part of the Introduction to Natural Language Processing (NLP) class in PSL's Master of Digital Humanities, Fall 2025.

  1. Week 1 (29/10)
  2. Week 2 (05/11)
  3. Week 3 (12/11)
  4. Week 4 (19/11)

The code and notebooks for the tutorials and hands-on sessions are provided in the code folder. The data used in these sessions is described and stored in the data folder.

Week 1 (29/10): Modeling Language: Towards Contextualized Representations

To go further

Want more hands-on practice? Check the To go further section in the code folder.

Week 2 (05/11): Discovering Structure: Semantic Spaces & Unsupervised Modeling

To go further

Dimensionality Reduction:

  • (Coenen & Pearce, 2019): Understanding UMAP: an explanation and visual demonstration of UMAP (compared with t-SNE).

Topic Modeling:

  • (Churchill & Singh, 2021): The Evolution of Topic Modeling.
  • (Li et al., 2024): Applying Topic Modeling to Literary Analysis: A Review.
  • (Gillings & Hardie, 2022): The interpretation of topic models for scholarly analysis: An evaluation and critique of current practice.
  • (Antoniak, 2023): Topic Modeling for the People: an interesting blog post by Maria Antoniak that shares a set of steps you can follow to get coherent topics from most datasets, focusing primarily on LDA. It also provides many additional references for digging deeper.
  • (Egger & Yu, 2022): A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts.
  • Evaluation Concerns:

Want more hands-on practice? Check the To go further section in the code folder.

Week 3 (12/11): Learning Patterns: Supervised Tasks and Adaptation

To go further

Text Classification for DH

  • (Bamman et al., 2024): On Classification with Large Language Models in Cultural Analytics.
  • (Lassen et al., 2024): Literary Canonicity and Algorithmic Fairness: The Effect of Author Gender on Classification Models.

Fairness & Bias

Interpretability

Want more hands-on practice? Check the To go further section in the code folder.

Week 4 (19/11): Large Language Models: Foundations, Generation & Beyond

To go further

LLMs:

  • (Cho et al., 2024): An interactive visualisation, with explanations, of the inner workings of causal language models. A very nice visualisation and summary of transformer-based LMs! 👀 If you like this kind of visualisation, also check the LLM Visualization by Brendan Bycroft.
  • (Zhao et al., 2023): A Survey of Large Language Models. A comprehensive review of recent advances in LLMs: background, key findings, mainstream techniques, etc.

LLMs, Biases and Fairness

Want more hands-on practice? Check the To go further section in the code folder.

About

Repository for an Introduction to Modern NLP Methods: four sessions of 3(+) hours each, including lectures, tutorials, and hands-on exercises.
