Jenny Kunz


Interpretable and explainable NLP

Many current Natural Language Processing (NLP) models are large neural networks that are pre-trained on huge amounts of unlabeled data. How these models store, combine and use information from this self-supervised training is still largely obscure. I develop techniques that probe how linguistic information is structured within the model, and what the limitations of current models are.
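A common instance of this approach is the probing classifier: a small supervised model trained on frozen hidden states to test whether some linguistic property is linearly recoverable from them. The sketch below is a minimal, self-contained illustration; the "hidden states" are synthetic vectors with an injected linear signal standing in for real model activations, and all names and hyperparameters are illustrative assumptions.

```python
# Minimal probing-classifier sketch. The states are synthetic stand-ins
# for frozen activations from a pre-trained model: random vectors plus a
# linear signal encoding a binary linguistic label.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 64

labels = rng.integers(0, 2, size=n)
direction = rng.normal(size=d)
states = rng.normal(size=(n, d)) + 1.5 * labels[:, None] * direction

def train_probe(X, y, lr=0.1, steps=500):
    """Logistic-regression probe trained with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

X_train, y_train = states[:1500], labels[:1500]
X_test, y_test = states[1500:], labels[1500:]
w, b = train_probe(X_train, y_train)
acc = np.mean(((X_test @ w + b) > 0) == y_test)
print(f"probe accuracy: {acc:.2f}")
```

If the probe's test accuracy clearly exceeds the 50% chance baseline, the property is (linearly) decodable from the representations; interpreting such results rigorously, e.g. against control tasks, is exactly where the methodological care comes in.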

Another research interest of mine is self-rationalizing models that generate free-text explanations along with their predictions. While textual explanations are flexible and easy to understand, they come with challenges such as a speculative relation to the prediction and the inheritance of possibly undesirable properties of human explanations, such as the ability to convincingly justify wrong predictions. I work on the evaluation and control of such explanations, and on the relation between explanation design and utility.
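To make the setup concrete: a self-rationalizing model emits its prediction and its explanation in one piece of generated text, which evaluation pipelines then have to separate. The toy parser below illustrates this; the "label because explanation" format and the function name are illustrative assumptions, not a fixed standard, and real systems vary in how tightly the two parts are coupled.

```python
# Toy illustration of the self-rationalisation output format: one string
# carrying both a prediction and a free-text explanation. The
# "<label> because <explanation>" pattern is an assumed example format.
import re

def split_rationalised_output(text):
    """Split '<label> because <explanation>' into (prediction, explanation)."""
    match = re.match(r"^(.*?)\s+because\s+(.*)$", text.strip(), flags=re.IGNORECASE)
    if match is None:
        return text.strip(), None  # model produced no explanation
    return match.group(1), match.group(2)

pred, expl = split_rationalised_output(
    "entailment because the second sentence restates the first"
)
print(pred)  # entailment
print(expl)  # the second sentence restates the first
```

The speculative relation mentioned above is visible even here: nothing in the format guarantees that the explanation actually reflects the computation behind the prediction, which is why explanation evaluation needs more than surface plausibility.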

PhD thesis

CV in brief

  • Bachelor’s degree in Computer Science from Humboldt University of Berlin (2016).
  • Master’s degree in Language Technology from Uppsala University (2018).
  • PhD student at LiU (2019–today).
  • Best paper award for our paper “Human Ratings Do Not Reflect Downstream Utility: A Study of Free-Text Explanations for Model Predictions” at BlackboxNLP 2022.
  • Teaching Assistant: Text Mining, Language Technology, Natural Language Processing, Language and Computers, Neural Networks and Deep Learning.

Publications

Jenny Kunz (2024). Understanding Large Language Models: Towards Rigorous and Targeted Interpretability Using Probing Classifiers and Self-Rationalisation.

Jenny Kunz, Oskar Holmström (2024). The Impact of Language Adapters in Cross-Lingual Transfer for NLU.

Marc Braun, Jenny Kunz (2024). A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models.

Oskar Holmström, Jenny Kunz, Marco Kuhlmann (2023). Bridging the Resource Gap: Exploring the Efficacy of English and Multilingual LLMs for Swedish. Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), pp. 92–110.

Jenny Kunz, Martin Jirénius, Oskar Holmström, Marco Kuhlmann (2022). Human Ratings Do Not Reflect Downstream Utility: A Study of Free-Text Explanations for Model Predictions. Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pp. 164–177, 2022.blackboxnlp-1.14.
