Interpretable and explainable NLP

Many current Natural Language Processing (NLP) models are large neural networks that are pre-trained on huge amounts of unlabeled data. How these models store, combine, and use information from this self-supervised training is still largely obscure. I develop techniques that probe how linguistic information is structured within these models and what their limitations are.

Another research interest of mine is self-rationalizing models, which generate free-text explanations along with their predictions. While textual explanations are flexible and easy to understand, they come with challenges: their relation to the prediction is speculative, and they may inherit undesirable properties of human explanations, such as the ability to convincingly justify wrong predictions. I work on the evaluation and control of such explanations, and on the relation between explanation design and utility.

CV in brief

  • Bachelor’s degree in Computer Science from Humboldt University of Berlin (2016).
  • Master’s degree in Language Technology from Uppsala University (2018).
  • PhD student at LiU (2019-present).
  • Best paper award for our paper “Human Ratings Do Not Reflect Downstream Utility: A Study of Free-Text Explanations for Model Predictions” at BlackboxNLP 2022.
  • Teaching Assistant: Text Mining, Language Technology, Natural Language Processing, Language and Computers, Neural Networks and Deep Learning.



