Another research interest of mine is self-rationalizing models that generate free-text explanations along with their predictions. While textual explanations are flexible and easy to understand, they come with challenges such as a potentially loose relation to the prediction and the inheritance of possibly undesirable properties of human explanations, such as the ability to convincingly justify wrong predictions. I work on the evaluation and control of such explanations, and on the relation between explanation design and utility.
