Seeded topic models for vast text archives

Researchers from sociology and statistics have implemented a scalable seeded topic model that extracts interpretable meaning structures from what may be the largest text corpus ever analyzed in the social sciences. The authors use the method to measure shared understandings of immigration in the Swedish news media from 1945 to 2019. The semi-supervised text model is freely available for anyone to use.

Illustration from research data: the evolution of media frames of immigration.

Sociologists are discussing the need for more formal ways to extract meaning from text. The semi-supervised seeded topic model allows sociological knowledge to be infused into the computational learning of meaning structures. Seed words help crystallize topics around known concepts, while the model retains the ability of topic models to identify associations in text based on word co-occurrences. The method estimates a concept’s shared interpretation (or framing) via its associations with other frequently co-occurring topics. In a case study, we extract longitudinal measures of shared interpretations of immigration from a vast corpus of millions of Swedish newspaper articles from the period 1945–2019. We infer turning points that partition discourse into meaningful eras and locate Sweden’s era of multicultural ideals, which may have earned the country its tolerant reputation abroad.
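To make the idea of seeding concrete, here is a minimal sketch in Python. It is not the authors' implementation and does not use the API of their R package. It assumes one common way of seeding, namely giving seed words extra prior weight in their designated topic, and it runs a small collapsed Gibbs sampler on toy data. The function names, the toy vocabulary and documents, and the parameters `base`, `boost` and `alpha` are all hypothetical. The `framing_shares` helper is a simplified stand-in for measuring which other topics co-occur with a seeded topic; the article's actual estimator is not described here.

```python
import random


def build_seeded_prior(vocab, seed_words, n_topics, base=0.01, boost=1.0):
    """Topic-word prior: every word gets `base`; seed words get `boost` extra
    in their designated topic (an assumed, simplified seeding scheme)."""
    index = {word: i for i, word in enumerate(vocab)}
    prior = [[base] * len(vocab) for _ in range(n_topics)]
    for topic, words in seed_words.items():
        for word in words:
            prior[topic][index[word]] += boost
    return prior


def gibbs_sample(docs, vocab, prior, alpha=0.1, n_iter=100, rng_seed=0):
    """Collapsed Gibbs sampling for LDA with an asymmetric topic-word prior.
    Returns per-document topic counts."""
    rng = random.Random(rng_seed)
    n_topics = len(prior)
    index = {word: i for i, word in enumerate(vocab)}
    prior_sums = [sum(row) for row in prior]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * len(vocab) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for word in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[word]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = index[word]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + prior[k][v])
                    / (topic_total[k] + prior_sums[k])
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return doc_topic


def framing_shares(doc_topic, focal):
    """Among documents containing the focal topic, the share of remaining
    tokens assigned to each other topic."""
    totals = {}
    for counts in doc_topic:
        if counts[focal] == 0:
            continue
        for k, count in enumerate(counts):
            if k != focal:
                totals[k] = totals.get(k, 0) + count
    grand = sum(totals.values())
    return {k: c / grand for k, c in totals.items()} if grand else {}


# Toy data (hypothetical): topic 0 is seeded with immigration-related words.
vocab = ["immigration", "refugee", "labour", "factory", "culture", "language"]
docs = [
    ["immigration", "labour", "factory", "labour"],
    ["refugee", "immigration", "culture", "language"],
    ["factory", "labour", "factory"],
    ["culture", "language", "immigration", "refugee"],
]
prior = build_seeded_prior(vocab, {0: ["immigration", "refugee"]}, n_topics=3)
doc_topic = gibbs_sample(docs, vocab, prior)
print(framing_shares(doc_topic, focal=0))
```

Tracking such shares over time slices of a corpus would give a longitudinal picture of which themes accompany the seeded concept. At the scale of millions of articles, a far more efficient sampler than this toy loop would be needed.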

For researchers interested in running seeded topic models on very large text collections, we have developed an R package, available on GitHub:

GitHub

Read or download the article

Hurtado Bodell, M., Magnusson, M., & Keuschnigg, M. (2024). Seeded Topic Models in Digital Archives: Analyzing Interpretations of Immigration in Swedish Newspapers, 1945–2019. Sociological Methods and Research.

SageJournals

More about computational text analysis

Computational Text Analysis

Computational analysis offers new ways to derive meaning from text. We use large corpora of text as social sensors to measure what people feel, think, and talk about, which allows us to track the emergence of shared social understandings.

Organisation

The Institute for Analytical Sociology (IAS)

IAS conducts cutting-edge research on important social, political and cultural matters. The research is sociological in the original, broadly conceived sense of the word.

Department of Management and Engineering (IEI)

The Department of Management and Engineering (IEI) strengthens and develops tomorrow’s industry, business world and society through ground-breaking research, education and innovation.

About LiU

Linköping University (LiU) offers innovative education and boundary-crossing research. Its students are among the most sought-after in the labour market, and international rankings consistently place LiU among the world’s leading universities.

Seeded topic models for vast text archives: Measuring interpretations of immigration in 75 years of Swedish newspaper reporting

Miriam Hurtado Bodell
