In a paper in the Journal of Computational Social Science, Marc Keuschnigg, Niclas Lovsjö, and Peter Hedström (2018) discuss machine learning, agent-based simulations, large online experiments, and big data. They make a case for the theory-grounded approach of analytical sociology, which can push computational social science to move beyond descriptions of macro patterns and predictions of individual-level outcomes to bottom-up explanations of social phenomena.
Benjamin Jarvis and Martin Arvidsson join Keuschnigg and Hedström in extending this perspective in a chapter in Routledge’s Handbook of Computational Social Science (2021). They deepen the discussion of empirically calibrated agent-based simulations, emphasizing their potential for assessing the macro-level implications of behavioral or policy interventions that would be unethical or infeasible to test in the real world. They also promote the use of automated methods of natural language processing (NLP) in sociological research. Computational methods like NLP may bridge a divide between “qualitative” and “quantitative” research traditions by enabling the analysis of text data—and potentially audio and visual data—at larger scales and with greater methodological transparency.
Arvidsson takes the lead on a chapter in Taha Yasseri’s Handbook of Computational Social Science (Edward Elgar, 2023). With his colleagues, he examines how to devise and test mechanism-based explanations of collective phenomena by integrating the explanatory principles of analytical sociology with the methods and data of computational social science. Large, time-stamped, relational datasets, increased computing power, and machine learning enable analytical sociologists to (i) assess the interdependencies between social actors, (ii) trace the dynamics implied by such interdependencies, and (iii) study their cumulative effects. Integrating analytical sociology and computational social science narrows the gap between theoretical and empirical work in sociology, in ways that should be instructive for the field of computational social science at large.
In the Oxford Handbook of the Sociology of Machine Learning, Arvidsson and Keuschnigg (2023) discuss how applying machine learning approaches to digital trace data—including text data—can improve the estimation of social influence effects. Social influence—how individuals affect each other’s behavior—plays a pivotal role in a wide variety of social phenomena and is an enduring concern in the social sciences. But empirical identification of social influence effects is challenging outside of experimental settings because people form relationships and join social groups at least partly based on unobserved, homophilous preferences. Sophisticated algorithms can take advantage of the high temporal and contextual granularity of digital trace data to account for homophilous selection processes, reducing biases when estimating social influence effects. This contributes to sociological theory building by grounding explanatory models in empirical data, leading to better understandings of how various macro-phenomena, such as inequality, segregation, and polarization, emerge from social interdependence, network effects, and critical masses.