Interventions with Data

The aim of the research was to think through new ways of approaching so called ‘big data’, the kinds of digital traces like social media, feedback forms, online transactions, tracking data, GPS and open government data which are currently being gathered and analysed by governments, private companies and researchers.

A network depicting election spending in UK and US elections, 2014–16. Political entities (in red) are connected to recipients (in green) by spending and sized by the number of transactions.

The Economist famously called data the new oil: a new natural resource which has spawned new industries and infrastructures. Data flows between companies as if through gushing pipelines, pools in unstructured ‘data lakes’ and is deposited in ‘data warehouses’. While we should remain sceptical about claims to the newness of this data, and especially about discourses which refer to data as somehow ‘naturally occurring’, it is hard to deny that the hype around data is creating real transformations and anxieties in industry, government and academia, redistributing roles and responsibilities between them.
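As a rough illustration, the kind of bipartite spending network shown in the figure amounts to a simple weighted edge list. The sketch below is entirely hypothetical: the party and recipient names, amounts and transaction counts are invented for illustration and do not come from the project's actual dataset or tooling.

```python
from collections import defaultdict

# Invented example records: (political entity, recipient, amount, n_transactions).
# These names and figures are illustrative only.
records = [
    ("Party A", "Agency X", 120_000, 14),
    ("Party A", "Consultancy Y", 45_000, 3),
    ("Party B", "Agency X", 80_000, 9),
]

# Edges connect political entities (drawn in red) to recipients (in green),
# weighted by the total spending between each pair.
edge_weights = defaultdict(int)
# Nodes are sized by the number of transactions they take part in.
node_sizes = defaultdict(int)

for entity, recipient, amount, n_tx in records:
    edge_weights[(entity, recipient)] += amount
    node_sizes[entity] += n_tx
    node_sizes[recipient] += n_tx

print(node_sizes["Agency X"])                  # 23 (14 + 9 transactions)
print(edge_weights[("Party A", "Agency X")])   # 120000
```

A graph library would then lay out and draw these dictionaries; the point here is only that the visual encoding (edge weight as spending, node size as transaction count) is carried by very ordinary data structures.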

But to continue The Economist’s somewhat forced metaphor: if data is the new oil, then there are important questions to be raised about how it is refined, turned into useful products and kept from leaking everywhere. Much of this data is processed with techniques like machine learning and artificial intelligence, which attempt to automatically spot patterns in massive datasets. These algorithms, and the systems built around them, are used to distribute resources, assign credit scores and deliver all sorts of content to our many devices. While these techniques have made great strides, they have also been accused of being reductionist or even dangerous. The legal scholar Frank Pasquale argues that these systems are ‘black boxed’: their inner workings are unavailable for scrutiny, either to the people they affect or, sometimes, even to their creators. Cathy O’Neil has gone so far as to describe the algorithmic systems used to process this data as ‘weapons of math destruction’.

Emerging Alternatives

We are in urgent need of alternative modes of analysis, but these are not readily forthcoming. Many of the innovations in this area come from computer science or the natural sciences, even though most of the data in question is at least nominally ‘social’ in character, involving interactions between humans or between humans and computers. Many of the sharpest critiques of these techniques come from philosophers, anthropologists and qualitative social scientists who have much to say about data and complex social phenomena, but who are often reluctant to get their hands dirty and experiment with computational techniques themselves. Progress is further hampered by old, somewhat outdated splits between quantitative and qualitative methods, between hermeneutic and positivist traditions in the social sciences, and by tensions between adjacent disciplines. There are countless frameworks for defusing these epistemological and disciplinary tensions, from mixed methods to grounded theory, but these often presume that the different camps and methods are stable, singular and separate entities to begin with, rather than examining the tensions and negotiations in practice.

In recent years, some researchers in the field of Science and Technology Studies (STS) have begun to experiment with the use of data visualisations to aid qualitative analyses and forms of public engagement, often involving social media or other publicly available data. Visualisations may offer interesting alternatives because, while they necessarily involve algorithms and metrics, they foreground the role of (equipped) human interpretation in the process. They also, it is claimed, open up the research process to a wider array of less technically minded participants and topic experts; at the same time, their seductive and flashy character creates new problems and blind spots to contend with. While there is a substantial literature on the implications of data visualisations for the social sciences, including their potential for resolving ‘quantitative’ and ‘qualitative’ tensions, there are few studies of how these tools might upset disciplinary identities and routines in situated encounters.

Interventions with Data was a project which ran from June to December 2017.
It was funded by a Research Initiation Grant from Riksbankens Jubileumsfond.

Workshops

The project consisted of a series of three workshops, each dealing with a different type of digital data.

In these workshops we hoped to extend some of these experiments with more interpretivist visualisation techniques and to apply them to types of data other than social media. The ‘interventions’ in the title refer to the idea that we might learn something different about automated techniques of data analysis by trying (and sometimes failing) to use them, rather than just describing how they are currently used. Rather than engaging in detached observation, we actively engineered situations to trial alternative ways of analysing data.

We chose three areas of social life which are being transformed by ‘big data’:

  • political campaigns
  • health data
  • academic metrics and rankings

The aim of each workshop was two-fold: to produce or mock up tools, approaches or visualisations, and to reflect on the problems which emerge when different types of researchers use these techniques. While many researchers claim to be trans-disciplinary or beyond these tensions, we felt it was important to dwell on and explore these potential pitfalls and barriers to collaboration, while not presuming these splits to be natural or given.

The workshops each lasted three days and were held at the Visualisation Centre on Linköping University’s Norrköping Campus. They were modelled on the format of a hackathon: a gathering of programmers and topic experts over two to three days. However, there were a few key differences.

Firstly, hackathons are often marked by imbalances between programmers and less tech-savvy participants. We thus made efforts to spend more time on defining problems and research questions, and to resist the drive to swap these problems for technically solvable ones. Secondly, while we often started with pre-prepared datasets, the questions of which data and which techniques we should use were purposely left open. Thirdly, the objective of the workshops was not to make something that definitively ‘works’ but to learn about the substantive topic and the research process through our sometimes-fumbling attempts to use digital tools.

The workshops involved academics from Science and Technology Studies (STS), medical sociology, medicine, media studies, anthropology, information systems, computer science and library science. All the data we used was publicly available, but it still raised ethical questions.
