The DATA LAB is a forum for collaborative co-creation and experimentation with diverse forms of doing and communicating academic knowledge.

We meet across disciplinary, institutional and career-stage boundaries to inspire informal scholarly conversations, share experience, experiment, talk, learn together, and develop critical approaches to themes that concern the politics of “data” and of digital technologies in our scholarly practice.

The DATA LAB aims to facilitate new conversations and collaborations and will evolve through ideas and themes suggested by workshop participants. The informal workshop format complements the existing traditional seminar groups and text-focused work and allows for new ways to be creative, explorative and playful within academia. We cover a broad variety of contexts such as, but not limited to, everyday life, the production of knowledge and culture, urban governance, health, environmental and energy politics, or warfare.

We are inspired by STS, feminist technoscience, media studies, anthropology, philosophy of technology, critical data studies, visualisation research, the digital humanities, digital sociology and related fields.

Travel support for junior researchers is available; please enquire for more details at datalab@liu.se.

When?

Mondays or Wednesdays, 3-4 times per term.

Organise a DataLab event

Here's what we offer:

Room & material

Physical meeting room, coffee breaks and materials for a workshop (within limits).

Event support

Support with administration, communication and publication of the event.

Experience

Experienced organising team to support you with developing your event.

What could you do?

  • Test a new idea, methodology or material for working and thinking with data
  • Organise a workshop, or a seminar related to a topic you are researching, curious about or interested in
  • Something else data-related that you want to explore with your peers?

Who can do a DataLab Session?

You! Scholars and practitioners at all career stages and from a variety of disciplines are welcome to reach out! Please send us an outline or get in touch for an informal chat.

Unsure if your idea fits? Reach out to us, and we will be happy to discuss it with you!

New publication

Beyond academic publics: conversations about scholarly collaborations with cultural institutions

The publication is the result of a collaboration between the Tema Datalab at Linköping University and the Hub for Digital Welfare at Södertörn University.

In the spring of 2023, we invited scholars at different career stages from our departments to share their experiences of collaborating with cultural institutions in their research and communication. We wanted to create an inclusive space to talk about the process of enacting such collaborations in practice. We also wanted to learn from each other about the possibilities and challenges that are part of materializing such collaborations. Our primary interest was not the final product, outcome or success of such collaborations but what it meant for scholars at different career stages, with diverse personal interests and life and professional experience, to start, become part of and complete a collaboration in a “good” way.

To the publication (pdf, 8 MB)

Organisers

Previous events

2024

DataLab 13. What is good data? Exploring metrics and methods for assessing the quality of (synthetic) data 

Date and time: 17 December 2024, 9.30-12.00, I:205, I building, Campus Valla, Linköping

Chaired by Maria Eidenskog, with Ericka Johnson, Saghi Hajisharif, H Devinney and Isto Huvila.

The challenge of knowing how to measure the quality of data is shared in all work related to data, for us as scientists from different disciplines as well as outside the academic world. Knowing how to measure quality becomes even more difficult when the data is synthetic, i.e. artificially generated to mimic real-world data patterns and characteristics. In this workshop we will explore what is included, and excluded, when assessing the quality of data by turning to the ongoing work to develop methods and metrics for evaluating synthetic data. We’ll start with insights from three perspectives:

  • Ericka Johnson and Saghi Hajisharif (Linköping University) will discuss approaches to identifying and measuring bias in synthetic data.
  • H. Devinney (Linköping University) will share computer science methods used to evaluate synthetic data for different desired qualities.
  • Isto Huvila (Uppsala University) will discuss "paradata," examining how the documentation of practices and processes varies between synthetic and non-synthetic data in medicine and what this can tell us about quality metrics.

Following the talks, we’ll move on to a liquid lab exercise where we creatively and jointly explore our experiences and ideas on assessing data, both synthetic and non-synthetic. All are welcome!

DataLab 12. The “Rawness” of the Data: a special session in collaboration with the Tenth Swedish Language Technology Conference

Date and time: 27 November 2024, 9.30-12, SH63, Studenthuset, Campus Valla

This is a special session of the DataLab in collaboration with the Tenth Swedish Language Technology Conference (https://sltc2024.github.io/) that aims to provoke discussion and reflection around the notion of “raw” data. Bowker’s 2008 maxim that “raw data is both an oxymoron and a bad idea” has been enthusiastically taken up and discussed within Critical Data Studies and surrounding fields. However, this challenge to the “rawness” of data doesn’t always map well onto other disciplines, where “raw” data are essential to supporting claims to transparency or reproducibility.

To kick-start our discussion, participants will be provided in advance with copies of two texts: a book chapter about “The Rawness of Data” from the recently published volume Behind the Science: The invisible work of data management in Big Science (Harrison 2024), which examines how experimental data will be produced and managed at the European Spallation Source; and the now well-known article “Datasheets for Datasets” (Gebru et al. 2018), which advocates for better and more transparent documentation about how datasets are produced.

With these resources in hand, we’ll be asking: What do you think of as your “raw” data? When is the moment of “data birth” (Leonelli & Tempini 2020) in your research and what happens after that? What disciplinary conventions are there that determine appropriate processing of your data, and when are your data considered to be “cooked”? How are concerns about raw data, transparency and reproducibility connected in your research field?


DataLab 11. Vocabularies for the algorithmic age (part of the Swedish STS Conference)

Date and time: 4 October 2024, 10:30 – 12:00, The Museum of Work, Norrköping. A part of the Swedish STS Conference.

A workshop with: Minna Ruckenstein (Helsinki University), Dorthe Brogård Kristensen (University of Southern Denmark), Katherine Harrison (Linköping University) and Julia Velkova (Linköping University)

This workshop is about the vocabularies that we need to describe issues brought by digitalisation and datafication in their manifold manifestations. While much scholarly and public debate is centered on emerging technologies, their qualities, imperfections, transformative capacities and authority, these debates are often powerfully shaped by industry buzzwords and hypes that too often come to frame scholarly vocabularies and approaches. In this workshop we discuss how the vocabularies we use matter, when and why, and explore what terms we might need for better capturing, describing and intervening in pursuit of transformative relations with data and digital industries, and for making collective liveable futures with data.

The workshop will feature a series of short (5-7 minute) igniting talks and provocations, each by an invited speaker addressing a selected keyword, followed by an interactive group exercise and discussion with pen and paper. We also discuss how words and terms can act as devices of interference and of transformative critical engagement for scholarship, industry and policy on data-related developments.

Short Inspirations/Provocations by: Minna Ruckenstein (Helsinki University), Julia Velkova (Linköping University), Katherine Harrison (Linköping University), Dorthe Brogård Kristensen (University of Southern Denmark).

The workshop was organised by TEMA's DataLab and the REIMAGINE ADM and REPAIR research projects.

DataLab 10. Synthetic Data for State Statistics/Census

Time and place: 10 June 2024, 13.15-16.00. Campus Norrköping and online.

Census data represent a social contract between citizen and state, with the collection, use and management of these kinds of data reflecting spatial and temporal specificities of individual nation states’ policies and practices. Synthetic data, through older techniques of statistical disclosure control, have long been used to guarantee confidentiality as a key part of this contract. New techniques such as differential privacy are changing how state statistics data are prepared and released, challenging common expectations of what such data represent and of their connection to reality. In this workshop, we ask:

  • What kinds of knowledge do state statistics/census data promise? How are/were these affected through the use of statistical disclosure control techniques?
  • What are the impacts of new techniques to produce synthetic census data?
  • Can the trajectory of synthetic data within the census domain give us some insights into what we may see evolving in other contexts of application?

DataLab 9. Using generative AI for irreductionist visualization in science

Time and place: 22 May 2024, 14.00-16.00, Universitetsklubben conference room.

Chaired by: Mathieu Jacomy from Aalborg University

In this seminar, we will explore using AI and algorithms in science through case studies and practical experiments. The seminar will take the form of an interactive discussion where we will draw tools at some point. Bring your laptop!

Mathieu will start by showcasing a mapping of AI and algorithms in science where an embedding model, network analysis and visualization techniques, prompt engineering and a large language model were used to produce a map of 2M articles. Generative AI has been used to summarize clusters of articles and produce about 8K annotations that make the map readable and relatable. The resulting landscape can be used as a boilerplate for the collaborative exploration of the controversiality of algorithms and AI in science. We could call this kind of visualization "irreductionist," following Latour's word, because it intentionally refrains from aggregating data beyond the point of oversimplification. Network visualization, dimensionality reduction (t-SNE, UMAP), and of course cartography, allow us to live with the trouble of empiricism: real-world data is not only uncertain, but often ambiguous, equivocal, and/or polyvalent. Datascapes are often hard to understand by themselves and expensive to annotate, but generative AI is making a decisive change to that economy.

Our experiments show that generative AI can be acceptably good at summarizing large sets of documents, but only under certain conditions. What does it mean to summarize documents? How can the results be tested and validated? How should we approach iterating over the prompt design? Can we identify the factors that ensure good-enough results? Why would an LLM succeed at that task while it is bad at so many other things? What infrastructure should be used for such computations? We will discuss those questions and explore the conditions necessary to make generative AI a worthy ally in building new methods, techniques and tools for the computational social sciences and humanities.

DataLab 8. Machinic landscapes: Art practices at the intersection of machinic logic and natural forms

Time and place: 15 April 2024, 13.15 – 15.00, Tvärsnittet, Kopparhammaren, Campus Norrköping and on Zoom.

Seminar with Lila Lee-Morrison, Lund University, co-hosted by Tema Culture and Society (IKOS), the Eco- and Bioart Lab, and the Data Lab (Tema).

Abstract

This talk sketches an aesthetics of the Anthropocene by analyzing artistic engagements with advanced tools of visual calculation used in the context of the environment. It explores what I term machinic landscapes: landscape works made by contemporary artists that involve a confrontation between two mediums, the logic and representational mechanisms of machine vision technology on the one hand and the infinite representational forms and processes found in nature and the environment on the other, together with the multiple layers of representation that emerge and are further generated through it. The merging of the two theoretical directions of the “machinic” and “landscape” is operationalized in this analysis to counter binaries between nature and technology and to foreground ways in which environmental forms are integrated into the functioning of a technical, aesthetic gesture by machine. I focus on the work of contemporary artists Mishka Henner, Daniel Lefcourt, and Davide Quayola, each of whom works with various digital visualizing programs often utilized in landscape architecture and in the geosciences, such as heightmapping, LiDAR, 3D scanning, and satellite imaging systems, and recontextualizes their aesthetic output. These artists experiment with the parameters of these technologies and address some of the aesthetic challenges of the Anthropocene, specifically issues of latency, entanglement, and scale. Drawing on landscape theory and philosophy of technology to frame this study, this talk revisits the art historical genre of landscape as a vehicle to explore contemporary aesthetic and political dimensions of our relationship to and within the environment.

Bio

Lila Lee-Morrison is a writer, scholar and art historian. Her research interests focus on the visual culture of machine vision, intersections of art and technology and socio-political agencies of the image. She has written about the visual politics of drone warfare systems, representations of the body through biometric technologies, media representations of the immigration crisis, ethics of the image and on contemporary art practices as sources of theoretical engagement. She received her PhD in Art History and Visual Culture studies from Lund University in Sweden with a published dissertation titled Portraits of Automated Facial Recognition: On Machinic Ways of Seeing the Face (Transcript Verlag, 2019), which was recognized as providing a novel perspective at the intersection of visual culture studies, philosophy, computer sciences and art history. She has been invited to give public talks internationally on subjects ranging from the intersection of art and AI to the instrumentality of contemporary art production in relation to technology. She has written for Artforum and Theory, Culture and Society, and has been published by MIT Press, Liverpool University Press and Brill Publishing. She recently co-edited two special theme issues with Media+Environment (2023) and the Journal of Media Art Study and Theory (2022). She presently holds a position as a postdoctoral fellow at Lund University in the Department of Sociology under the ERC-funded project “Show and Tell: Scientific representation, algorithmically generated visualizations, and evidence across epistemic cultures.”

DataLab 7. Synthetic Data for Social Science Research

Time and place: 10 April 2024, 13:15 – 15:00, Universitetsklubben conference room and Zoom

The seminar was held in collaboration with the Operationalising Ethics for AI project.

Seminar content and focus

Despite the immense proliferation of digital data in the world, researchers are often faced with the challenges of obtaining enough data, ensuring diversity in their datasets and attending to privacy concerns. Recently, synthetic data has been proposed as a solution to many of these challenges. With the advent of Large Language Models (LLMs), social science researchers have considered opportunities for enhancing social science research data through LLM-generated content. We will explore the possibilities and challenges of using synthetic data produced by Large Language Models for qualitative research. Producing synthetic qualitative research data could potentially enable a more diverse collection of material for social science researchers working in areas where interviewees or participants are hard to reach. But it also opens up a whole new set of methodological and ethical questions about data and data collection. Moreover, whereas in other types of synthetic data pipelines researchers typically maintain significant control over how the data are created, the production of synthetic qualitative data using LLMs precludes such control and raises significant concerns about how to evaluate the utility of such data. Come and join the seminar, where reflections on your own research practices and challenges are warmly welcome!

About the Synthetic Data Seminar series

This seminar series explores the risks, possibilities, and promises of synthetic data across different application areas. Given the increasingly complicated regulatory environment around data use and AI systems, what kinds of risks are addressed, created or made possible through synthetic data? While there is much excitement about synthetic data in the machine learning community, there is also apprehension and caution. A growing number of synthetic data generation libraries and pipelines are becoming available to the technical community. These promise to get beyond the triple challenges of privacy, bias, and data scarcity, but warrant a critical discussion about how and to what extent these challenges are being addressed. This seminar series explores what the state of the art in synthetic data currently is, and what critical, legal, and ethical issues synthetic data techniques may encounter. The Synthetic Data Seminar series is organised by the Operationalising Ethics for AI project.

DataLab 6. Vocabularies for Thinking with Data

Time and place: 6 March 2024, 13:15 – 15:00, Universitetsklubben conference room.

Chaired by: Julia Velkova and Ericka Johnson

During this DataLab session we present a book, eat semlor and talk about the vocabularies that we use, do not use or need to describe issues brought by digitalisation and datafication. We ponder how the vocabularies that we use matter, when and why. What new words, terms and fields do we need? Industries generate a buzzword a day: just think of "smart", "AI", "cloud", "digital twin". Scholars follow by generating subfields and their own buzzwords for studying digital matters: think of critical infrastructure/app/platform/algorithm/data/software/... studies. What are your buzzwords, vocabularies and favourite terms? Why and when did we start calling software an "app", a storage closet a "cloud", an algorithm a "robot"? What are the terms that you use and why?

We kick off the conversation with the notion of the backend, and a brief presentation of the newly published book Media Backends: Digital Infrastructures and Sociotechnical Relations (edited by Lisa Parks, Julia Velkova and Sander De Ridder). Everyone is welcome!

2023

DataLab 5. Synthetic Data in Smart Cities/Digital Twins

Time and place: 29 November 2023, 9.00–12.00, TEM21.

What is synthetic data and how does it matter? At this workshop, we will discuss the risks, possibilities and promises of synthetic data across different application areas. Given the increasingly complicated regulatory environment around data use and AI systems, what kinds of risks are addressed, created or made possible through synthetic data? While there is much excitement about synthetic data in the machine learning community, there is also apprehension and caution. A growing number of synthetic data generation libraries and pipelines are becoming available to the technical community. These promise to get beyond the triple challenges of privacy, bias, and data scarcity, but warrant a critical discussion about how and to what extent these challenges are being addressed. We will discuss what the state of the art in synthetic data currently is, and what critical, legal, and ethical issues synthetic data techniques may encounter.

DataLab 4. Data and Ethics Workshop

Time and place: 20 October 2023, DigiMaker (Studenthuset), 9:30 – 14:00 including lunch

We are kicking off the fall term sessions of TEMA's DATA LAB with a workshop on Data & Ethics to take place on 20 October 9:30 – 14:00 on campus (DigiMaker at Studenthuset). During the workshop we will test, explore and discuss the Data Ethics Decision Aid framework developed by scholars at the Data School at Utrecht University. The framework is described as an aid “for reviewing government data projects that considers their social impact, the embedded values and the government’s responsibilities in times of data-driven public management”, and “a useful process for ethical evaluation of data projects for public management and an effective tool for creating awareness of ethical issues in data practices.” (Read more about it here: https://link.springer.com/article/10.1007/s10676-020-09577-5)

We invite you to join the workshop either in the role of a tester, bringing a case to test the tool on, or in the role of discussant-participant. There are no strict requirements to be a tester; it is enough to have a project or case that involves handling data and ethics and that you are willing to try the tool on during the workshop. Testers could also be organisations that you are working with and that have to deal with data and ethics. It is also perfectly fine to join as a discussant-participant if you are interested but do not have a specific project going on.

DataLab 3. Scholarly research in plaintext: using the Zettelkasten method, Markdown, and Pandoc to organize a sustainable scholarly workflow

Time and place: 17 May 2023, 0:30-12:00, in Forum

The workshop will be led by our colleague Charles Berret (post-doc at TEMA-G/Media and Information Technologies) and the topic is “Scholarly Research in Plaintext: Using the Zettelkasten Method, Markdown, and Pandoc to Organize a Sustainable Scholarly Workflow”.

Abstract: Every researcher needs a system to organize their work, but many tools and platforms end up working against us. The purpose of this workshop is to examine how the tools we use impact how we conduct scholarly research, especially when those tools are a source of friction with the mental models that best match our projects, practices, and materials. Focusing specifically on text-based research practices, we will explore the Zettelkasten method as a platform-independent, open-source workflow developed in the Digital Humanities to support scholars in gathering, organizing, and developing ideas from notes, to drafts, to manuscripts.

DataLab 2. Beyond academic publics: collaborating with cultural institutions in research and communication

Time and place: 15 March 2023, 9:30 - 12:00 with coffee and lunch, DigiMaker at Studenthuset, floor 5

Chaired by Anne Kaun, Södertörn University, Julia Velkova and Maria Eidenskog, TEMA-T.

How can we engage cultural institutions in our research and our communication about it? This workshop gathers researchers who have actively engaged with cultural institutions to co-produce and/or disseminate knowledge. We share different experiences and jointly explore possibilities, challenges and future avenues ahead. Join the session if you are interested, regardless of your previous experience in collaborating with publics beyond academia. We aim to create a space for open and creative thinking that inspires ongoing and future projects and collaborations.

With contributions by Maria Arnelid, TEMA-G; Amanda Lagerkvist, Uppsala University; Dominika Lisy, TEMA-G; Marko Marila, TEMA-T; Anna Storm, TEMA-T; Jenny Sjöholm, TEMA-T.

DataLab 1. Engaging with apps: methods and questions with Darcy Parks & Julia Velkova

Time and place: 25 January 2023, 9:30 - 12:00 with coffee, Lethe

What do apps do? How do we analyze relations mediated by apps? What methods exist out there, and what methods can we use to approach a social practice via apps? Come to the first session of the Data Lab to explore apps through a hands-on workshop and a discussion about ways of knowing and engaging with apps. Everyone is welcome! Bring your laptop!
