This project is led by the Swedish National Courts Administration, in collaboration with Linköping University (LiU), the Swedish Defense Research Agency, and the Swedish Police Authority. AI Academy participants Nils Alenäs, Aleksi Maxim Andreev and Victor Lagerbring are working on the project. Its goal is to automate the protection and anonymization of sensitive information in legal documents.
Today, anonymization, sometimes called “masking”, is done manually by lawyers and administrative staff. This process takes a lot of time and resources. Automating it would free up valuable time for legal professionals and make the process faster, more accurate, and consistent across all courts. If successful, this solution could also be used by large companies, government agencies, and even the military, changing how sensitive data is handled well beyond the courts.
The project uses AI technology based on Swedish BERT, a language model trained to understand Swedish text. It combines this with NER (Named Entity Recognition), which is a technique for finding specific types of information in text, such as names of people (PERSON), organizations (ORG), locations (LOC), and codes. Once these details are detected, they will be anonymized using methods like pseudonymization (replacing real names with fake ones) and format-preserving masking (keeping the same structure so the document looks natural). If the system is unsure, a human will review the case to ensure accuracy.
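The anonymization step described above can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes the NER model has already produced entity spans with confidence scores, and the `Entity` class, the `REVIEW_THRESHOLD` value, the `CODE` label, and the placeholder scheme (`PERSON_1`, `LOC_1`, and so on, standing in for realistic fake names) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """A detected span (hypothetical structure; real NER output may differ)."""
    start: int    # character offset where the entity begins
    end: int      # character offset just past the entity
    label: str    # e.g. "PERSON", "ORG", "LOC", "CODE"
    score: float  # model confidence between 0 and 1


# Assumed cut-off: entities scored below this go to a human reviewer.
REVIEW_THRESHOLD = 0.80

# Labels replaced by consistent pseudonyms; everything else is masked.
PSEUDONYM_LABELS = ("PERSON", "ORG", "LOC")


def mask_preserving_format(text: str) -> str:
    """Replace letters with X/x and digits with 0, keeping length and punctuation."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append("0")
        elif ch.isalpha():
            out.append("X" if ch.isupper() else "x")
        else:
            out.append(ch)
    return "".join(out)


def anonymize(text: str, entities: list[Entity]) -> tuple[str, list[Entity]]:
    """Return the anonymized text and the entities flagged for human review."""
    pseudonyms: dict[tuple[str, str], str] = {}
    needs_review: list[Entity] = []
    pieces: list[str] = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e.start):
        original = text[ent.start:ent.end]
        pieces.append(text[cursor:ent.start])
        if ent.score < REVIEW_THRESHOLD:
            # Uncertain detection: leave the text untouched and flag it.
            needs_review.append(ent)
            pieces.append(original)
        elif ent.label in PSEUDONYM_LABELS:
            # The same name always maps to the same pseudonym.
            key = (ent.label, original)
            if key not in pseudonyms:
                count = sum(1 for k in pseudonyms if k[0] == ent.label) + 1
                pseudonyms[key] = f"{ent.label}_{count}"
            pieces.append(pseudonyms[key])
        else:
            pieces.append(mask_preserving_format(original))
        cursor = ent.end
    pieces.append(text[cursor:])
    return "".join(pieces), needs_review
```

In this sketch a repeated name receives the same pseudonym each time, so the reader can still follow who did what, while a code such as a case number keeps its shape but loses its content. A document with flagged entities would be held back until a reviewer has confirmed or corrected them.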
Because legal documents contain highly confidential information, the system will only be deployed within each individual court. It will not share data between courts, and the AI will only be trained on public and synthetic data, never on real court cases, so that privacy is fully protected.
To ensure high quality, the system will be tested using metrics like precision, recall, and F1-score. These are standard measures in AI: precision is the share of flagged items that really are sensitive, recall is the share of sensitive items that the system actually finds, and the F1-score combines the two into a single number. Combined with user feedback, these evaluations will help ensure that the final solution is both accurate and reliable, reducing workload and strengthening legal certainty.
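As a rough illustration of how these metrics are computed, the sketch below compares predicted entities against a hand-annotated reference. It assumes strict matching, where a prediction counts as correct only if start, end, and label all agree with the reference; the project may use a different matching rule, and the function name and tuple format are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Compute entity-level precision, recall and F1 with exact-match scoring.

    Both arguments are collections of (start, end, label) tuples.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Precision: how many of the predicted entities are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: how many of the reference entities were found.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For anonymization, recall is usually the more critical number: a missed name is a privacy leak, whereas a false alarm only masks a harmless word.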