The awarded paper, entitled "An Algebraic Foundation for Knowledge Graph Construction", presents work that Olaf Hartig, at the division Database and Information Techniques at IDA, has done together with Sitt Min Oo, a visiting PhD student from Ghent University in Belgium.
What is the background and motivation for this work?
A Knowledge Graph (KG) is a form of graph-structured database intended to capture knowledge about entities and their relationships. Motivated by their suitability for powering artificial intelligence applications, for connecting data about customers or products, and for data integration, KGs have become an important topic for many of today's data-driven enterprises, including Swedish companies such as Scania, Ericsson, and IKEA.
Constructing a KG, however, is typically a complex process that involves extracting and transforming data from diverse sources into a structured KG format. Programs developed to implement such a process quickly become convoluted, inefficient, and difficult to maintain. To address these issues, KG construction projects increasingly employ declarative mapping languages, which provide a high-level, rule-based approach to specifying how data should be transformed and integrated into a KG. While these languages offer many advantages, such as conciseness, maintainability, and separation of concerns, they have lacked a solid formal foundation.
Without such a foundation, it is impossible to achieve a more fundamental understanding of these languages and their properties (such as their expressive power or their complexity), and it is also impossible to prove the correctness of implementation and optimisation techniques. In fact, the informal nature of existing specifications of these languages has led to discrepancies between different implementations.
What does the awarded paper provide?
The main contribution of the paper is a language-agnostic algebra to capture definitions of mappings from arbitrary types of data sources to KGs based on RDF (Resource Description Framework). Due to its language-agnostic nature, the algebra can serve as a foundation to define KG mapping languages formally.
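To give a feel for what an algebra over mapping definitions can look like, here is a minimal sketch in Python. It is not the algebra from the paper: the operators `extend`, `project`, and `to_triples`, the sample data, and the `example.org` IRIs are all assumptions made for illustration. The idea is only that a mapping is expressed as a composition of small operators over relations, which ends in a relation whose tuples can be read as RDF triples.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, str]
Relation = List[Row]


def extend(rel: Relation, attr: str, fn: Callable[[Row], str]) -> Relation:
    """Hypothetical operator: add a computed attribute to every tuple."""
    return [{**row, attr: fn(row)} for row in rel]


def project(rel: Relation, attrs: Iterable[str]) -> Relation:
    """Hypothetical operator: keep only the listed attributes."""
    keep = list(attrs)
    return [{a: row[a] for a in keep} for row in rel]


def to_triples(rel: Relation) -> List[Tuple[str, str, str]]:
    """Read the attributes s, p, o of every tuple as one triple."""
    return [(row["s"], row["p"], row["o"]) for row in rel]


# Made-up source data, standing in for rows read from a CSV file or similar.
people: Relation = [
    {"id": "1", "name": "Ada"},
    {"id": "2", "name": "Alan"},
]

# The mapping as a composition of operators.
plan = extend(people, "s", lambda r: f"http://example.org/person/{r['id']}")
plan = extend(plan, "p", lambda r: "http://xmlns.com/foaf/0.1/name")
plan = extend(plan, "o", lambda r: r["name"])
plan = project(plan, ["s", "p", "o"])

triples = to_triples(plan)
for triple in triples:
    print(triple)
```

Because each step is an operator with a precise input and output, the meaning of the whole mapping follows from the meaning of its parts, which is what makes formal reasoning about such mappings possible.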
As a second contribution, the paper demonstrates this benefit by providing an algorithm that translates mappings defined using the RDF Mapping Language (RML) into the algebra. RML is one of the popular KG mapping languages, with an active community and several implementations. Through the provided translation algorithm, the paper not only shows that the presented algebra is at least as expressive as RML, but also provides a formal definition of the semantics of RML. This formal semantics coincides with the informally defined semantics of the RML specification, which Sitt Min Oo and Olaf Hartig have verified by running the official RML test cases on an experimental system in which they have implemented their translation algorithm in combination with a prototypical evaluation engine for the algebra.
Another important value of having a formal algebra that captures declarative mapping definitions is that it may be used as the basis of a systematic and well-defined approach to planning and optimising the execution of KG construction processes in mapping engines. Related to this option, the paper makes a third contribution: it shows several algebraic equivalences which can be used as rewriting rules to optimise mapping plans that are based on the algebra.
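The role of an algebraic equivalence as a rewriting rule can be sketched in a few lines of Python. The plan node types (`Source`, `Union`, `Extend`) and the rule `push_extend_below_union` are hypothetical and chosen for illustration; they are not taken from the paper, and the union here is a simple bag union (concatenation). The sketch shows the general pattern: a rule turns one plan into a different plan that produces the same result, so an engine is free to pick whichever form is cheaper to execute.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class Source:
    rows: List[Row]


@dataclass
class Union:
    left: "Plan"
    right: "Plan"


@dataclass
class Extend:
    child: "Plan"
    attr: str
    fn: Callable[[Row], str]


def evaluate(plan) -> List[Row]:
    """Evaluate a plan bottom-up into a list of tuples."""
    if isinstance(plan, Source):
        return list(plan.rows)
    if isinstance(plan, Union):
        return evaluate(plan.left) + evaluate(plan.right)
    if isinstance(plan, Extend):
        return [{**r, plan.attr: plan.fn(r)} for r in evaluate(plan.child)]
    raise TypeError(f"unknown plan node: {plan!r}")


def push_extend_below_union(plan):
    """Illustrative rule: Extend(Union(a, b)) is rewritten to Union(Extend(a), Extend(b))."""
    if isinstance(plan, Extend) and isinstance(plan.child, Union):
        u = plan.child
        return Union(
            Extend(u.left, plan.attr, plan.fn),
            Extend(u.right, plan.attr, plan.fn),
        )
    return plan  # rule does not apply; leave the plan unchanged


# Made-up input data for the two branches of the union.
a = Source([{"name": "Ada"}])
b = Source([{"name": "Alan"}])

original = Extend(Union(a, b), "upper", lambda r: r["name"].upper())
rewritten = push_extend_below_union(original)

# Both plans yield the same tuples, which is what makes the rewrite safe.
print(evaluate(original) == evaluate(rewritten))
```

In this toy setting the rewritten plan lets the two branches be extended independently of each other. Whether a particular rewrite pays off in practice depends on the engine and the data.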
Read more:
- The article entitled "An Algebraic Foundation for Knowledge Graph Construction"
- Website of the 22nd Extended Semantic Web Conference ESWC 2025