Data quality in unstructured and semi-structured data

Data quality measurement and measures for wikis and knowledge graphs

In companies, the availability of information has become a decisive factor with a lasting impact on productivity and profitability. Organisations that use large amounts of data (Big Data) intensively to support decision-making and see themselves as "data-driven" achieve significantly better financial and operational results. This shows how critical a resource data are for organisations. Two modern ways to map and formalise static and dynamic domain knowledge for this purpose are (enterprise) wikis and knowledge graphs. Because unstructured or semi-structured data formats predominate in this context and content is created collaboratively within companies, ensuring data quality is particularly relevant for both research and practice. Ulm University, in cooperation with the University of Regensburg and xapio GmbH, is pursuing the following goals:

  1. Development of methods and metrics to measure the data quality of wikis and knowledge graphs
  2. Definition of measures to improve the data quality of wikis and knowledge graphs
  3. Development of approaches for the automated and quality-assured creation of knowledge graphs from unstructured data such as (enterprise) wikis

Cooperation partners: University of Regensburg, xapio GmbH

Funding body: R&D programme "Information and Communication Technology" of the Free State of Bavaria

Project period: until 2023