Seminar: Advances in Artificial Intelligence

Artificial Intelligence is nowadays a central discipline in computer science. Research at the Institute of Artificial Intelligence focuses on knowledge-based techniques, i.e., on how knowledge can be formalized and reasoned about within intelligent systems. Knowledge-based techniques complement learning-based techniques, e.g., by providing far-ranging foresight based on map knowledge for autonomous vehicles, or by planning the actions an intelligent agent needs to perform in order to reach a certain goal. The seminar will introduce students to selected research topics in this area.

Participation and Requirements

Each participant is assigned a scientific article, which they must summarize and present to the other students. In addition, participants have to write a short summary, survey related work on their assigned topic, peer-review the work of their fellow students, and actively discuss the topics presented in the course. The language of the seminar is English.

Schedule

The seminar consists of (at least) one initial meeting for the whole course and individual meetings with the topic supervisors. The final presentations take place in blocks at the end of the semester. Exact dates and times are still to be scheduled.

Topics

In this seminal paper, Reiter describes an approach for diagnosing faults in systems that are described in some logic. He proposes a rather general approach that allows for computing explanations for observations that follow from a logical theory. The introduced data structure of a hitting set tree and the corresponding algorithm have since been improved by many researchers and have been used in a wide range of contexts.
Prerequisites: Understanding the paper requires some background in first-order logic and a keen interest in logics and algorithms.
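
To convey the algorithmic core, the following Python sketch enumerates minimal hitting sets by brute force. It is only an illustration of the concept (Reiter's HS-tree computes them far more efficiently), and the conflict sets in the example are invented.

    from itertools import combinations

    def minimal_hitting_sets(conflicts):
        # In Reiter's framework, the minimal diagnoses of a system are
        # exactly the minimal hitting sets of its minimal conflict sets.
        # This brute-force version tries candidates by increasing size;
        # the HS-tree achieves the same with pruning and label reuse.
        components = sorted(set().union(*conflicts))
        solutions = []
        for size in range(1, len(components) + 1):
            for candidate in map(set, combinations(components, size)):
                if any(sol <= candidate for sol in solutions):
                    continue  # a smaller hitting set is already contained
                if all(candidate & conflict for conflict in conflicts):
                    solutions.append(candidate)
        return solutions

    # Two invented conflict sets over components c1..c3:
    print(minimal_hitting_sets([{"c1", "c2"}, {"c2", "c3"}]))
    # -> [{'c2'}, {'c1', 'c3'}]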

References:

  1. A Theory of Diagnosis from First Principles

The Resource Description Framework (RDF) is a standardized data model for representing machine-processable knowledge within the Semantic Web. Increasingly large RDF datasets are being published, driven, e.g., by initiatives like the Linked Open Data movement. Traditional RDF serializations are very verbose and not well suited for efficient storage and exchange. The article presents a binary RDF representation that addresses this issue. Based on a set of metrics that characterize the skewed structure of real-world RDF data, an RDF representation format is developed that modularly partitions and efficiently represents the three components of an RDF dataset: Header information, a Dictionary, and the actual Triples structure (hence the name HDT). An experimental evaluation shows that datasets in HDT format can be compacted by more than a factor of fifteen compared to current naive representations, improving both parsing and processing while keeping a consistent publication scheme. Specific compression techniques over HDT improve these compression rates further and outperform existing compression solutions for efficient RDF exchange.
Prerequisites: The paper proposes relatively intricate compression algorithms, but does not require special knowledge of RDF beyond what can quickly be acquired.
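
To make the Dictionary and Triples components more concrete, here is a deliberately simplified Python sketch of dictionary encoding. The actual HDT format additionally sorts and compresses the resulting ID triples and adds Header metadata; the example triples are invented.

    def dictionary_encode(triples):
        # Toy version of HDT's Dictionary/Triples split: map every RDF
        # term to a small integer ID, so triples become compact ID tuples.
        term_to_id, id_triples = {}, []
        for s, p, o in triples:
            id_triples.append(tuple(
                term_to_id.setdefault(term, len(term_to_id) + 1)
                for term in (s, p, o)))
        id_to_term = {i: t for t, i in term_to_id.items()}
        return id_to_term, id_triples

    dictionary, encoded = dictionary_encode([
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:alice", "foaf:name", '"Alice"'),
    ])
    print(encoded)     # [(1, 2, 3), (1, 4, 5)]
    print(dictionary)  # {1: 'ex:alice', 2: 'foaf:knows', ...}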


References:

  1. Binary RDF Representation for Publication and Exchange (HDT)

An important result in the historical development of Description Logics was the discovery of the correspondence between Description Logics and Modal Logics. In his influential paper “A Correspondence Theory for Terminological Logics: Preliminary Report”, Klaus Schild first observed that the Description Logic ALC is a notational variant of the multi-modal version of the modal logic K. Here, we want to look at this result, at the general relationship between Description Logics and Modal Logics, and at the implications for research in both fields.
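
The heart of Schild's observation is a simple syntactic translation π from ALC concepts to formulas of multi-modal K, sketched here in LaTeX notation:

    % Schild's translation \pi: concept names A become propositional
    % variables p_A, and the other constructors translate as follows.
    \begin{align*}
      \pi(A)           &= p_A                  \\
      \pi(\neg C)      &= \neg \pi(C)          \\
      \pi(C \sqcap D)  &= \pi(C) \land \pi(D)  \\
      \pi(C \sqcup D)  &= \pi(C) \lor \pi(D)   \\
      \pi(\exists r.C) &= \Diamond_r\, \pi(C)  \\
      \pi(\forall r.C) &= \Box_r\, \pi(C)
    \end{align*}
    % A concept C is satisfiable iff \pi(C) is satisfiable in K:
    % interpretations correspond to Kripke structures, with roles
    % serving as accessibility relations.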


References:

  1. A Correspondence Theory for Terminological Logics: Preliminary Report

Explainable Machine Learning (or, more generally, Explainable AI) is a set of techniques for explaining complex machine-learning models, making them more interpretable to humans. Neural networks, especially deep neural networks, are usually complex models and are used for a variety of tasks, including image classification and natural language processing. Neural network interpretation methods focus on visualizing and explaining the internal workings of neural networks. They aim to provide insights into the learned representations of neural networks in order to improve interpretability and trustworthiness.
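
As a small taste of the first reference, the following Python sketch computes a gradient-based saliency map. It assumes PyTorch and torchvision; random network weights and a random input are used so that the sketch is self-contained, whereas in practice one would load a pretrained model and a real image.

    import torch
    from torchvision import models

    # Gradient-based saliency: the saliency of a pixel is the magnitude
    # of the gradient of the winning class score w.r.t. that pixel.
    model = models.resnet18(weights=None).eval()  # random weights for the demo

    image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in input
    scores = model(image)
    scores[0, scores.argmax()].backward()  # d(class score) / d(input pixels)

    saliency = image.grad.abs().max(dim=1).values  # max over colour channels
    print(saliency.shape)  # torch.Size([1, 224, 224])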


References:

  1. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
  2. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
  3. Intriguing Properties of Neural Networks

Description Logic ontologies frequently contain repeated structures. This naturally raises the question of how such similarities can be identified, expressed, and exploited when dealing with ontologies. Here, we want to take a look at a paper by Kindermann et al. that introduces a rule-based language for describing such repeated patterns, as well as for applying them to existing ontologies. This is meant as a step towards making ontologies more readable and maintainable.
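
To convey the flavour of the idea, here is a hypothetical, much-simplified Python sketch of template instantiation; the rule language of Kindermann et al. is considerably richer, and the axiom schemas below (in a Manchester-syntax-like notation) are invented.

    # An ontology "template": axiom schemas with placeholders that are
    # instantiated for many concept names, capturing a repeated pattern.
    TEMPLATE = ["{X} SubClassOf Disease",
                "{X} SubClassOf hasLocation some {Y}"]

    def instantiate(template, bindings):
        return [axiom.format(**bindings) for axiom in template]

    for x, y in [("Pneumonia", "Lung"), ("Hepatitis", "Liver")]:
        print(instantiate(TEMPLATE, {"X": x, "Y": y}))
    # ['Pneumonia SubClassOf Disease',
    #  'Pneumonia SubClassOf hasLocation some Lung'] ...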

References:

  1. Generating Ontologies from Templates: A Rule-Based Approach for Capturing Regularity

Materialisation describes both the result and the process of extending a set of facts by all consequences that follow from it under a given set of inference rules. Since this computation can be quite expensive, a materialisation is typically updated incrementally whenever the original facts change, rather than being re-computed from scratch each time. This article introduces the so-called Backward/Forward (B/F) algorithm, which combines backward and forward chaining in order to update Datalog materialisations. Compared to other approaches, such as DRed, B/F is especially efficient if many facts have several alternative derivations.
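
To fix the terminology, the following Python sketch performs a naive (non-incremental) materialisation by forward chaining; the B/F algorithm is precisely about avoiding such full re-computations when the input facts change. Rule and facts are invented.

    def materialise(facts, rules):
        # Naive Datalog materialisation: apply all rules until no new
        # facts are derivable (a fixpoint).
        facts = set(facts)
        while True:
            derived = {f for rule in rules for f in rule(facts)} - facts
            if not derived:
                return facts
            facts |= derived

    # Invented rule: ancestor(x, z) <- ancestor(x, y), ancestor(y, z)
    def transitivity(facts):
        return {("ancestor", x, z)
                for (p, x, y) in facts if p == "ancestor"
                for (q, y2, z) in facts if q == "ancestor" and y2 == y}

    base = {("ancestor", "ann", "ben"), ("ancestor", "ben", "cid")}
    print(materialise(base, [transitivity]))  # adds ('ancestor', 'ann', 'cid')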


References:

  1. Incremental Update of Datalog Materialisation: The Backward/Forward Algorithm

Explainable Machine Learning (or, more generally, Explainable AI) is a set of techniques for explaining complex machine-learning models, making them more interpretable to humans. Local model-agnostic methods are a category of such techniques that focus on explaining the predictions for individual instances, such as the classification of the objects in a single image or the translation of one sentence from English to German. They are model-agnostic, meaning that they are applicable to many machine-learning models. These methods have been applied in various domains, including image classification, natural language processing, robotics, and healthcare, and have been shown to provide insightful and understandable explanations of model predictions.
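
The first reference (LIME) explains a single prediction by fitting a simple local surrogate model. The following Python sketch shows the basic recipe under simplifying assumptions (numeric features, Gaussian perturbations, an invented black-box model), using NumPy and scikit-learn.

    import numpy as np
    from sklearn.linear_model import Ridge

    def local_surrogate(predict, x, n_samples=500, width=1.0):
        # LIME-style recipe: perturb the instance x, weight the samples
        # by proximity to x, and fit a weighted linear model whose
        # coefficients indicate each feature's local influence.
        rng = np.random.default_rng(0)
        samples = x + rng.normal(scale=0.5, size=(n_samples, x.size))
        weights = np.exp(-np.sum((samples - x) ** 2, axis=1) / width**2)
        model = Ridge(alpha=1.0).fit(samples, predict(samples),
                                     sample_weight=weights)
        return model.coef_

    # Invented black box: locally, only the first feature matters.
    black_box = lambda X: np.sin(X[:, 0]) + 0.01 * X[:, 1]
    print(local_surrogate(black_box, np.array([0.0, 0.0])))  # ~ [1.0, 0.01]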


References:

  1. "Why Should I Trust You?": Explaining the Predictions of Any Classifier
  2. Anchors: High-Precision Model-Agnostic Explanations
  3. A Unified Approach to Interpreting Model Predictions

Computer vision is a field of AI that enables computers to derive meaningful and useful information from images and videos. Features are specific structures in images, such as points, edges, or objects. We focus on feature detection and matching, two important computer vision problems. There are different algorithms for solving these problems, with different trade-offs in accuracy, processing time, and memory requirements. These algorithms are further used in 3D mapping, automatic object tracking, robot navigation, 3D object reconstruction, etc., even on mobile and embedded devices.
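
As a minimal illustration of the third reference, the following Python sketch detects and matches ORB features with OpenCV; the image file names are placeholders for any two overlapping photographs.

    import cv2

    img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)  # placeholder files
    img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(nfeatures=500)          # detector + binary descriptor
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # ORB descriptors are binary, so Hamming distance is the right metric.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    print(len(matches), "matches; best distance:", matches[0].distance)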


References:

  1. Object Recognition from Local Scale-Invariant Features
  2. SURF: Speeded Up Robust Features
  3. ORB: An Efficient Alternative to SIFT or SURF

Ontology Design Patterns (ODPs) are meant to facilitate the ontology development process, similar to Software Design Patterns in software development. ODPs capture best practices and established design decisions so that they can be reused in new ontologies. This is meant to improve the quality of ontologies and to decrease the dependence on ontology experts in practical applications. Here, we want to take a look at the motivation, development, and current state of ODPs.


References:

  1. Ontology Design Patterns for Semantic Web Content
  2. Practical Ontology Pattern Instantiation, Discovery, and Maintenance with Reasonable Ontology Templates


Description Logics are a family of languages designed for representing and reasoning about knowledge. The design of these languages is guided by a compromise between the expressivity of the language and the computational complexity of the algorithms required for performing the reasoning tasks: the more features a language offers, the harder it is to develop efficient procedures for it. Expressive languages usually have a very high complexity: exponential time or higher. The EL family of Description Logics was designed by restricting the language features so that the reasoning complexity remains polynomial while the languages are still useful for a range of practical applications.
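
For a first impression, here is a tiny invented EL ontology in LaTeX notation: EL offers only conjunction and existential restriction, yet this suffices for large biomedical ontologies such as SNOMED CT.

    % Only conjunction (\sqcap) and existential restriction (\exists)
    % are available; no negation, disjunction, or universal restriction.
    \begin{align*}
      \mathit{Parent} &\sqsubseteq \mathit{Person} \\
      \mathit{Person} \sqcap \exists \mathit{hasChild}.\mathit{Person}
                      &\sqsubseteq \mathit{Parent}
    \end{align*}
    % Subsumptions such as
    %   Person \sqcap \exists hasChild.Parent \sqsubseteq Parent
    % follow from these axioms and can be decided in polynomial time.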


References:

  1. A Description Logic Primer
  2. Pushing the EL Envelope
  3. OWL 2 EL Profile

A Geographical Information System (GIS) is a database containing information about objects on a map, such as buildings, roads, and points of interest. The most commonly used representation of such objects is by polygons whose coordinates are acquired from satellite images. Instead of such numeric (quantitative) information, qualitative spatial representation focuses on representing relations between objects. For example, one can state that a particular building is adjacent to some road without specifying the exact position of the building or the road. The Region Connection Calculus RCC5 defines five base relations by which two geometric regions can be related to one another: a region can be disconnected from, overlapping with, equal to, containing, or contained within another region. The Region Connection Calculus RCC8 additionally distinguishes whether or not regions share borders. The central question for this representation is whether a network of such relations between regions can be realised on an actual map. For example, if region A contains region B, which overlaps with region C, then A and C cannot be disconnected. In general, checking whether a network is realisable is an NP-complete problem; however, for some subsets of the relations, the problem can be solved in polynomial time.
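
The five RCC5 base relations can be illustrated with a deliberately simplified Python model in which regions are finite sets of points (real RCC regions are topological regions, but the example from the text carries over).

    def rcc5(a, b):
        # Classify the RCC5 relation between two regions, modelled here
        # as finite sets of points -- a deliberate simplification of RCC.
        if not a & b:
            return "DR"   # disconnected
        if a == b:
            return "EQ"   # equal
        if a < b:
            return "PP"   # proper part of
        if a > b:
            return "PPi"  # contains (inverse proper part)
        return "PO"       # partial overlap

    # The example from the text: A contains B, and B overlaps with C ...
    A, B, C = {1, 2, 3, 4}, {2, 3}, {3, 9}
    print(rcc5(A, B), rcc5(B, C))  # PPi PO
    # ... hence A and C cannot be disconnected:
    print(rcc5(A, C))              # PO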


References:

  1. A Spatial Logic Based on Regions and Connection
  2. On the Complexity of Qualitative Spatial Reasoning: A Maximal Tractable Fragment of the Region Connection Calculus

First-Order Logic (also known as Predicate Logic) is a well-known extension of Propositional Logic in which statements are built using (domain) variables as well as quantifiers, which determine how objects can be assigned to variables. Unlike Propositional Logic, First-Order Logic is undecidable: there is no general procedure with which one can check whether a First-Order formula is always true, always false, or whether one formula logically implies another. To overcome this problem, researchers try to identify useful restrictions on First-Order formulas that ensure decidability of the resulting fragment of First-Order Logic. One such restriction concerns the number of different variables that may be used in formulas. The Two-Variable Fragment of First-Order Logic (FO²) consists of formulas that contain at most two different variables, say only the variables x and y. It turns out that this fragment enjoys the so-called "small model property": to answer questions about formulas, it suffices to consider only models with at most exponentially many objects in the size of the formula.
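
To see what the restriction means in practice, consider the following example in LaTeX notation: the two variables may be requantified and reused, which makes FO² surprisingly expressive, while some natural properties still fall outside the fragment.

    % An FO^2 formula: the variables x and y are reused by
    % requantification. "Every person has a parent who has a parent":
    \[
      \forall x \, \Bigl( \mathit{Person}(x) \rightarrow
        \exists y \bigl( \mathit{parentOf}(y,x) \land
          \exists x \, \mathit{parentOf}(x,y) \bigr) \Bigr)
    \]
    % By contrast, transitivity of a relation r,
    %   \forall x \forall y \forall z \, (r(x,y) \land r(y,z) \rightarrow r(x,z)),
    % needs three variables and is not expressible in FO^2.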


References:

  1. On the Decision Problem for Two-Variable First-Order Logic
  2. On Logics with Two Variables