Dialogue Systems Project

Contents

Research in multimodal spoken language dialogue systems is carried out on end-to-end systems comprising components for acoustic processing, speech signal analysis, speech recognition, spoken natural language understanding, dialogue processing and speech synthesis. Within this component development, several individual topics are offered as practical studies; they may depend on the current development status of the prototype system demonstrator of the Dialogue Systems Group.
Each student or group works on one of the proposed topics.

Exams

Exam (graded)

The grade is composed of:

  • Written report of the practical work
  • Presentation

Certificate (not graded)

Certificate after fulfillment of the following criteria:

  • At least three meetings with the supervisor throughout the semester to discuss progress of work
  • Final presentation (demo) and discussion
  • Short description and illustration (max. 10 pages)
  • Submission of the final version (hard/software and documentation) of the project to the supervisor

Open Topics

Topics will be announced before the beginning of each semester.
To obtain information or to register, send an email to the respective topic supervisor.

DeepL Translator Voice Interface for Alexa

Supervisor: Niklas Rach / Juliana Miehle / Sabine Wieluch
Description:

DeepL is a new translation service based on deep learning algorithms. This project aims to create a voice interface for easily translating conversations between two users.

Requirements:
  • Basic Python Knowledge
  • Profound interest in Dialogue Systems
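A possible structure for such an interface can be sketched with placeholder functions. Note that none of the functions below are the real Alexa or DeepL APIs; a real skill would call Alexa's speech services and the DeepL HTTP API at these points.

```python
def transcribe(audio):
    """Placeholder for Alexa's built-in speech recognition."""
    return audio["text"]

def translate(text, source, target):
    """Placeholder for a request to the DeepL translation service."""
    return f"[{source}->{target}] {text}"  # a real call would go over HTTP

def synthesize(text):
    """Placeholder for Alexa's text-to-speech output."""
    return {"speech": text}

def converse(audio, source="EN", target="DE"):
    """One turn of the two-user translation conversation."""
    recognized = transcribe(audio)
    translated = translate(recognized, source, target)
    return synthesize(translated)

print(converse({"text": "where is the station"})["speech"])
```

For a two-way conversation, the skill would alternate `source` and `target` on every turn so that each user hears the other's language.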

Former Topics

Summer and Winter Semester 2017

  • IQ-adaptive Statistical Dialogue Management (2017), Supervisor: Juliana Miehle
  • User Simulation for non-slot-filling dialogues (2017), Supervisor: Juliana Miehle

Summer and Winter Semester 2016

  • Advanced Linguistics using the NAO Robot (2016), Supervisor: Florian Nothdurft
  • Automatic Emotion Recognition Based on Visual Cues using OpenCV (Euler Programme) (2016), Supervisor: Maxim Sidorov
  • Automatic Generation of Dialog Domains from Ontologies (2016), Supervisor: Florian Nothdurft
  • Comparison of the efficiency of classification algorithms for the Interaction Quality modelling for human-human conversation (Euler Programme) (2016), Supervisor: Anastasiia Spirina
  • Dimensionality reduction for text-based emotion recognition (2016), Supervisor: Roman Sergienko
  • Dimensionality reduction for text-based human-machine interaction quality recognition (2016), Supervisor: Roman Sergienko
  • Feature selection based on wrappers with genetic algorithms for natural language call routing (Euler Programme) (2016), Supervisor: Roman Sergienko
  • Integrating the OpenDial Toolkit into OwlSpeak (2016), Supervisor: Juliana Miehle
  • Neuroevolution Approach to Multimodal Emotion Recognition for Russian Language (Euler Programme) (2016), Supervisor: Maxim Sidorov
  • Optimization of weights for voting in term weighting collectives for text classification (Euler Programme) (2016), Supervisor: Roman Sergienko
  • The most informative features selection for the Interaction Quality modelling for human-human conversation (Euler Programme) (2016), Supervisor: Anastasiia Spirina

Summer and Winter Semester 2015

  • Adaption of Dialogue Flow and Content to Verbal Intelligence (2015), Supervisor: Florian Nothdurft
  • Advanced Feature Selection Techniques for the Problem of Speech-based Emotion Recognition (Euler Programme) (2015), Supervisor: Maxim Sidorov
  • Artificial neural networks application for text classification in the field of SDS (Euler Programme) (2015), Supervisor: Roman Sergienko
  • Contextual Image Classification as a Support System for Multimodal Dialogue Analysis (Euler Programme) (2015), Supervisor: Maxim Sidorov
  • Creating an OwlSpeak Dialogue in the Cambridge Restaurant Information Domain for Real User Interaction (2015), Supervisor: Stefan Ultes
  • Creating an OwlSpeak Dialogue in the Cambridge Restaurant Information Domain for User Simulation (2015), Supervisor: Stefan Ultes
  • Fuzzy classification methods application for text classification in the field of SDS (Euler Programme) (2015), Supervisor: Roman Sergienko
  • Interruption detection in human-human conversations in audio files (Euler Programme) (2015), Supervisor: Anastasiia Spirina
  • Proactive Intelligent Systems and Human Factors (1) (2015), Supervisor: Florian Nothdurft
  • Proactive Intelligent Systems and Human Factors (2) (2015), Supervisor: Florian Nothdurft
  • Proactive Intelligent Systems and Human Factors (3) (2015), Supervisor: Florian Nothdurft
  • Sequential Approaches for Interaction Quality Estimation (2015), Supervisor: Stefan Ultes

Winter Semester 2014/2015

  • Contemporary Stochastic Feature Selection Algorithms for Speech-based Emotion Recognition (Euler Programme) (2014), Supervisor: Maxim Sidorov
  • Creating a User-adaptive Dialogue with OwlSpeak (2014), Supervisor: Stefan Ultes
  • Creating an Extension of the LEGO Corpus (Euler Programme) (2014), Supervisor: Stefan Ultes
  • Creating an OwlSpeak Dialogue for the Lets Go User Simulator (2014), Supervisor: Stefan Ultes
  • Dynamic Adaptation of Explanations to Verbal Intelligence (2014), Supervisor: Florian Nothdurft
  • Evaluation of Anger Detection Software (2014), Supervisor: Stefan Ultes
  • Implementation of a Formant Extraction Module (Euler Programme) (2014), Supervisor: Stefan Ultes
  • Informative Features Selection Using Non-Parametric Model in Natural Language Call Routing (Euler Programme) (2014), Supervisor: Tatiana Gasanova
  • Investigation of Constraints for Term Weighting Methods (2014), Supervisor: Roman Sergienko
  • Natural Language Call Routing using Neural Network (Euler Programme) (2014), Supervisor: Tatiana Gasanova
  • Neuro-evolutionary Techniques as High-dimensional Learners for Emotion Recognition (Euler Programme) (2014), Supervisor: Maxim Sidorov

Winter Semester 2013/2014

Syllable Model for Large-Vocabulary Automatic Speech Recognition (Euler)

Supervisor: S. Zablotskiy
Description:

Russian is a language with a complex mechanism of word formation and word inflection. Therefore, according to the grammar and different nuances of meaning, there exists a large number of word forms. The use of sub-word units allows us to reduce the vocabulary size. In this work the units are syllables, and the task is to develop a tool for automatic syllable extraction according to different linguistic theories.

(1 Person)

Requirements:
C++/Java Programming skills, analytical thinking
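As a first, deliberately naive approximation, one can assign one syllable per vowel nucleus (which holds for Russian) and cut after each vowel; the linguistic theories mentioned above differ precisely in where the boundaries fall, so this is only a baseline sketch, written in Python for brevity rather than the required C++/Java.

```python
# Russian vowel letters: every syllable contains exactly one of them.
VOWELS = set("аеёиоуыэюя")

def count_syllables(word):
    """One syllable per vowel nucleus."""
    return sum(1 for ch in word.lower() if ch in VOWELS)

def split_syllables(word):
    """Very rough split: cut after each vowel (open-syllable heuristic)."""
    syllables, current = [], ""
    for ch in word.lower():
        current += ch
        if ch in VOWELS:
            syllables.append(current)
            current = ""
    if current and syllables:
        syllables[-1] += current  # attach trailing consonants
    elif current:
        syllables.append(current)
    return syllables

print(split_syllables("молоко"))  # ['мо', 'ло', 'ко']
```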

Speaker Identification System

Supervisor: Maxim Sidorov
Description:

There are many open issues in the field of speaker identification, for example the number of conversation participants and the detection of who is speaking at any given moment. The following aspects should be investigated: the best classification (clustering) algorithm and its parameters.

(1 Person)

Requirements:
  • RapidMiner or MATLAB Programming Skills
  • Interest in Dialogue Systems and Machine Learning Algorithms
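The clustering step can be illustrated with a minimal k-means over per-segment feature vectors. The toy 2-D vectors below merely stand in for real acoustic features (e.g. MFCC statistics), and the sketch is in Python rather than RapidMiner or MATLAB.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Minimal k-means: group per-segment feature vectors by speaker."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Assign each segment to its nearest centroid.
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids, keeping old ones for empty clusters.
        centroids = np.array([features[labels == c].mean(axis=0)
                              if np.any(labels == c) else centroids[c]
                              for c in range(k)])
    return labels

# Toy "acoustic feature" vectors for four segments, two per speaker.
segments = np.array([[0.1, 0.2], [0.0, 0.1], [5.0, 5.1], [5.2, 4.9]])
labels = kmeans(segments, k=2)
print(labels[0] == labels[1] and labels[2] == labels[3])  # True
```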

Implementing Let's Go Dialogue into OwlSpeak

Supervisor: Stefan Ultes
Description:

Probably the only research dialogue system deployed in real life is the Let's Go system of Carnegie Mellon University in Pittsburgh. People can call the system to get information about public transportation. The task of this work is to analyze the system descriptions and to adapt the dialogue to the ontology-based OwlSpeak dialogue manager.

(1 Person)

Requirements:
  • Good JAVA Programming Skills
  • Profound Interest in Dialogue Systems

Analyzing and Adapting Let's Go Dialogue to OwlSpeak

Supervisor: Stefan Ultes
Description:

Probably the only research dialogue system deployed in real life is the Let's Go system of Carnegie Mellon University in Pittsburgh. People can call the system to get information about public transportation. The task of this work is to analyze the system descriptions and to adapt the dialogue to the ontology-based OwlSpeak dialogue manager.

(1 Person)

Requirements:
  • Good JAVA Programming Skills
  • Good Programming Skills in a Scripting Language, e.g. Perl
  • Interest in Dialogue Systems

Russian Phonetic Model for Large-Vocabulary Automatic Speech Recognition (Euler)

Supervisor: S. Zablotskiy
Description:

The task of this work is to develop a tool for automatic phoneme extraction according to different linguistic theories. The tool should be able to create a phonetic transcription of an arbitrary Russian word or sub-word unit, given the position of the stress.

(1 Person)

Requirements: 
C++/Java Programming skills, analytical thinking

Summer Semester 2013

Rolling out graphical user interfaces in Arabic

Supervisor: A. Berton
Description:

Current car head units provide their graphical user interface (GUI) in European, American and Asian languages and shall soon be extended to also cover Arabic. The candidate will be introduced to an English version of the GUI. The task of the project is to derive an interaction concept for Arabic from the English GUI; the main parts of the concept will be widgets, behavioral elements and texts. After successfully deriving the concept, the candidate will extend the GUI implementation accordingly, either in Adobe Flex or in another model-based GUI framework such as Elektrobit GUIDE.

(1 Person)

Voice user interfaces in Arabic

Supervisor: A. Berton
Description:

  • State of the art in Arabic speech recognition
  • Transferring speech dialogs to Arabic
  • Speech dialog sequences in Arabic (entering destination addresses and points of interest)
  • Extending a speech dialog demonstrator from English to Arabic

(1 Person)

Evaluating and improving OOV models in speech recognition

Supervisor: F. Gerl
Description:

A common problem in speech recognition is dealing with words spoken by the user that do not belong to the currently active lexicon of the speech recognizer, so-called out-of-vocabulary (OOV) words. Tasks associated with OOV modelling are detecting the presence of OOV words, guessing the underlying sequence of sub-word units (phonemes, syllables) and possibly converting the sub-word sequence into an actual word. The main task of this work is to evaluate different OOV models using this approach; the goal is to improve and tune the OOV models based on the results of these evaluations.

(1 Person)

Winter Semester 2012/2013

Design and Implementation of a Tool for the Management and Maintenance of Transcription Databases

Supervisor: A. Kosmala
Description:

For the development and testing of speech processing applications, huge amounts of phonetic transcriptions need to be collected, maintained and processed. The tool to be developed requires interfaces to various exchange and database formats. In order to test phonetic transcriptions and make the results audible, an interface to specific text-to-speech systems is required. Moreover, a syntactic check algorithm needs to be implemented and tested which gives the user immediate feedback on the correctness of a transcription. A smart user interface needs to be designed to provide as much usability as possible. Since the software shall run on Windows as well as Linux systems, it is recommended to implement the tool in Java.

(1 Person)

Improvement of Speech Recognition Performance Based on a Two Pass Approach

Supervisor: A. Kosmala
Description:

In speech recognition applications, two-pass approaches offer several advantages in terms of computational load and memory consumption. In a fast first pass, a set of coarse hypotheses is estimated from the spoken utterance; the hypotheses are compared to all possible vocabulary entries, and only a limited set of the best-matching entries is selected. In the second pass, the recognizer has to find the most likely word or sentence among the preselected hypotheses, based on the spoken utterance. Since the overall performance clearly depends on the number and quality of the hypotheses estimated in the first pass, it has to be investigated whether additional sources of information can be incorporated into the recognition process in order to optimize them.

(1 Person)
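The two-pass idea can be illustrated with string matching standing in for acoustic scoring; the vocabulary and both scoring functions below are invented for illustration only.

```python
from difflib import SequenceMatcher

VOCAB = ["navigation", "radio", "telephone", "temperature", "destination"]

def first_pass(utterance, vocab, n_best=3):
    """Fast, coarse pass: rank by a cheap score and keep the N best."""
    cheap = lambda w: -abs(len(w) - len(utterance))  # crude length match
    return sorted(vocab, key=cheap, reverse=True)[:n_best]

def second_pass(utterance, candidates):
    """Expensive pass: detailed similarity on the shortlist only."""
    fine = lambda w: SequenceMatcher(None, utterance, w).ratio()
    return max(candidates, key=fine)

hypotheses = first_pass("navigatio", VOCAB)
print(second_pass("navigatio", hypotheses))  # navigation
```

The pruning in `first_pass` is exactly where the quality/quantity trade-off described above shows up: too aggressive a cut can drop the correct word before the detailed second pass ever sees it.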

Summer Semester 2012

Non-native pronunciation modeling through HMM acoustic model structure

Supervisor: M. Raab
Description:

Automated speech recognition can be used for controlling navigation devices, automated telephone support systems or computers in general. In real-world applications, systems have to handle speech from foreigners (non-native speech). This speech typically deviates from native speech, which causes problems for automated recognition. Another use case is speech-controlled media players: many English songs, for example, are popular in many countries, and even if their English is very good, non-native speakers still require special consideration by speech recognition systems. An intuitive way to model the deviations non-native speakers make is to allow additional pronunciation variants for words. Of course, simply allowing as many alternative pronunciations as possible is not a good option, as speech recognizers perform better when more constraints are available. The goal is thus to provide exactly the alternative pronunciations that non-native speakers actually use.

(1 Person)

A nuisance speech model for NLU support

Supervisor: R. Gruhn
Description:

In speech dialog systems, users are expected to say specific information, for example voice commands. Inexperienced users sometimes utter additional words, like politeness phrases, before or after the key information. Recognizing desired data items embedded in additional unrequired speech is one way of implementing natural language understanding (NLU). The target of this project is to experiment with various generic speech models to catch such nuisance speech.

(1 Person)
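The core idea can be sketched as keyword spotting: only command words are matched, while everything else is absorbed by an implicit "garbage" model. The command set below is invented for illustration; a real system would model nuisance speech acoustically rather than on the text level.

```python
# Hypothetical command vocabulary; all other words count as nuisance speech.
COMMANDS = {"call", "navigate", "play"}

def spot_command(utterance):
    """Return the first known command word; surrounding words are ignored."""
    for word in utterance.lower().split():
        if word in COMMANDS:
            return word
    return None  # pure nuisance speech, no command found

print(spot_command("um could you please call my office thanks"))  # call
```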

Winter Semester 2011/2012

A Java Workbench for Analyzing Large Dialogue Corpora

Supervisor: A. Schmitt

Description:

The task of this project is the implementation of new features for our Eclipse-RCP-based Java workbench, which allows for a detailed analysis of logged dialogue conversations. Furthermore, the current project is to be enhanced in such a way that it can be made publicly available on an open-source platform.

(1 Person)

Requirements:

Strong Java programming skills, interest in Spoken Language Dialogue Systems, interest in Machine Learning

Providing dialogue acts for usage in different dialogue strategies

Supervisor: G. Bertrand
Description:

Adapting dialogues to user emotions, the user profile and the current situation is a complex task, which is needed to create a more human-like dialogue system. To test how to adapt dialogues appropriately, we want to conduct Wizard-of-Oz experiments (experiments in which parts of the system are simulated by a human). For this we need a framework in which the wizard, the human who simulates the system part, can choose which dialogue fragment is appropriate for the user. In this Bachelor Thesis the student can design a database which provides different dialogue acts for coping with different situations in user interaction. Different ways of interacting with the user via speech should be designed and implemented in a database, which will be the foundation of a Wizard-of-Oz system.

(1 Person)

Requirements:

Interest in Spoken Language Dialogue Systems, basic knowledge in database programming

Server interface for presenting adaptive dialogue strategies

Supervisor: G. Bertrand
Description:

Adapting dialogues to user emotions, the user profile and the current situation is a complex task, which is needed to create a more human-like dialogue system. To test how to adapt dialogues appropriately, we want to conduct Wizard-of-Oz experiments (experiments in which parts of the system are simulated by a human). For this we need a framework in which the wizard, the human who simulates the system part, can choose which dialogue fragment is appropriate for the user. In this Bachelor Thesis the student has the opportunity to create a servlet, using state-of-the-art servlet technologies, which is able to receive queries for adaptive dialogue acts. The servlet must have access to a database which contains dialogue fragments for creating different dialogue strategies, and it presents the information from the database in an XML-conformant file format so that the client can interpret it.

(1 Person)

Requirements:

Basic knowledge in web technologies, programming skills in Java, familiarity with XML technologies

User Interface for simulating adaptive dialogue strategies

Supervisor: F. Nothdurft
Description:

Adapting dialogues to user emotions, the user profile and the current situation is a complex task, which is needed to create a more human-like dialogue system. To test how to adapt dialogues appropriately, we want to conduct Wizard-of-Oz experiments (experiments in which parts of the system are simulated by a human). For this we need a framework in which the wizard, the human who simulates the system part, can choose which dialogue fragment is appropriate for the user. In this Bachelor Thesis the student has to design a User Interface for the wizard. The User Interface communicates with the server side, where the dialogue fragments are stored in a database, and creates the system utterance, which is presented to the user via speech. It should let the wizard choose the system utterance according to the emotional and situational state and the user profile.

(1 Person)

Requirements:

Interest in Spoken Language Dialogue Systems, basic knowledge in client-server-communication, programming skills in User Interfaces and XML-Technologies

Mapping Between Portlets and JSP Pages

Supervisor: H. Lang
Description:

Java Server Pages (JSR-245) allow the integration of Java code into HTML documents. A set of such server pages, along with a number of additional Java classes, makes up a Portlet (JSR-168). Usually the JSP files belonging to a Portlet are referenced from within compiled Java classes, hence there is no explicit mapping between a Portlet and its associated JSP files. Since such a mapping is indispensable for an avatar-based help system currently being developed within the scope of the bwGRiD portal project, a way to implement a Portlet-to-JSP-file mapping has to be devised and integrated into the existing portal framework.

(1 Person)

Requirements:

Profound knowledge of Java, interest in J2EE web applications, basic knowledge of web standards and XML

Model-based testing of user interfaces for infotainment applications

Supervisor: H. Hüning
Description:

The focus of this work will be the generation of test cases from the model-based specification of user interfaces in cars. The functionality of these user interfaces is increasing rapidly, so the time for writing correct specifications is critical. In order to validate the specifications, we investigate the model-based format of UML 2.0 state charts and develop tools for simulating these state charts, including the behaviour of screen elements.

(1 Person)

Requirements:

  • Background in computer science, engineering or similar
  • Programming skills (C++, Java, ActionScript, …)
  • Strong interest in model-based specification and testing of user interfaces
  • Analytic skills and ability to work single-handedly

Summer Semester 2011

Model-based testing of GUIs for in-car infotainment (GUC internship)

Supervisor: H. Hüning
Description:

The team "HMI Implementation" at Daimler in Ulm, located next to the University of Ulm, is working on model-based specification methods and testing of user interfaces, or human-machine interfaces (HMI). The user interfaces are for the so-called head units in Mercedes passenger cars, which integrate infotainment applications such as audio, telephone and navigation. The user interfaces typically comprise a screen, multi-functional buttons and, in some cases, speech dialogues (i.e. multi-modal user interfaces).

At Daimler, specifications of the user interfaces need to be written, and head-unit devices from suppliers need to be tested for conformance to the specifications. We develop tools for model-based testing of the user interfaces, so that both the specification and test models can be derived from a system model. A test case is a particular path through a UML state chart. The aims of testing are twofold: first to check the model, and second to compare the behaviour of the model to the head-unit implementation. Tools for model-based testing are either extensions of a UML tool like Enterprise Architect or combinations of tools like EB Guide and Conformiq (see URLs below). Such tools will be provided to the student, or, as a fallback, we have our own source code for automatic test case generation from UML state charts.

The focus of this bachelor thesis is planned to be the correct design of test models, such that test cases can be generated automatically. Either another student will work on automatic test generation in parallel, or this thesis may comprise more aspects of automatic test generation. Challenges of this work will be: a systematic way to deal with different goals of testing, definition of formats for all necessary elements in the test models, validation of the results, and study of the literature on model-based testing of user interfaces.

(1 Person)

Additional Information:
  1. For an overview of model-based testing and test case generation see: http://en.wikipedia.org/wiki/Model-based_testing
  2. For a UML tool enterprise architect see http://sparxsystems.eu/
  3. For the EB Guide example see http://www.elektrobit.com/what_we_deliver/automotive_software/products/eb_guide_-_hmi_development
  4. Conformiq http://www.verifysoft.com/en_conformiq_automatic_test_generation.html
Requirements:
  • Background in computer science or engineering
  • Programming skills (C#, Java, C++ or similar, and XML)
  • Language skills: fluent English or German
  • Strong interest in model-based testing
  • Ability to work in a team as well as single-handedly

Visualization of an Adaptive Dialogue Model

Supervisor: G. Bertrand
Description:

This work is concerned with displaying adaptive dialogue acts (parts of a spoken human-machine dialogue) in 3D, i.e. transforming parts of an adaptive dialogue model (XML format) into a corresponding 3D coordinate representation. For this purpose the student may take advantage of jzy3d or any other 3D library he/she finds appropriate.


(1 Person)

Requirements: Programming skills (Java/XML), interest in dialogue modeling.

Intoxication Recognition with Hidden Markov Models (University of Toronto internship)

Supervisor: S. Ultes
Description:

Speech not only consists of spoken words but also holds further information about the speaker. In this task, a Hidden Markov Model (HMM) should be used to find out whether a speaker is drunk or sober. For that, a suitable feature set based on automatic speech recognition has to be found and the topology of the HMM has to be designed.

(1 Person)

Requirements: 

Interest in statistical pattern recognition
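The classification step, i.e. deciding between two trained models, can be sketched with toy discrete HMMs. All parameters below are invented, and the two quantized symbols stand in for the real acoustic feature set that this project would have to design.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log-likelihood of a discrete sequence."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for symbol in obs[1:]:
        alpha = (alpha @ A) * B[:, symbol]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

# Toy two-state models over two quantized symbols (parameters invented).
PI = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B_SOBER = np.array([[0.8, 0.2], [0.6, 0.4]])   # mostly emits symbol 0
B_DRUNK = np.array([[0.2, 0.8], [0.4, 0.6]])   # mostly emits symbol 1

def classify(obs):
    """Pick the model under which the observation sequence is more likely."""
    sober = forward_loglik(obs, PI, A, B_SOBER)
    drunk = forward_loglik(obs, PI, A, B_DRUNK)
    return "sober" if sober > drunk else "drunk"

print(classify([0, 0, 1, 0, 0]))  # sober
```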

Winter Semester 2010/2011

Design and Evaluation of a New Way of Modeling Assistive and Adaptive Dialogue

Supervisor:

G. Bertrand

Description:

In connection with the SFB-TRR62, a new form of dialogue modeling has been developed. It reflects assistivity and adaptivity by providing new ways to integrate emotional strategies and explanatory dialogues into the dialogue model. The model itself is based on state-of-the-art technologies (Eclipse Modeling Framework, Eclipse Graphical Modeling Framework and, above all, Java).
The task related to this new model is to design a dialogue in a given domain (a graphical modeling tool is already available for this) and to evaluate this dialogue (i.e. to write some test code to prove its usability). Because of the novelty of our model, we also welcome ideas to improve and refine it.
The task is intended for two students but can be reduced in breadth for a single student.

(1-2 Persons)

Requirements:

For the task a firm knowledge of Java programming is required and some experience with the Eclipse Platform is recommended.

Iterative Linear Regression for User Capabilities Determination

Supervisor:

K. Zablotskaya

Description:

The objective of this work is to create a linear model for determining speakers’ capabilities. The capability score is calculated as a weighted sum of linguistic features extracted from monologues of the speakers. A linear regression with a Lagrange restriction must be implemented to determine the weights of the linear model.

(1 Person)

Requirements:

Analytical skills, programming skills (MATLAB, C/C++)
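The constrained fit can be sketched via Lagrange multipliers. As an assumption for illustration, the restriction is taken to be that the weights sum to one; the resulting KKT system solves the constrained least squares in one linear solve. The sketch is in Python, while the project itself asks for MATLAB or C/C++.

```python
import numpy as np

def constrained_lsq(X, y):
    """Least squares for w in y ~ Xw subject to sum(w) = 1,
    solved via the Lagrange (KKT) system."""
    n = X.shape[1]
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = 2 * X.T @ X   # gradient of the squared error
    K[:n, n] = 1              # Lagrange multiplier column
    K[n, :n] = 1              # constraint row: sum of weights
    rhs = np.concatenate([2 * X.T @ y, [1.0]])
    return np.linalg.solve(K, rhs)[:n]

# Toy check: data generated with weights (0.3, 0.7), which already sum to 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))          # 50 monologues, 2 linguistic features
y = X @ np.array([0.3, 0.7])          # capability scores
w = constrained_lsq(X, y)
print(np.round(w, 3))  # [0.3 0.7]
```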

Connecting Knowledge Base and Dialogue Management

Supervisor:

F. Nothdurft

Description:

The focus of this work will be the design and implementation of an exemplary inter-module communication in multimodal dialogue systems, in particular the information exchange between dialogue management and knowledge base. The communication will be based on the SEMAINE API, an open-source framework for building emotion-oriented systems.
As this work is related to the SFB-TRR62 project, the student will be part of up-to-date scientific work on companion technologies.

(1 Person)

Requirements:

Programming skills (Java), interest in developing ideas for dialogue systems. Having attended the Dialogue Systems lecture is beneficial.

Summer Semester 2010

Affective Grammar(s) - Emotion Recognition from Texts (GUC internship)

Description:

The main objective is the development of an affective grammar that is able to determine a user’s emotional state by spotting “emotion words” within utterances. The grammar should be written in the W3C grxml format so that it can be included in a VoiceXML-based dialogue system. Optionally, speech-signal-based emotion recognition shall be investigated. The recognizer output shall be integrated into a dialogue system, which optionally shall be evaluated in user tests.

(1 Person)

Requirements:

Programming skills (Java), interest in model-based specification and development of user interfaces

Symbian C++ Client for Distributed Spoken Language Dialogue Systems (GUC internship)

Description:

The objective is the implementation of a Symbian C++ client for a distributed spoken language information retrieval system. The focus of the work should be placed on the development of the front-end for Distributed Speech Recognition (DSR) systems. The front-end extracts characteristic features from the speech signal, sends them to the DSR server, receives the recognized utterance back and presents the result to the user in an appropriate form.

(1 Person)

Speaker Adaptation in Automatic Speech Recognition (GUC internship)

Description:

Speech recognition is becoming a standard interface in human-machine communication. However, the performance of automatic speech recognition (ASR) systems is highly affected by mismatch between training and testing data: differences in the vocal tracts of speakers, articulation, microphone transfer functions, etc. lead to performance degradation. The study of adaptation and normalization techniques to increase the reliability of an ASR system, and the implementation of these algorithms, constitute the main objective of this work.

(1 Person)

Analysis of Term and Sentence (Dis)Similarity Measures for Unsupervised Categorisation of Spoken Language Utterances (GUC internship)

Description:

The main objective is to improve the unsupervised categorization of speech utterances by analysing new metrics for capturing similarities (or distances) between the utterances. Since the components of speech utterances are words, the purpose is to carry this analysis down to the word level, i.e. to search for new formulations that capture the semantic affinities between different words. If the semantic similarity among words is appropriately captured, it can be used to estimate the similarities between utterances whose component words are not identical.

(1 Person)
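The identical-word baseline that this project aims to go beyond can be sketched as plain cosine similarity over term counts; non-identical but related words (e.g. "flight" and "plane") contribute nothing here, which is exactly the gap word-level semantic affinities would fill.

```python
from collections import Counter
import math

def cosine(u, v):
    """Cosine similarity between two utterances' term-count vectors."""
    cu, cv = Counter(u.lower().split()), Counter(v.lower().split())
    dot = sum(cu[w] * cv[w] for w in cu)
    norm = (math.sqrt(sum(c * c for c in cu.values()))
            * math.sqrt(sum(c * c for c in cv.values())))
    return dot / norm if norm else 0.0

print(round(cosine("book a flight", "book a hotel"), 3))  # 0.667
```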

Dialogue Modelling in Multi-User Spoken Language Dialogue Systems (GUC internship)

Description:

This work deals with dialogue modelling for a multi-party spoken language dialogue system. Each user utterance is integrated into the dialogue model as a new dialogue state with information such as the semantic content, a dialogue act, speaker and addressee. Based on this dialogue state, the system then decides how to proceed in the dialogue using update rules. These rules would, for example, cause the information to be stored in a so-called task model which describes the constraints for the database queries. Task model and update rules have to be developed and implemented.

(1 Person)
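The update-rule idea can be sketched as follows. The dialogue acts and slot names are invented for illustration; the real task model and rule set are exactly what this project would have to design.

```python
def update(task_model, dialogue_state):
    """Apply simple update rules: 'inform' adds database constraints
    to the task model, 'deny' retracts them."""
    if dialogue_state["act"] == "inform":
        task_model.update(dialogue_state["semantics"])
    elif dialogue_state["act"] == "deny":
        for slot in dialogue_state["semantics"]:
            task_model.pop(slot, None)
    return task_model

task = {}
update(task, {"act": "inform", "speaker": "user1",
              "semantics": {"cuisine": "italian"}})
update(task, {"act": "deny", "speaker": "user2",
              "semantics": {"cuisine": None}})
print(task)  # {}
```

Note that the speaker/addressee fields carried by each dialogue state would let a multi-party rule set weigh contributions differently, e.g. only the addressee's answer retracts a constraint.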

Winter Semester 2009/2010

Hidden Markov Models in RapidMiner

Supervisor: A. Schmitt
Description:

Hidden Markov Models (HMMs) are considered to be among the most powerful machine learning approaches for speech recognition and other time-variant learning tasks. RapidMiner, the popular machine learning framework, however, does not provide a possibility to integrate HMMs. In this project, a machine learning operator will be implemented based on RapidMiner (open-source Java) and JAHMM (a Java implementation of Hidden Markov Models). To prove the functionality, a simple speech-based gender recognizer will be realized in RapidMiner.

(1-2 Persons)

Requirements: Java programming skills, interest in Spoken Language Dialogue Systems, interest in Machine Learning

Repair Strategies for Advanced Spoken Dialogue Systems

Supervisor: T. Heinroth & A. Schmitt
Description:

Within advanced spoken dialogue domains, such as preparing a dinner together with an autonomous Intelligent Environment, various fallback and repair strategies are necessary when the task is about to fail. In this project the main task is to explore already existing approaches towards such dialogue enhancements. Several different approaches should be examined, and the most promising one will be implemented, e.g. in VoiceXML.

(1 Person)

Requirements:

Interest in Spoken Language Dialogue Systems, some VoiceXML skills

Summer Semester 2009

Development of a Call Browser in Java

Supervisor: A. Schmitt
Description: With the growing task complexity of telephone-based speech applications, we develop new tools that support us in analyzing human-machine dialogues. Especially in longer conversations, where dialogs often consist of 50 or more system and user turns, we quickly lose track of how the conversation between the user and the system unfolded. We are thus developing a framework for call analysis in Java, which allows for a detailed exploration of call center calls.
You can contribute to this framework by implementing a Call Browser plugin that visualizes the path the user took through the conversation in a tree-like structure. The tree can be built from log data. Our framework is implemented on an Eclipse RCP basis, and the plugin itself should be an Eclipse plugin.
Requirements: Java programming skills, some SQL skills, interest in Spoken Language Dialogue Systems

Age Recognition

Supervisor: A. Schmitt
Description: Humans are generally able to determine the gender and rough age of a person they hear over the telephone. We want to train computers to determine the age and gender of callers as well, in order to adapt the dialogue strategy of voice computers. The task of this project is to implement an age recognizer based on acoustic data within the RapidMiner framework.
Requirements:

Java programming skills, some SQL skills, interest in Spoken Language Dialogue Systems and Machine Learning

Using genetic algorithms to solve the cluster ensemble problem

Supervisor: A. Albalate
Description: Cluster ensembles combine different cluster solutions for a data set into a single, more robust partition of the data.
Several quality criteria have been proposed for detecting the optimum solution, for example maximising the average normalised mutual information (ANMI) across the pool of clusterings. In this project, the student will address this optimisation problem through heuristic search approaches, in particular genetic algorithms (GAs). The efficiency of the GA with respect to the dataset size is also a fundamental issue to be investigated.
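A minimal sketch of the ANMI objective and a toy GA over candidate consensus labelings (invented pool and hyperparameters, for illustration only) could look as follows:

```python
import math
import random

def nmi(a, b):
    """Normalized mutual information between two labelings a, b."""
    n = len(a)
    la, lb = set(a), set(b)
    pa = {x: a.count(x) / n for x in la}
    pb = {y: b.count(y) / n for y in lb}
    mi = 0.0
    for x in la:
        for y in lb:
            pxy = sum(1 for i in range(n) if a[i] == x and b[i] == y) / n
            if pxy > 0:
                mi += pxy * math.log(pxy / (pa[x] * pb[y]))
    ha = -sum(p * math.log(p) for p in pa.values())
    hb = -sum(p * math.log(p) for p in pb.values())
    return mi / math.sqrt(ha * hb) if ha > 0 and hb > 0 else 0.0

def anmi(cand, pool):
    """Average NMI of a candidate labeling against the clustering pool."""
    return sum(nmi(cand, p) for p in pool) / len(pool)

def ga_consensus(pool, k, gens=40, size=20, seed=1):
    """Tiny GA: evolve a labeling that maximizes ANMI w.r.t. the pool."""
    rng = random.Random(seed)
    n = len(pool[0])
    popn = [[rng.randrange(k) for _ in range(n)] for _ in range(size)]
    for _ in range(gens):
        popn.sort(key=lambda c: anmi(c, pool), reverse=True)
        elite = popn[: size // 2]           # elitist selection
        children = []
        for _ in range(size - len(elite)):
            p1, p2 = rng.sample(elite, 2)   # one-point crossover
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child[rng.randrange(n)] = rng.randrange(k)  # point mutation
            children.append(child)
        popn = elite + children
    return max(popn, key=lambda c: anmi(c, pool))
```

Note that NMI is invariant under label permutation, which is exactly what makes it a usable consensus criterion across independently produced clusterings.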
Requirements: The proposed approach should be implemented in R/Java. This project can easily be extended to a Master/Bachelor thesis by considering the efficiency of the implemented genetic algorithm. In this context, hybrid solutions with SVM clustering or Principal Component Projection clustering could be explored.

Semi-supervised co-training of classifiers with prior clustering

Supervisor: A. Albalate
Description: Supervised classifiers are algorithms trained with large amounts of data previously labeled by humans. The high labeling costs are, however, an important factor which becomes prohibitive for the rapid adaptation or portability of systems using these classifiers. In this project, the student will develop a semi-supervised algorithm which only makes use of a few labeled data points. To this aim, clustering algorithms or cluster ensembles should be used to provide a partition of unlabelled data in an unsupervised manner. Then, a small subset of data labels will be used to automatically annotate the obtained clusters. The automatic cluster annotation can be modeled as the travelling salesman problem, for which heuristics such as genetic algorithms are applied to optimise a quality parameter. Finally, some classifiers (an SVM classifier and a nearest-neighbour classifier) will be trained with the automatically expanded annotations and tested.
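The label-expansion idea can be sketched in a few lines (hypothetical data and class names; a simple majority vote stands in for the optimisation-based cluster annotation described above):

```python
# Sketch: annotate unsupervised clusters with a handful of known labels
# by majority vote, expand the labels to all cluster members, then use
# the expanded set to train a 1-nearest-neighbour classifier.
from collections import Counter

def annotate_clusters(cluster_of, labeled):
    """cluster_of[i] = cluster id of point i; labeled = {i: class}."""
    votes = {}
    for i, cls in labeled.items():
        votes.setdefault(cluster_of[i], Counter())[cls] += 1
    return {c: v.most_common(1)[0][0] for c, v in votes.items()}

def expand_labels(cluster_of, cluster_class):
    """Propagate each cluster's class to all its members (None if unknown)."""
    return [cluster_class.get(c) for c in cluster_of]

def knn1(train_pts, train_labels, x):
    """1-NN over the (automatically) labeled points."""
    d = [(sum((a - b) ** 2 for a, b in zip(p, x)), y)
         for p, y in zip(train_pts, train_labels) if y is not None]
    return min(d)[1]
```

In the project itself, the vote-based annotation would be replaced by the TSP-style optimisation, and the 1-NN by the SVM and nearest-neighbour classifiers under evaluation.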
Requirements:

The project should be implemented in R/Java. The project can be extended to a Bachelor thesis by conducting a more exhaustive evaluation or by increasing the evaluated set of classifiers.

Hypergraph- and similarity-based techniques for cluster ensembles

Supervisor: A. Albalate
Description: The aim of this project is to implement and evaluate several existing techniques for cluster ensembles, such as hyperedge partitioning, or constructing a new similarity matrix which reflects the consensus between the different partitionings. A weighted modification of these techniques, taking into account the relative importance of the combined clusterings, is to be introduced.
Requirements:

The project can be implemented in R software/Java.

Evaluating the number of clusters in a data set via fuzzy clustering methods

Supervisor: A. Albalate
Description: The objective of this project is the implementation and testing of formulations for assessing the quality of a cluster partition obtained from a clustering algorithm. Numerous schemes already exist for assessing hard clustering, but few of them have been applied to fuzzy clustering approaches. These do not only return a partition of the input objects but also a vector of memberships of each object in each cluster. The project consists of extending some of these traditional methods to assess the output of fuzzy clustering algorithms.
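Two classical validity indices for fuzzy partitions, Bezdek's partition coefficient and the partition entropy, give a feel for what such formulations look like (a minimal sketch; the project would implement them in R):

```python
import math

def partition_coefficient(U):
    """Bezdek's partition coefficient for a membership matrix U (n x k).
    Equals 1.0 for a crisp partition and 1/k for a maximally fuzzy one."""
    n = len(U)
    return sum(u ** 2 for row in U for u in row) / n

def partition_entropy(U):
    """Partition entropy: 0 for a crisp partition, log(k) at maximum
    fuzziness; lower values indicate a better-defined partition."""
    n = len(U)
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / n
```

Sweeping the number of clusters and plotting either index against it is one common way to pick the number of clusters for a fuzzy algorithm.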
Requirements:

The software will be implemented for use in the R statistics software (similar to Matlab functions, .m)

Winter Semester 2008/2009

Detection of Angry Callers in a Call Centre

Supervisor: A. Schmitt
Description: In this project, you will implement an emotion recognizer that is able to detect angry caller utterances in call centre dialogues based on acoustic data. With that knowledge, we will be able to transfer such angry callers to human agents. The emotion recognizer will be based on RapidMiner, a machine learning and data mining tool implemented in Java. You will train classification models within RapidMiner based on several thousand utterances from a US American call centre (1-2 persons, can be combined with the gender and age topic).
Requirements: Interest in Machine Learning, Emotion Recognition, Dialogue Systems. Knowledge in Java and Machine Learning/Neural Networks advantageous

Gender and Age Detection of Callers

Supervisor: A. Schmitt
Description: Humans are generally able to determine the gender and rough age of a person they hear over the telephone. We want to train computers to determine the age and gender of callers as well, in order to adapt the dialogue strategy of voice computers. You will implement a gender and age recognizer based on acoustic data within the RapidMiner framework (1-2 persons, can be combined with the emotion recognition topic).
Requirements: Interest in Machine Learning, Emotion Recognition, Dialogue Systems. Knowledge in Java and Machine Learning/Neural Networks advantageous

Voice Control Simulator

Supervisor: T. Heinroth
Description: The aim is to implement a simulator for controlling lights, heating, and blinds by speech. The simulator should display a room or house with several devices. It should provide an interface to a command-and-control dialogue running on a speech server (asynchronous communication) and several Java Servlets.
Requirements: Good Flash/Java Servlet/VoiceXML skills.

Evaluating the number of clusters in a data set via fuzzy clustering methods

Supervisor: A. Albalate
Description: The objective of this project is the implementation and testing of formulations for assessing the quality of a cluster partition obtained from a clustering algorithm. Numerous schemes already exist for assessing hard clustering, but few of them have been applied to fuzzy clustering approaches. These do not only return a partition of the input objects but also a vector of memberships of each object in each cluster. The project consists of extending some of these traditional methods to assess the output of fuzzy clustering algorithms.
Requirements: The software will be implemented for use in the R statistics software (similar to Matlab functions, .m)

Using clustering and classification schemes for determining the optimum number of clusters

Supervisor: A. Albalate
Description: The objective of this project is similar to the previous one: determining the optimum number of groups in a data set without prior knowledge. To this end, clustering algorithms will be used in combination with secondary classification methods (e.g. neural networks, Bayes classifiers, etc.). The result of the clustering scheme provides implicit labels. The optimum can be discerned as the result which, according to these implicit labels, yields the best classification performance.
Requirements: Tools will be implemented either in Java and/or R (similar to matlab functions .m)

Implementation of the Pole-based overlapping clustering (PoBOC) algorithm in R

Supervisor: A. Albalate
Description: Pole-based overlapping clustering is a fuzzy clustering algorithm which provides, for each input object, a pattern of memberships in each output cluster. To date, this software is implemented in Java. The project consists of porting the software to the programming language of the R statistics package. The implementation will finally be tested on some existing data sets.
Requirements: Knowledge of the Java and R (similar to Matlab functions, .m) programming languages is desired

Hierarchical Implementation of Pole-based Overlapping Clustering (PoBOC)

Supervisor: A. Albalate
Description: Pole-based overlapping clustering is a fuzzy clustering algorithm which provides, for each input object, a pattern of memberships in each output cluster. The method is also capable of detecting the number of clusters and does not require the user to provide this information. However, the algorithm relies on an input parameter, the global average distance of all objects, and may therefore not provide accurate solutions if the input objects are organised in a hierarchy of distances. The student will be required to apply the basic software in iterative calls using local distances, in order to detect the underlying hierarchy of clusters. This project can be implemented in Java by one person, using the provided basic Java class, or may also be suitable for two students if combined with the previous project.
Requirements: Knowledge of Java and R (similar to Matlab functions, .m) is necessary

Summer Semester 2008

Implementation of a call-analysis tool in Java/MySQL

Supervisor: A. Schmitt
Description: This project's aim is to advance the development of our existing analysis tool for log data captured during conversations between callers and a computer system in a recent call center. The given data comprises 170,000 calls. The tool should enable the user to visualize and analyze specific calls. The milestones: a) get to know the existing tool and its source code, b) implement further features, c) bring in your own ideas.
Requirements: Good Java/SQL skills.

Linguistic emotion recognition

Supervisor: A. Schmitt
Description: This project's aim is to analyze transcriptions of calls between callers and a computer system deployed in a recent call center. The milestones: a) find out how emotionally callers behave on average during such conversations, b) define a bad-word list containing emotional and bad words that are typically contained in such conversations, c) implement a linguistic emotion recognizer based on keyword spotting.
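The keyword-spotting step of milestone c) can be sketched in a few lines (the word list below is a hypothetical placeholder, not the project's actual bad-word list):

```python
# Minimal keyword-spotting sketch: flag a transcribed utterance as
# "angry" if it contains at least `threshold` words from a bad-word list.

BAD_WORDS = {"stupid", "ridiculous", "damn", "operator", "agent"}

def is_angry(transcript, threshold=1):
    """True if the transcript contains enough bad-list words."""
    words = transcript.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?") in BAD_WORDS)
    return hits >= threshold
```

The real recognizer would be tuned on the call transcriptions, e.g. by choosing the word list and threshold from milestone a)'s statistics.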
Requirements: SQL skills, shell-scripting or programming skills

Implementation of an emotion recognizer using the MATLAB Neural network toolbox

Supervisor: A. Schmitt
Description: The aim of this project is the development of a simple emotion recognizer based on caller utterances captured from conversations between callers an a computer system. The recognizer should be able to determine "angry" and "non-angry" callers based on acoustic features. The milestones: a) assign emotions to some of the given utterances, b) determine the acoustic features required for emotion recognition, c) train a neural network with the MATLAB Neural Network Toolbox, d) evaluate the results.
Requirements: Matlab skills

Development of a DSR Front-End under Symbian OS

Supervisor: D. Zaykovskiy
Description: The provided ANSI C implementation of the acoustic front-end for a Distributed Speech Recognition (DSR) system has to be ported to the Symbian OS platform (1 person).
Requirements: Good C/C++ skills.

Implementation of multi-learning algorithms for semi-supervised utterance categorization

Supervisor: A. Albalate
Description: The aim of this work is to implement and analyse tools for semi-supervised categorisation of utterances in the framework of troubleshooting automated agents. Similar to bootstrapping, multi-learning co-training algorithms are used to automatically build a significant amount of training examples from only a small initial labeled set. These methods involve the parallel use of different types of classifiers, combined with a decision on the final categories according to certain "democratic" criteria (e.g., majority voting). Regarding the implementation, the contribution of the student(s) should be the combined use of classifiers available in existing data-mining packages with new scripts for computing the final category prediction. (1 person; upon request, the task can be extended for 2 people.)
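The "democratic" decision step is the small glue script the student would write; a minimal sketch (with hypothetical classifiers passed in as callables):

```python
# Sketch: combine the predictions of several independently trained
# classifiers by majority voting to obtain the final category.
from collections import Counter

def majority_vote(classifiers, x):
    """Return the category predicted by most of the classifiers for x."""
    preds = [clf(x) for clf in classifiers]
    return Counter(preds).most_common(1)[0][0]
```

In practice the callables would wrap classifiers from an existing data-mining package, and the voted labels would feed the next co-training round as new training examples.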
Requirements: Good Java/C/C++ programming skills. Interest in the machine-learning and text processing fields.

Enlarging training sets for speech utterance categorization by means of "clustering-labeling"

Supervisor: A. Albalate
Description: The aim of this project is to implement speech utterance classification with a so-called "clustering-labeling" approach. First, agglomerative clustering methods (complete-link, single-link, average-link and centroid) are used to extract the natural groups, or clusters, in an utterance corpus. In a succeeding cluster tagging step, a small set of labels is used to map each cluster to one of the predefined categories. Thereby, all members of the clusters become implicitly labeled, and the resulting data can be used to train supervised classifiers. (1 person)
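The clustering step can be sketched as follows (toy 2-D points instead of real utterance vectors; single-link is shown, the other linkages differ only in the inter-cluster distance):

```python
# Sketch: single-link agglomerative clustering, merging the closest
# pair of clusters until k clusters remain.

def single_link(points, k):
    """Return k clusters, each a list of indices into `points`."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Single link: squared distance of the closest pair across clusters.
        return min(sum((points[i][d] - points[j][d]) ** 2
                       for d in range(len(points[i])))
                   for i in a for j in b)

    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters
```

The succeeding tagging step then maps each resulting cluster to a category using the few available labels, so that all cluster members inherit that label.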
Requirements:

Good Java/C/C++ programming skills. Interest in the machine-learning and text processing fields.

Winter and Summer Term

Lab: by appointment

Language

German / English

Further Information

Hours per Week / ECTS-Credits:

- Project - Dialogue Systems for CT/IST/ET: 6 SWS / 8 ECTS (not graded)
LSF - ENGC 8012 Summer
LSF - ENGC 8012 Winter

- Dialogue Systems Project for Master MI/INF: 6 SWS / 8 ECTS
(not graded including WiSe 2015 - graded from SoSe 2016)

- Dialogue Systems Project for "Anwendungsfächer" MI: 4 SWS / 6 ECTS (graded)
LSF - CS8410 Winter
LSF - CS8410 Summer

- Projekt Dialogsysteme for IST: 6 SWS / 10 ECTS (not graded)