Neural networks for analysis of speech and prosody

The central goal of this subproject is to develop a neural-network-based system for robust recognition of continuous spoken computer commands in the context of human-computer interaction.

The idea is that an individual computer workstation adapts to the speech of its user. This should make the speech recognition system considerably more robust against background noise, perhaps even against background speech from other people. Given this context, the vocabulary will be limited to fewer than 200 words. The pragmatic content of the sentences will likewise be restricted to a small number of commands, queries, or statements.

We will use a combination of spectral features, neural networks (in particular radial basis function networks), and hidden Markov models for speech recognition and also for extracting prosodic information, which can be used to estimate the user's emotions.
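
As a rough illustration of the radial basis function component, the following Python sketch maps one spectral feature vector to class posteriors using Gaussian basis functions, a linear output layer, and a softmax. It is not the project's implementation: the function names, the shared width parameter, the softmax output, and the toy centers and weights are all assumptions made for this example. In a hybrid system, such per-frame posteriors could plausibly serve as emission scores for a hidden Markov model, but the text above does not specify how the components are coupled.

```python
import math

def rbf_activations(x, centers, width):
    """Gaussian RBF activation of each center for one feature vector x.

    `width` is a single shared standard deviation (an assumption;
    real systems often use one width per center).
    """
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centers
    ]

def rbf_posteriors(x, centers, width, weights):
    """Linear output layer on the RBF activations, followed by a softmax."""
    h = rbf_activations(x, centers, width)
    scores = [sum(w * a for w, a in zip(row, h)) for row in weights]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy setup (hypothetical values): two centers in a 2-D feature space,
# two output classes, each class tied mainly to one center.
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [[4.0, 0.0], [0.0, 4.0]]
frame = [0.1, -0.1]  # stands in for one frame of spectral features

post = rbf_posteriors(frame, centers, 0.5, weights)
```

Here `frame` lies close to the first center, so the first class receives the larger posterior. In practice the centers, widths, and output weights would be learned from the user's speech data rather than set by hand.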