Contents

The lecture provides an introduction to the field of multimodal spoken natural language dialogue systems. A particular focus is placed on acoustic processing, speech signal analysis, speech recognition, spoken natural language understanding, dialogue processing and speech synthesis. The topics will be illustrated through practical sessions and demonstrations of applications and products. Local companies working in the field of multimodal spoken natural language dialogue systems will provide guest lectures.

No prerequisites from other lectures are required. Some basic knowledge of digital signal processing, computer science, cybernetics and statistics would be helpful.

Topics

1. Human Communication.

  • Speech communication, structure and properties of speech, speech production, and speech perception.

2. Spoken Natural Language Dialogue Systems Overview.

  • Disciplines of speech processing, history, speech coding, speech synthesis, speech recognition, speaker identification/verification, semantic analysis, dialogue modelling.

3. Speech Synthesis.

  • Relationship between phonetics and written language, speech synthesis steps, phonetic inventory, speech signal production, speech synthesis (concatenation), linear prediction, prosody control.

4. Acoustic Processing.

  • Beamforming, spectral subtraction, noise reduction, echo compensation, GSM-coding, blind source separation.

5. Speech Recognition.

  • Overview of the most commonly used techniques in speech recognition, such as feature extraction from speech, statistical modelling of speech, search, and speaker adaptation.

6. Semantic Analysis and Dialogue Modelling.

  • Theory of formal languages, Chomsky hierarchy, word problem, finite automata, parsing, syntactic vs. semantic grammars, rule-based vs. statistical approaches to semantic analysis, dialogue modelling and application control.

7. Systems Evaluation, Applications and Products.

  • Evaluation of speech recognition and spoken language dialogue systems, overview of research projects, commercially available products and research prototypes.

Exercises and Practicals.

  • Development of spoken natural language dialogue systems, with a focus on parsing and VoiceXML-based dialogue management.