Dialogue Systems

The lecture provides an introduction to the area of multimodal spoken natural language dialogue systems. A particular focus is placed on acoustic processing, speech signal analysis, speech recognition, spoken natural language understanding, dialogue processing and speech synthesis. The topics will be illustrated through practical sessions and demonstrations of applications and products. Local companies working in the field of multimodal spoken natural language dialogue systems will provide guest lectures.

No prerequisites from other lectures are required. Some basic knowledge of digital signal processing, computer science, cybernetics and statistics would be helpful.

Topics

1. Human Communication.

Speech communication, structure and properties of speech, speech production, and speech perception.

2. Spoken Natural Language Dialogue Systems Overview.

Disciplines of speech processing, history, speech coding, speech synthesis, speech recognition, speaker identification/verification, semantic analysis, dialogue modelling.

3. Speech Synthesis.

Relationship between phonetics and written language, speech synthesis steps, phonetic inventory, speech signal production, speech synthesis (concatenation), linear prediction, prosody control.

4. Acoustic Processing.

Beamforming, spectral subtraction, noise reduction, echo compensation, GSM-coding, blind source separation.

5. Speech Recognition.

Overview of the most commonly used techniques in speech recognition, such as feature extraction from speech, statistical modelling of speech, search, and speaker adaptation.

6. Semantic Analysis and Dialogue Modelling.

Theory of formal languages, Chomsky hierarchy, word problem, finite automata, parsing, syntactic vs. semantic grammars, rule-based vs. statistical approaches to semantic analysis, dialogue modelling and application control.

7. Systems Evaluation, Applications and Products.

Evaluation of speech recognition and spoken language dialogue systems, overview of research projects, commercially available systems and research prototypes.

Exercises and Practicals.

Development of spoken natural language dialogue systems, with a focus on parsing and VoiceXML-based dialogue management.

References

J. Allen: Natural Language Understanding, The Benjamin/Cummings Publishing Company, Inc., 1988.
W. Minker, A. Waibel and J. Mariani: Stochastically-based semantic analysis, The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, Boston, 1999.
L.R. Rabiner and B.H. Juang: An introduction to Hidden Markov Models, IEEE ASSP Magazine, 3:1, pp. 4-16, 1986.
S.J. Young, P.C. Woodland and W. Byrne: Spontaneous Speech Recognition for the Credit Card Corpus Using the HTK Toolkit, IEEE Transactions on Speech and Audio Processing, 2:4, 1994.

Copies of Slides

Materials

Important News

Please check this site regularly for any last-minute changes and announcements!

Winter Term 2011/2012

Lecture:	-NA-
Exercise:	-NA-

Contact

Lecturers:
Prof. Dr. Dr.-Ing. Wolfgang Minker
Supervisors:
Dipl.-Inf. Florian Nothdurft
Dipl.-Inf. Stefan Ultes

Language

English

Requirements

Bachelor

Exams

Oral exam, Exercises and Practicals Certificate.

More Information

Hours per Week: 3 (lecture) + 2 (exercise)
7 ECTS Credits
LSF - ENGJ 7011

Registration

Online Registration