Statistical Modelling for User-Centered, Adaptive Spoken Dialog Systems

Author: Dipl.-Inform. Stefan Ultes

Status: in progress

Description:

Dialog strategies have long been handcrafted by dialog experts. Only within the last decade has research moved to data-driven methods, leading to statistical models. Still, most dialog systems rely solely on the spoken words and their semantics, although speech signals reveal much more about the speaker, e.g. their age, gender, and emotional state. Using this speaker state information along with the semantics is a promising way of improving the performance of dialog systems while making them more natural at the same time. Partially Observable Markov Decision Processes (POMDPs), a state-of-the-art statistical modeling method, offer an easy and unified way of integrating speaker state information into dialog systems.

Therefore, the subject of this thesis is twofold. The first part focuses on generating speaker state information using state-of-the-art machine learning techniques. The second part focuses on adapting the POMDP formalism to the needs arising from the additional speaker state information, which primarily involves finding suitable estimates for the probability models.
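
To illustrate how a POMDP can take speaker state information into account, the following is a minimal sketch of a standard belief update with two observation channels, one semantic and one acoustic. It is not the formalism developed in the thesis. The state names, the observation names, all probability values, and the assumption that the two channels are conditionally independent given the hidden state are hypothetical and chosen only for illustration.

```python
def belief_update(belief, transition, obs_models, observations):
    """One POMDP belief update for a fixed system action.

    belief:       dict mapping state -> probability
    transition:   dict mapping state -> dict of next_state -> probability
                  (for the system action that was just taken)
    obs_models:   list of dicts, each mapping next_state -> dict of
                  observation -> probability (one dict per channel)
    observations: list of observed values, one per channel
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: propagate the old belief through the transition model.
        predicted = sum(transition[s][s_next] * belief[s] for s in states)
        # Correction step: multiply the likelihoods of all channels.
        # ASSUMPTION: channels are conditionally independent given the state.
        likelihood = 1.0
        for model, obs in zip(obs_models, observations):
            likelihood *= model[s_next][obs]
        unnormalized[s_next] = likelihood * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observations have zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy model: the hidden state is the user's emotional state.
belief = {"calm": 0.5, "angry": 0.5}
transition = {
    "calm": {"calm": 0.9, "angry": 0.1},
    "angry": {"calm": 0.2, "angry": 0.8},
}
# Channel 1: semantics of the recognized utterance (illustrative values).
semantic_model = {
    "calm": {"confirm": 0.7, "reject": 0.3},
    "angry": {"confirm": 0.4, "reject": 0.6},
}
# Channel 2: speaker state estimated from the speech signal (illustrative values).
acoustic_model = {
    "calm": {"neutral": 0.8, "raised": 0.2},
    "angry": {"neutral": 0.3, "raised": 0.7},
}

updated = belief_update(
    belief, transition, [semantic_model, acoustic_model], ["reject", "raised"]
)
print(updated)
```

In this sketch, the acoustic channel enters the update as one more likelihood factor. The probability tables are exactly the kind of quantities for which suitable estimates would have to be found from data.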