This project addresses the recognition of user dispositions from spoken language. The focus is on the link between speech acoustics and contextual information, such as the course of the dialog, information about the user, and the user's situation. This work will be carried out in collaboration with projects A3, B1, B4, and C1.
To this end, work on databases of natural scenarios will be intensified in order to capture this contextual information. In cooperation with subprojects A3 and B4, linguistically detectable disposition categories are being derived. Furthermore, in close cooperation with A3 and C4, higher-level meanings, such as dialog termination, are being marked in the data.
For automatic detection, paralinguistic features are used in addition to the classic acoustic features. Particularly in dialog situations where acoustic features alone are not distinctive enough, paralinguistic features such as laughing, moaning, breathing, or pauses promise improved detection.
A fusion of acoustic and paralinguistic features will enable a more robust classification. The most telling cases may be those in which the two classifiers disagree. To achieve the best possible recognition performance, further adaptation methods are being used to enable an affective adaptation to the respective user.
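The fusion of the acoustic and paralinguistic classifiers, and the detection of cases in which they disagree, can be illustrated with a minimal sketch. It assumes a simple weighted late fusion of class posteriors, which is only one of several possible fusion schemes and is not necessarily the method used in the project. The function name `fuse_posteriors`, the weight parameter, and the disposition labels are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Tuple


def fuse_posteriors(
    acoustic: Dict[str, float],
    paralinguistic: Dict[str, float],
    weight_acoustic: float = 0.5,
) -> Tuple[str, Dict[str, float], bool]:
    """Weighted late fusion of two classifiers' class posteriors.

    Returns the fused label, the renormalized fused posteriors, and a
    flag that is True when the two classifiers prefer different labels.
    This is an illustrative assumption, not the project's actual method.
    """
    if set(acoustic) != set(paralinguistic):
        raise ValueError("both classifiers must score the same labels")
    if not 0.0 <= weight_acoustic <= 1.0:
        raise ValueError("weight_acoustic must lie in [0, 1]")

    # Linear combination of the two posterior distributions.
    fused = {
        label: weight_acoustic * acoustic[label]
        + (1.0 - weight_acoustic) * paralinguistic[label]
        for label in acoustic
    }

    # Renormalize so the fused scores sum to one.
    total = sum(fused.values())
    if total == 0.0:
        raise ValueError("fused scores sum to zero")
    fused = {label: score / total for label, score in fused.items()}

    # Disagreement: the classifiers' individual top labels differ.
    top_acoustic = max(acoustic, key=acoustic.get)
    top_paralinguistic = max(paralinguistic, key=paralinguistic.get)
    disagree = top_acoustic != top_paralinguistic

    return max(fused, key=fused.get), fused, disagree


# Hypothetical posteriors for a single utterance (labels are made up).
acoustic_scores = {"neutral": 0.6, "frustrated": 0.4}
paralinguistic_scores = {"neutral": 0.2, "frustrated": 0.8}
label, fused_scores, disagree = fuse_posteriors(acoustic_scores, paralinguistic_scores)
```

Utterances for which the disagreement flag is set could be collected separately, since these are the cases the text singles out as potentially most informative.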