Invited Talks

Looking at the Interaction Management with New Eyes - Conversational Synchrony and Cooperation using Eye Gaze

Kristiina Jokinen

University of Helsinki, Finland

 

Human conversations are surprisingly fluent in terms of the interlocutors' turn-taking and feedback behaviour. Many studies have demonstrated the accurate timing of utterances and pointed out how speakers synchronize and align their behaviour to produce smooth and efficient communication. Paralinguistic aspects such as gesturing, eye-gaze, and facial expressions provide especially important signals for interaction management: they allow interaction to be coordinated and controlled in an unobtrusive manner, while also displaying the interlocutor's attitudes and emotional state. In the context of Interaction Technology, realistic models of interaction and synchronization are equally important: the system is regarded as one of the participating agents, in particular in applications such as robot companions. The key concept in such interaction strategies is the notion of affordance: interaction should readily suggest to the user the appropriate ways to use the interface. The challenges for Interaction Technology thus concern not enabling interaction in the first place, but designing systems that support rich multimodal communication and human-technology interfacing that is more conversational in style.

This talk will explore various prerequisites and enabling conditions of communication, seen as a cooperative activity that emerges from the speakers' capability to synchronize their intentions. We address some of the main challenges related to the construction of shared knowledge and, most notably, focus on eye-gaze and its use in coordinating interaction: providing feedback and taking turns. We also discuss issues related to collecting and analyzing eye-tracking data in natural human-human conversations, and present preliminary experiments on the role of eye-gaze in interaction management. The discussion is also extended to other paralinguistic aspects of communication: in multiparty dialogues, head movement and gesturing play a crucial role in signaling a person's intention to take, hold, or yield the turn.

 

Interacting with Purpose (and Feeling!): What Neuropsychology and the Performing Arts Can Tell Us About 'Real' Spoken Language Behaviour

Roger K. Moore

Department of Computer Science, University of Sheffield, UK

 

Recent years have seen considerable progress in both the technical capabilities and the market penetration of spoken language dialogue systems. Performance has clearly passed a threshold of usability, triggering the mass deployment of effective interactive voice response systems, mostly based on the now firmly established VoiceXML (VXML) standard. In the research laboratories, next-generation spoken language dialogue systems are being investigated which employ statistical modelling techniques (such as POMDPs) to handle uncertainty, and paralinguistic behaviours (such as back-channeling and emotion) to provide more 'natural' voice-based interaction between humans and artificial agents. All of these developments suggest that the field is moving in a positive direction, but to what extent is it simply accumulating a battery of successful short-term engineering solutions, as opposed to developing an underlying long-term theory of vocal interaction?

This talk will attempt to address this issue by drawing attention to results in research fields that are quite distinct from speech technology, but which may offer useful insights into potential generic principles of human (and even animal) behaviour. In particular, inspiration will be drawn from psychology, the neurosciences, and even the performing arts, and a common theme will emerge that focuses on the need to model the drives behind communicative behaviour, their emergent consequences, and the appropriate characterisation of advanced communicative agents (such as robots). It will be concluded that future developments in spoken language dialogue systems stand to benefit greatly from such a transdisciplinary approach, and that fields outside speech technology will in turn benefit from the empirical grounding provided by practical engineered solutions.