Toward User-Aware Interactive Virtual Agents: Generative Multi-Modal Agent Behaviors in VR
Bhasura Gunawardhana, Yunxiang Zhang, Qi Sun, Zhigang Deng
IEEE ISMAR 2024
Abstract
Virtual agents serve as a vital interface within XR platforms. However, virtual agent behavior generation typically relies on pre-coded actions or physics-based reactions. In this paper, we present a learning-based, multi-modal agent behavior generation framework that adapts to users’ in-situ behaviors, similar to how humans interact with each other in the real world. By leveraging an in-house collected, dyadic conversational behavior dataset, we trained a conditional variational autoencoder (CVAE) model to achieve user-conditioned generation of virtual agents’ behaviors. Together with large language models (LLMs), our approach can generate both the verbal and non-verbal reactive behaviors of virtual agents. Our comparative user study confirmed our method’s superiority over conventional animation graph-based baseline techniques, particularly regarding user-centric criteria. Thorough analyses of our results underscored the authentic nature of our virtual agents’ interactions and the heightened user engagement during VR interaction.
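For readers unfamiliar with user-conditioned generation, the sketch below illustrates the general idea of a conditional VAE whose decoder is conditioned on features of the user's in-situ behavior. It is a minimal PyTorch example with assumed feature dimensions and hypothetical names (UserConditionedCVAE, cvae_loss); it is not the paper's released implementation.

```python
# Illustrative sketch of a user-conditioned CVAE for agent behavior generation.
# Dimensions, module names, and the loss weighting are assumptions for exposition.
import torch
import torch.nn as nn

class UserConditionedCVAE(nn.Module):
    def __init__(self, behavior_dim=128, user_dim=128, latent_dim=32, hidden_dim=256):
        super().__init__()
        # Encoder: agent behavior + user-behavior condition -> latent mean and log-variance.
        self.encoder = nn.Sequential(
            nn.Linear(behavior_dim + user_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim),
        )
        # Decoder: latent sample + user-behavior condition -> agent behavior features.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + user_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, behavior_dim),
        )
        self.latent_dim = latent_dim

    def forward(self, agent_behavior, user_behavior):
        stats = self.encoder(torch.cat([agent_behavior, user_behavior], dim=-1))
        mu, log_var = stats.chunk(2, dim=-1)
        # Reparameterization trick: sample z while keeping gradients through mu/log_var.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)
        recon = self.decoder(torch.cat([z, user_behavior], dim=-1))
        return recon, mu, log_var

    @torch.no_grad()
    def generate(self, user_behavior):
        # At inference, sample the latent prior and condition only on the user's behavior.
        z = torch.randn(user_behavior.shape[0], self.latent_dim, device=user_behavior.device)
        return self.decoder(torch.cat([z, user_behavior], dim=-1))

def cvae_loss(recon, target, mu, log_var, kl_weight=1e-3):
    # Standard CVAE objective: reconstruction term plus KL divergence to the prior.
    recon_loss = nn.functional.mse_loss(recon, target)
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return recon_loss + kl_weight * kl
```

In this kind of setup, the non-verbal channel (e.g., pose or gesture features) would be generated by the conditional decoder, while verbal responses come from an LLM; how the two are synchronized is specific to the paper and not captured by this sketch.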
Bibtex
@inproceedings{gunawardhana2024toward,
title={Toward User-Aware Interactive Virtual Agents: Generative Multi-Modal Agent Behaviors in VR},
author={Gunawardhana, Bhasura S and Zhang, Yunxiang and Sun, Qi and Deng, Zhigang},
booktitle={2024 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)},
pages={1068--1077},
year={2024},
organization={IEEE}
}


