HEADSET Dataset

Headset Removal for Gaze Contact in XR Applications

We are sharing the HEADSET database, a new multimodal database that consists of colored 3D point clouds, textured 3D meshes, light field (LF) images, and multi-view RGB-D images.

27 participants with posed facial expressions and 11 participants wearing HMDs

Data includes textured meshes, point clouds, multi-view RGB-D data, and light field data

Volumetric capture studio includes 62 RGB cameras and 31 depth cameras

Abstract

The volumetric representation of human interactions is fundamental to the development of immersive media productions and telecommunication applications. Particularly given the rapid advancement of Extended Reality (XR) applications, volumetric data has proven to be essential for future XR development. In this work, we present a new multimodal database to help advance the development of immersive technologies. Our proposed database provides ethically compliant and diverse volumetric data, in particular 27 participants displaying posed facial expressions and subtle body movements while speaking, plus 11 participants wearing head-mounted displays (HMDs). The recording system consists of a volumetric capture (VoCap) studio comprising 31 synchronized modules with 62 RGB cameras and 31 depth cameras. In addition to textured meshes, point clouds, and multi-view RGB-D data, we simultaneously capture light field (LF) data with one Lytro Illum camera. Finally, we evaluate the use of our dataset on the tasks of facial expression classification, HMD removal, and point cloud reconstruction. The dataset can be helpful in the evaluation and performance testing of various XR algorithms, including but not limited to facial expression recognition and reconstruction, facial reenactment, and volumetric video. HEADSET, all of its associated raw data, and the license agreement will be made publicly available for research purposes.

Acknowledgement

This project has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 956770. The data collection was carried out with the support of the Centre for Immersive Visual Technologies (CIVIT) research infrastructure, Tampere University, Finland. We especially thank Jani Käpylä for his help during the capture sessions.

Publications

List of relevant publications

Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs

Fatemeh Ghorbani Lohesara, Karen Eguiazarian, Sebastian Knorr
IS&T Electronic Imaging, Image Processing: Algorithms and Systems XXII (2024)

Our study introduces a network designed for expression-based video inpainting, employing generative adversarial networks (GANs) to handle static and moving occlusions across video frames. By utilizing facial landmarks and an occlusion-free reference image, our model maintains the user's identity consistently across frames.

Expression-aware video inpainting for HMD removal in XR applications

Fatemeh Ghorbani Lohesara, Karen Eguiazarian, Sebastian Knorr
20th ACM SIGGRAPH Conference on Visual Media Production (CVMP 2023)

In this study, we propose a new network for expression-aware video inpainting for HMD removal (EVI-HRnet) based on generative adversarial networks (GANs). Our model effectively fills in missing information, guided by facial landmarks and a single occlusion-free reference image of the user.

HEADSET: Human Emotion Awareness under Partial Occlusions Multimodal DataSET

Fatemeh Ghorbani Lohesara, Davi Rabbouni Freitas, Christine Guillemot, Karen Eguiazarian, Sebastian Knorr
IEEE Transactions on Visualization and Computer Graphics (2023)

In this work, we present a new multimodal database to help advance the development of immersive technologies. Our proposed database provides ethically compliant and diverse volumetric data, in particular 27 participants displaying posed facial expressions and subtle body movements while speaking, plus 11 participants wearing head-mounted displays (HMDs).
