Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models

1Idiap Research Institute, 2UNIL

Preprint.
Realism transferred images

The image above shows the realism transferred images generated using our proposed method. The source images in the first row are from DigiFace dataset and the transformed images with our approach are shown in the second row.

Summary

The accuracy of face recognition systems has improved significantly in the past few years, thanks to the large amount of data collected and advances in neural network architectures. However, these large-scale datasets are often collected without explicit consent, raising ethical and privacy concerns. To address this, there have been proposals to use synthetic datasets for training face recognition models. Yet most such datasets still rely on real data to train their generative models, and recognition models trained on them generally perform worse than those trained on real datasets. One of these datasets, DigiFace, uses a graphics pipeline to generate different identities and different intra-class variations without using real data to train the generator. However, the performance of this approach is poor on face recognition benchmarks, possibly due to the lack of realism in the images generated from the graphics pipeline. In this work, we introduce a novel framework for realism transfer aimed at enhancing the realism of synthetically generated face images. Our method leverages a large-scale face foundation model, whose pipeline we adapt for realism enhancement. By integrating the controllable aspects of the graphics pipeline with our realism enhancement technique, we generate a large number of realistic variations, combining the advantages of both approaches. Our empirical evaluations demonstrate that face recognition models trained on our enhanced dataset perform significantly better than the baseline. The source code and datasets will be made available publicly.

Proposed pipeline

The key insight of our approach is to reuse procedurally generated identities from a graphics pipeline and enhance their realism to reduce the domain gap. DigiFace1M provides an elaborate pipeline for generating synthetic identities and their variations, allowing us to obtain a large number of identities from this dataset. Additionally, we generate variations by interpolating between multiple images of an identity within the embedding space. Using a pre-trained foundation model, specifically the Arc2Face model, we synthesize identity-consistent images from these interpolated embeddings. We further enhance the realism of the generated images by modifying the intermediate CLIP space. The resulting dataset contains diverse intra-class variations and is suitable for training a face recognition model.

Digi2Real Pipeline

The pipeline for realism enhancement: our approach introduces intra-class variations with SLERP sampling, reduces the domain gap in the CLIP space, and generates realistic images.
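
The SLERP sampling step can be illustrated with a minimal sketch. It assumes each identity comes with a few face embeddings and draws new ones by spherical linear interpolation between random pairs. The function names, the 512-dimensional random placeholder embeddings, the random pair selection, and the uniform interpolation factor are all assumptions made for illustration; the actual sampling scheme used to build the dataset may differ.

```python
import numpy as np

def slerp(a, b, t):
    """Spherical linear interpolation between two embeddings at fraction t in [0, 1]."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    # Angle between the two unit vectors
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < 1e-6:
        # Vectors are (anti)parallel: interpolation is ill-defined, fall back to a
        return a
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / sin_omega

def sample_variations(embeddings, n_samples, rng):
    """Hypothetical helper: SLERP between random pairs of one identity's embeddings."""
    samples = []
    for _ in range(n_samples):
        i, j = rng.choice(len(embeddings), size=2, replace=False)
        t = rng.uniform(0.0, 1.0)  # assumed: uniform interpolation factor
        samples.append(slerp(embeddings[i], embeddings[j], t))
    return np.stack(samples)

rng = np.random.default_rng(0)
# Placeholder data: 5 embeddings of one identity (real ones would come from a face encoder)
identity_embeddings = rng.normal(size=(5, 512))
variations = sample_variations(identity_embeddings, n_samples=8, rng=rng)
print(variations.shape)
```

Unlike plain linear interpolation, SLERP keeps every sample on the unit hypersphere, which matters when the downstream generator is conditioned on normalized identity embeddings.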


Face Recognition Performance with Digi2Real

Face recognition performance with the Digi2Real dataset improves significantly over DigiFace and exceeds that of many other synthetic datasets.

Digi2Real performance


Face Recognition Performance with Digi2Real on IJB-B and IJB-C

The Digi2Real dataset outperforms many alternatives, with particularly notable improvements on the IJB-B and IJB-C benchmarks, where it ranks second only to the DCFace dataset among the synthetic datasets. While performance on high-quality datasets like LFW remains comparable to other methods, our approach significantly outperforms many synthetic datasets on IJB-B and IJB-C, achieving verification rates of 70.14% and 75.80%, respectively.

Digi2Real performance

Performance of Digi2Real dataset on IJB-B and IJB-C benchmarks
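
Verification rates on IJB-B and IJB-C are commonly reported as a true accept rate (TAR) at a fixed false accept rate (FAR). The sketch below shows how such a number can be computed from similarity scores. The function name, the synthetic score distributions, and the FAR value of 1e-3 are assumptions for illustration only; they are not the benchmark protocol or the operating point behind the figures above.

```python
import numpy as np

def tar_at_far(genuine_scores, impostor_scores, far):
    """Return (TAR, threshold) such that at most a `far` fraction of impostor scores is accepted."""
    impostor_sorted = np.sort(np.asarray(impostor_scores))[::-1]  # descending
    # Number of impostor comparisons allowed to pass at this FAR
    k = int(np.floor(far * len(impostor_sorted)))
    # Accept scores strictly above the (k+1)-th highest impostor score
    threshold = impostor_sorted[k] if k < len(impostor_sorted) else -np.inf
    tar = float(np.mean(np.asarray(genuine_scores) > threshold))
    return tar, float(threshold)

rng = np.random.default_rng(0)
# Synthetic cosine-similarity-like scores, for illustration only
genuine = rng.normal(loc=0.6, scale=0.15, size=1_000)
impostor = rng.normal(loc=0.1, scale=0.15, size=100_000)

tar, thr = tar_at_far(genuine, impostor, far=1e-3)  # assumed operating point
print(f"TAR = {tar:.4f} at threshold {thr:.4f}")
```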


Dataset Availability

New 🚀: The dataset is now available at the following link: https://www.idiap.ch/en/scientific-research/data/digi2real

BibTeX

@article{george2024digi2real,
  title={Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models},
  author={Anjith George and Sebastien Marcel},
  year={2024},
  eprint={2411.02188},
  url={https://arxiv.org/abs/2411.02188},
}