CelebA-Spoof, A Large Dataset For Preventing Face Spoofing

Face Recognition 14/10/2020

3 main points
✔️ Proposed "CelebA-Spoof", a large-scale dataset for face spoofing prevention, annotated with 43 attributes
✔️ The multi-task framework AENet was used to examine the impact of attribute information on the task of preventing face spoofing.
✔️ Three generic benchmarks were proposed to support a comprehensive assessment.

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations
written by Yuanhan Zhang, Zhenfei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, Ziwei Liu
(Submitted on 24 Jul 2020 (v1), last revised 1 Aug 2020 (this version, v3))
Comments: Accepted at ECCV2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Overview

Facial recognition systems are increasingly used for identity verification and payment. At the same time, many people are concerned about the security and reliability of these systems. "Anti-spoofing", which distinguishes "live" faces from "spoofed" ones, is particularly important. If someone can forge your face, they can impersonate you or make payments without your knowledge. Face spoofing prevention is therefore an important technology for protecting your privacy and property from misuse by others.

Although many research results have been reported, especially in China, it remains difficult to handle the complexity of face spoofing techniques and to put countermeasures into practice. The main reason is that existing face anti-spoofing datasets lack both diversity and quantity. Specifically, they have the following problems:

1) Lack of diversity in the dataset
Existing datasets cover too few subjects, sessions, and sensor types. Most have fewer than 2,000 subjects, fewer than 5 sessions, and fewer than 10 input sensor types.

2) Lack of annotation of the dataset
Existing datasets annotate only the spoof type. This is basic information, but it is not sufficient for a face spoofing prevention dataset. No dataset in this field offers richer annotation.

3) Performance saturation
Classification accuracy on existing datasets is already saturated, which makes it increasingly difficult to evaluate algorithms with higher accuracy and better generalization. For example, when a vanilla ResNet-18 is evaluated on SiW and Oulu-NPU, recall at FPR = 0.5% already reaches 100.0% and 99.0%, respectively. However, high accuracy on these datasets does not mean that a model is good enough for practical use.
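
The metric quoted above, recall at a fixed false positive rate (FPR), can be computed as in the following minimal sketch. It assumes that "spoof" is the positive class, that each score is a spoof probability (higher means more likely spoof), and that the threshold is taken from the ranked live scores. The function name `recall_at_fpr` and the sample scores are hypothetical and are not taken from the paper.

```python
def recall_at_fpr(live_scores, spoof_scores, target_fpr):
    """Recall on spoof samples at a threshold whose FPR on live samples
    does not exceed target_fpr.

    Assumption: scores are spoof probabilities, and a sample is flagged
    as spoof when its score is strictly greater than the threshold.
    """
    if not live_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Number of live samples we may wrongly flag as spoof.
    # The small epsilon guards against floating-point round-down.
    max_false_positives = int(target_fpr * len(live_scores) + 1e-9)

    # Rank live scores from most to least spoof-like.
    ranked_live = sorted(live_scores, reverse=True)

    if max_false_positives >= len(ranked_live):
        threshold = float("-inf")  # every live sample may be flagged
    else:
        # At most max_false_positives live scores lie strictly above this.
        threshold = ranked_live[max_false_positives]

    detected = sum(1 for score in spoof_scores if score > threshold)
    return detected / len(spoof_scores)


# Hypothetical scores, for illustration only.
live = [0.1, 0.2, 0.3, 0.9]
spoof = [0.8, 0.95, 0.4, 0.97]
print(recall_at_fpr(live, spoof, 0.25))  # 1.0
print(recall_at_fpr(live, spoof, 0.0))   # 0.5
```

With a budget as strict as FPR = 0.5%, a recall close to 100% leaves almost no room to separate a better model from a worse one, which is the saturation problem described above.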

To solve these problems, this paper builds a new large dataset, CelebA-Spoof, for the task of preventing face impersonation. CelebA-Spoof has the following features:

1) Diversity of the dataset
Whereas most existing datasets have fewer than 2,000 subjects, fewer than 5 sessions, and fewer than 10 input sensor types, CelebA-Spoof is built to cover a wider variety of subjects, sessions, and sensors.

2) Rich annotations
Each image is annotated with 43 attributes: the 40 facial attributes defined in CelebA, plus three additional attributes covering the type of face impersonation, the lighting conditions, and the shooting location. These rich annotations allow face impersonation prevention to be examined comprehensively.

3) Largest amount of data
It consists of 625,537 images from 10,177 subjects. This is the largest dataset for face spoofing prevention.
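
To make the annotation layout concrete, the 43 attributes described above (40 CelebA facial attributes plus three spoof-related ones) could be held in a record like the one below. This is only an illustrative sketch: the class name, the field names, and the example values are hypothetical and do not reflect the dataset's actual file format or label vocabulary.

```python
from dataclasses import dataclass
from typing import List

NUM_FACE_ATTRIBUTES = 40  # facial attributes inherited from CelebA


@dataclass
class SpoofAnnotation:
    """One image's annotation (hypothetical layout, not the official schema)."""

    face_attributes: List[int]  # 40 binary flags, one per CelebA attribute
    spoof_type: str             # type of face impersonation
    illumination: str           # lighting conditions
    environment: str            # location of shooting

    def __post_init__(self):
        if len(self.face_attributes) != NUM_FACE_ATTRIBUTES:
            raise ValueError(
                f"expected {NUM_FACE_ATTRIBUTES} face attributes, "
                f"got {len(self.face_attributes)}"
            )

    def num_attributes(self) -> int:
        # 40 facial attributes plus the 3 additional spoof-related ones.
        return len(self.face_attributes) + 3


# Placeholder values, for illustration only.
example = SpoofAnnotation(
    face_attributes=[0] * NUM_FACE_ATTRIBUTES,
    spoof_type="placeholder-spoof-type",
    illumination="placeholder-illumination",
    environment="placeholder-environment",
)
print(example.num_attributes())  # 43
```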

The paper also uses AENet, a multi-task framework, to examine the impact of annotation information on face spoofing prevention, and provides benchmarks.
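
As a rough illustration of what a multi-task setup means here, the sketch below combines a main live/spoof classification loss with auxiliary losses for attribute predictions. It is a generic weighted-sum formulation and not AENet's actual loss: the function names, the task names, and the `aux_weight` value are all assumptions made for this example.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class (probabilities given)."""
    return -math.log(max(probs[target_index], 1e-12))


def multitask_loss(main_probs, main_label, aux_probs, aux_labels, aux_weight=0.1):
    """Main live/spoof loss plus a weighted sum of auxiliary attribute losses.

    aux_probs and aux_labels are dicts keyed by a hypothetical task name,
    e.g. "spoof_type" or "illumination". aux_weight is an assumed value.
    """
    main = cross_entropy(main_probs, main_label)
    auxiliary = sum(
        cross_entropy(aux_probs[task], aux_labels[task]) for task in aux_probs
    )
    return main + aux_weight * auxiliary


# Hypothetical predictions for one image.
loss = multitask_loss(
    main_probs=[0.5, 0.5],  # [P(live), P(spoof)]
    main_label=1,
    aux_probs={"spoof_type": [0.25, 0.75]},
    aux_labels={"spoof_type": 1},
)
print(loss)
```

Setting `aux_weight` to zero reduces this to a plain live/spoof classifier, which is one simple way to compare training with and without the attribute information.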


Takumu: I have worked as a Project Manager/Product Manager and Researcher at internet advertising companies (DSP, DMP, etc.) and machine learning startups. Currently, I am a Product Manager for new business at an IT company. I also plan services utilizing data and machine learning, and conduct seminars related to machine learning and mathematics.
