

"DeepPrivacy2" For Privacy And Anonymizing The Person In The Image

Face Recognition

3 main points
✔️ Higher accuracy for face anonymization compared to DeepPrivacy
✔️ Introducing not only face anonymization, as in DeepPrivacy, but also whole-body anonymization
✔️ Introducing FDH, a new large dataset useful for whole-body anonymization

DeepPrivacy2: Towards Realistic Full-Body Anonymization
written by Håkon Hukkelås, Frank Lindseth
(Submitted on 17 Nov 2022)
Comments: Accepted at WACV2023
Subjects: Computer Vision and Pattern Recognition (cs.CV)


The images used in this article are from the paper, the introductory slides, or were created based on them.


Images are collected and stored for a variety of purposes. However, images may contain privacy-sensitive subjects, so care must be taken. In recent years, privacy laws such as the EU's GDPR have been enacted in many countries, making it more difficult to collect data without anonymization or consent. This trend is growing stronger every year.

Currently, blurring and similar methods are widely used to anonymize images. However, these methods severely distort the data, making it unusable in downstream applications. In recent years, a growing number of anonymization methods have used rapidly advancing generative models to generate contextually realistic faces. However, face anonymization alone cannot prevent individuals from being identified by cues other than the face, such as their ears, gait, or gender. To overcome this problem, research on whole-body anonymization is also underway.

In this paper, the authors propose DeepPrivacy2, which extends DeepPrivacy, their earlier face anonymization method, by improving face anonymization performance and adding whole-body anonymization.

In particular, to support the newly added whole-body anonymization, the paper introduces a large dataset called Flickr Diverse Humans (FDH). It contains approximately 1.55 million images of people in a variety of contexts, extracted from the YFCC100M dataset, and contributes significantly to the visual quality of the human figures generated in the paper.

Furthermore, the paper proposes an ensemble framework for person detection and anonymization. Detected persons fall into three categories: (1) persons detected with CSE (dense pose estimation), (2) persons detected without CSE, and (3) the remaining persons, detected only by their faces. This allows the people in an image to be anonymized in detail with few omissions.

Whole-body anonymization methods have been proposed before, such as Surface Guided GANs (SG-GANs). By training on a large dataset and introducing an ensemble of detection and synthesis (anonymization) networks, DeepPrivacy2 outperforms them in both image quality and anonymization assurance.

FDH (Flickr Diverse Humans) dataset

The FDH (Flickr Diverse Humans) dataset consists of approximately 1.55 million human images taken from the YFCC100M (Yahoo Flickr Creative Commons 100 Million) dataset. The figure below is a sample of the dataset.

Each image contains one person, annotated with pixel-level dense pose estimation obtained from Continuous Surface Embeddings (CSE), 17 key points obtained from Mask R-CNN, and a segmentation mask. The segmentation mask is the intersection of the masks obtained from CSE and Mask R-CNN. Each image has a resolution of 288 x 160, and the dataset is split into 1,524,845 images for training and 30,000 images for validation.
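As a rough illustration of the mask rule and the split sizes described above, here is a minimal sketch in plain Python. The `Mask` type and the `intersect_masks` helper are hypothetical names chosen for this example and are not taken from the DeepPrivacy2 code; masks are modelled as small boolean grids rather than real 288 x 160 images.

```python
from typing import List

Mask = List[List[bool]]  # rows of per-pixel flags: True where a person is present

# Split sizes as reported in the article.
TRAIN_IMAGES = 1_524_845
VAL_IMAGES = 30_000
TOTAL_IMAGES = TRAIN_IMAGES + VAL_IMAGES  # roughly 1.55 million


def intersect_masks(cse_mask: Mask, rcnn_mask: Mask) -> Mask:
    """Keep only the pixels that both masks mark as belonging to the person."""
    return [
        [a and b for a, b in zip(cse_row, rcnn_row)]
        for cse_row, rcnn_row in zip(cse_mask, rcnn_mask)
    ]
```

A pixel therefore survives in the dataset mask only when both CSE and Mask R-CNN agree on it, which favors precision over coverage.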

FDH is composed of a far larger and more diverse set of in-the-wild images than the datasets previously used to synthesize (and anonymize) whole bodies (e.g., DeepFashion). It is also less selective than typical datasets for generative models and contains many images of people with unusual poses, perspectives, lighting conditions, and contexts.

Anonymization process for DeepPrivacy2

Here we describe the DeepPrivacy2 process, which uses an ensemble of three detection and synthesis networks to detect and anonymize persons in images (see figure below).

First, detection ensures that every person in the image is found. This is the "Detection" step shown in the figure. Three detectors are used: (1) detection of people using dense pose estimation (with CSE), (2) detection using instance segmentation with Mask R-CNN (without CSE), and (3) detection of faces by DSFD for the remaining people. In the center image of the figure, the colored persons are detected by (1), the persons without color by (2), and the faces marked with a red box by (3).
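The three-way split can be sketched as a simple cascade. This is only an illustration under stated assumptions: each detection is modelled as a set of pixel coordinates, a CSE detection is paired with a Mask R-CNN detection when their IoU reaches a hypothetical threshold `match_iou`, and a face counts as "remaining" when it overlaps no body detection. The names `box`, `iou`, and `ensemble_detect` are invented for this sketch and do not come from the DeepPrivacy2 code, and the matching rule is an assumption, not the paper's.

```python
from typing import Dict, FrozenSet, List, Tuple

Pixels = FrozenSet[Tuple[int, int]]  # a detection, as the set of pixels it covers


def box(x0: int, y0: int, x1: int, y1: int) -> Pixels:
    """Hypothetical helper: a rectangular detection covering [x0, x1) x [y0, y1)."""
    return frozenset((x, y) for x in range(x0, x1) for y in range(y0, y1))


def iou(a: Pixels, b: Pixels) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def ensemble_detect(
    cse: List[Pixels],
    rcnn: List[Pixels],
    faces: List[Pixels],
    match_iou: float = 0.5,  # assumed matching threshold
) -> Dict[str, List[Pixels]]:
    with_cse: List[Pixels] = []
    unmatched = list(rcnn)
    for cse_mask in cse:
        # Pair the CSE detection with its best-overlapping Mask R-CNN detection.
        best = max(unmatched, key=lambda m: iou(cse_mask, m), default=None)
        if best is not None and iou(cse_mask, best) >= match_iou:
            unmatched.remove(best)
            # Merge both masks so the coverage test below sees the whole person.
            with_cse.append(cse_mask | best)
        else:
            with_cse.append(cse_mask)
    # Faces are kept only for people that no body detector found.
    covered = frozenset().union(*with_cse, *unmatched)
    face_only = [f for f in faces if not (f & covered)]
    return {"with_cse": with_cse, "without_cse": unmatched, "face_only": face_only}
```

Each later detector only contributes people the earlier ones missed, which is what keeps omissions low without anonymizing the same person twice.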

For each detected person, the region to anonymize is the union of (1) the mask from dense pose estimation (with CSE) and (2) the mask from instance segmentation with Mask R-CNN (without CSE), so that accessories and hair are also anonymized. This addresses a problem of Surface Guided GANs (SG-GANs), an earlier whole-body anonymization method, in which accessories and hair were left unanonymized, degrading image quality. As can be seen in the figure below, hair and ears appear unnaturally attached in the center image. Although dense pose estimation (CSE) is not mandatory from a privacy perspective, it significantly improves the quality of the synthesized image.

In addition, DeepPrivacy2 misses far fewer persons than SG-GANs, since Mask R-CNN and DSFD are used even when detection by CSE fails.

Three independently trained generators are then applied to the detection results: two Full-Body Anonymization Generators, one applied to persons detected with CSE and one to persons detected without CSE, and one Face Anonymization Generator. These are the "CSE Guided Full-Body Generator," "Unconditional Full-Body Generator," and "Face Generator" steps shown in the figure.
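The routing from detection type to generator can be written down in a few lines. The sketch below is illustrative only: the `Detection` dataclass and `select_generator` function are hypothetical, and the generators are represented by their names from the figure rather than by real models.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    is_full_body: bool  # False when only the face was detected (DSFD)
    has_cse: bool       # True when dense pose (CSE) is available


def select_generator(det: Detection) -> str:
    """Return the name of the generator that should anonymize this detection."""
    if not det.is_full_body:
        return "Face Generator"
    if det.has_cse:
        return "CSE Guided Full-Body Generator"
    return "Unconditional Full-Body Generator"
```

For example, `select_generator(Detection(is_full_body=True, has_cse=False))` picks the unconditional generator, which is the fallback for bodies that CSE failed to describe.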

In contrast to conventional generators for face anonymization, the paper proposes a face generator that does not use key points for synthesis (anonymization). This improves detection recall in cases where key points are difficult to detect. The training dataset is an updated version of the FDF dataset.

Finally, the synthesized (anonymized) persons are stitched into the original image. This is the "Image Stitching" step shown in the figure. Full-body anonymization differs from face anonymization in that detections overlap frequently. If these overlaps are not handled correctly, they produce visually unnatural artifacts at the boundaries between persons.

In this paper, persons are stitched recursively in ascending order of the number of pixels each one covers. Recursive stitching assumes that the synthesis method handles overlap artifacts when generating each person. The ordering assumes that a foreground person covers a larger area, so foreground persons are stitched last. In the reverse order (foreground first), a background person can "overwrite" a foreground person wherever their detections overlap. This simple ordering greatly reduces visual artifacts at the boundaries between persons.
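The effect of the ordering is easy to reproduce on a toy example. The following is a minimal sketch, assuming an image is a dictionary from pixel coordinates to values and that a hypothetical `synthesize` callback returns new values for one person's mask given the image stitched so far; none of these names come from the DeepPrivacy2 code.

```python
from typing import Callable, Dict, List, Set, Tuple

Pixel = Tuple[int, int]
Image = Dict[Pixel, str]  # pixel -> value (a label stands in for a colour)


def stitch(
    image: Image,
    masks: List[Set[Pixel]],
    synthesize: Callable[[Image, Set[Pixel], int], Image],
    ascending: bool = True,
) -> Image:
    """Anonymize persons one at a time, smallest mask first by default."""
    result = dict(image)
    order = sorted(range(len(masks)), key=lambda i: len(masks[i]), reverse=not ascending)
    for i in order:
        # Recursive step: the generator sees the image as stitched so far.
        patch = synthesize(result, masks[i], i)
        for pixel in masks[i]:
            result[pixel] = patch[pixel]
    return result


def label_person(image: Image, mask: Set[Pixel], index: int) -> Image:
    """Stand-in generator: fills the mask with a label identifying the person."""
    return {pixel: f"person{index}" for pixel in mask}
```

With a large foreground mask and a small background mask that share a pixel, ascending order leaves the shared pixel with the foreground person, while descending order lets the background person overwrite it.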

The figure below compares the anonymization results of the image stitching method described in this paper under descending and ascending ordering. With ascending ordering, image quality improves at the detection boundaries (e.g., the areas marked in red).


The quality of DeepPrivacy2's synthesis is then evaluated. Since there is no standard baseline for comparing anonymization methods, face anonymization is compared with the widely adopted DeepPrivacy, and whole-body anonymization with Surface Guided GANs (SG-GANs). The whole-body anonymization generators are trained on the FDH dataset and the face anonymization generator on FDF256, an updated version of the FDF dataset. In addition, Market1501, Cityscapes, and COCO are used as evaluation data.

The figures below show the results of synthesis (anonymization) on FDH by the whole-body anonymization generators. (a) shows the original image and its detection result. (b) is the result of synthesis (anonymization) using the "Unconditional Full-Body Generator" without CSE. (c) to (e) are the results of synthesis (anonymization) using the "CSE Guided Full-Body Generator" with CSE.

We see that DeepPrivacy2 generates high-quality figures for a variety of background contexts, poses, and overlaps. We also see that CSE is necessary for high-quality anonymization; the Unconditional Full-Body Generator, which does not use CSE, has somewhat unnatural arms and legs.

The figure below shows the results of synthesis (anonymization) with FDF256 using the Face Anonymization Generator. The first row shows the original image and the result of face detection on it; the second to fifth rows show the faces synthesized (anonymized) with DeepPrivacy2.

Here we compare the performance of DeepPrivacy2's face anonymization generator with that of DeepPrivacy. DeepPrivacy2 uses FDF256, an updated FDF with an image resolution of 256 x 256, whereas DeepPrivacy uses FDF at a lower resolution of 128 x 128, so the two cannot be compared directly. Therefore, for this comparison the generator was retrained on FDF (128 x 128). As a result, DeepPrivacy2 achieved FID=0.56, a clear improvement over DeepPrivacy's FID=0.68. Note that FID (Fréchet Inception Distance) measures how close generated images are to real images; the smaller the value, the better.
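For reference, FID is commonly defined as the Fréchet distance between two Gaussians fitted to Inception network features of real and generated images. The symbols below are the usual textbook notation, not notation taken from the paper: \mu_r and \Sigma_r are the mean and covariance of the features of real images, and \mu_g and \Sigma_g those of generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

The value is zero when the two feature distributions have identical means and covariances, which is why a smaller FID indicates generated images that are statistically closer to real ones.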

The figure below compares the results of the DeepPrivacy (b) and DeepPrivacy2 (c) face anonymization generators. DeepPrivacy2 (c) generates higher-quality faces than DeepPrivacy (b) and better handles overlaps between detected objects.

In addition, DeepPrivacy uses key points, so persons whose key points cannot be detected are not anonymized and may be left out of the anonymization process. DeepPrivacy2, as mentioned above, does not use key points for synthesis, so anonymization is still performed even when key points cannot be detected.


DeepPrivacy2 is a practical tool to anonymize faces and whole bodies without reducing image quality. Compared to conventional anonymization methods, it significantly improves image quality and anonymization assurance. It also introduces FDH, a large dataset containing a wide variety of poses and contexts that is useful for whole-body anonymization.

In recent years, privacy laws have been introduced in many countries and regions, complicating the collection and storage of data. This can be a barrier to developing applications that rely on high-quality images, such as computer vision models. In this context, DeepPrivacy2 will be a handy tool to solve these problems.

On the other hand, since DeepPrivacy2 is a technology that synthesizes realistic people, it may be misused in the same way as DeepFake and similar techniques. It will therefore be necessary to develop fake-detection technologies in parallel, for example through the DeepFake Detection Challenge, which communities around the world are taking part in. The DeepPrivacy2 code, trained models, and FDH dataset are available at
