# LowKey, the User Protection System That Breaks Commercial Facial Recognition APIs!

3 main points
✔️ We propose a tool called "LowKey" that evades face recognition systems
✔️ It preprocesses users' images to make them unusable for face recognition
✔️ The tool is effective against commercial black-box face recognition APIs

written by Valeriia Cherepanova, Micah Goldblum, Harrison Foley, Shiyuan Duan, John P. Dickerson, Gavin Taylor, Tom Goldstein
(Submitted on 29 Sept 2020)

code:

## Introduction

Facial recognition (FR) systems are highly effective technologies, but they also pose risks such as violations of personal privacy. This article presents a new system called "LowKey" that protects users from face recognition systems.

LowKey allows users to preprocess their images before publishing them so that they cannot be exploited by facial recognition systems. Notably, its effectiveness was demonstrated against commercial black-box face recognition APIs such as Amazon Rekognition and Microsoft Azure Face.

First, we explain the problem setting (or terminology, etc.) in face recognition.

#### Gallery images

A gallery image belongs to a database of images whose identities are known (labeled). For example, such images are collected from passport photos or social-network profile pictures. They serve as the information source and reference images for face recognition systems.

#### Probe images

A probe image is an image in which the face recognition system wants to identify the subject person.

For example, if you want to identify a person in a surveillance camera, the surveillance camera image is the probe image.

#### Identification/verification

Identification is the task of answering the question "Who is this person?": given a probe image, the system selects the gallery identity that matches its subject. Verification is the task of answering, given two images, "Are these the same person?" (e.g., unlocking a smartphone with face recognition).

In the paper presented in this article, we focus specifically on identification. The description of the LowKey method that follows also assumes that the task to be addressed is identification.

## The technique (LowKey)

In a state-of-the-art face recognition system, the face is first detected in the probe image and a feature vector is extracted from it. Based on these features, the k-nearest-neighbor method finds the most similar images in the gallery.
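As a minimal sketch of this matching step (the feature vectors and identity labels below are toy values, not from the paper), identification reduces to a nearest-neighbor search over normalized embeddings:

```python
import numpy as np

def identify(probe_feat, gallery_feats, gallery_ids, k=1):
    # normalize so dot products are cosine similarities, as is standard
    # for ArcFace/CosFace-style embeddings
    p = probe_feat / np.linalg.norm(probe_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ p                      # similarity to every gallery image
    top = np.argsort(sims)[::-1][:k]  # indices of the k most similar images
    return [gallery_ids[i] for i in top]

# toy usage: three gallery embeddings for two identities
gallery = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
ids = ["alice", "alice", "bob"]
print(identify(np.array([0.95, 0.05]), gallery, ids))  # → ['alice']
```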

The proposed method, LowKey, corrupts the feature vectors of gallery images by applying filters to images likely to be collected as gallery images (e.g., social-network profile pictures). As a result, these gallery images fail to match the corresponding probe images. This is summarized in the following diagram.

### LowKey Attack

The goal of LowKey is to ensure that user images (which are assumed to be collected as gallery images) do not match probe images of the same person.

This is done by generating a perturbed image whose feature vector differs significantly from that of the original image (so that it no longer matches the probe image's features), while at the same time keeping the original and perturbed images visually indistinguishable (minimizing the perceptual difference).

The face recognition system targeted by LowKey is unknown: we have no information about its preprocessing (such as face detection) or the model architecture used as its backbone. To cope with such unknown systems, the perturbation is optimized to fool an ensemble of diverse backbone architectures trained with different algorithms.

In addition, by using both the perturbed image and a Gaussian-smoothed version of it, the quality and effectiveness of the generated images are improved. LowKey's goal is then expressed by the following equation.
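Reconstructing it from the paper's formulation (here $n$ is the number of models in the ensemble and $\alpha$ weights the perceptual penalty):

$$\max_{x^{\prime}}\; \frac{1}{2n}\sum_{i=1}^{n}\frac{\left\|f_i(A(x^{\prime}))-f_i(A(x))\right\|_2^2+\left\|f_i(A(G(x^{\prime})))-f_i(A(x))\right\|_2^2}{\left\|f_i(A(x))\right\|_2^2}-\alpha\,\mathrm{LPIPS}(x^{\prime},x)$$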

where $x$ is the original image, $x^{\prime}$ is the perturbed image (the generated image), $f_i$ is the $i$th model in the ensemble, $G$ is the Gaussian smoothing function, and $A$ indicates the process of extracting the face part in the image and resizing it to 112×112.

LPIPS (Learned Perceptual Image Patch Similarity) is a measure of perceptual similarity; the LPIPS term keeps the perturbed image visually close to the original.

This problem is solved by iteratively updating $x^{\prime}$ with signed gradient ascent.
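A minimal sketch of this update loop, using a toy linear feature extractor in place of the real FR ensemble and a simple $L_\infty$ clip in place of the LPIPS penalty (all names and values here are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))  # toy linear "feature extractor"

def features(x):
    return W @ x

def objective(x_adv, x):
    # feature-space distance to be maximized
    return np.sum((features(x_adv) - features(x)) ** 2)

def perturb(x, steps=50, step_size=0.01, eps=0.3):
    # signed gradient ascent on the objective, kept within an L_inf budget
    x_adv = x + rng.uniform(-0.01, 0.01, size=x.shape)     # random start
    for _ in range(steps):
        grad = 2 * W.T @ (features(x_adv) - features(x))   # analytic gradient
        x_adv = x_adv + step_size * np.sign(grad)          # signed ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)           # stay near original
    return x_adv

x = rng.standard_normal(16)
x_adv = perturb(x)  # perturbed "image" with far-away features
```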

## Experiment

### Experiment setup

As an ensemble of models to be used in LowKey, we use the ArcFace and CosFace face recognition systems. For each of these systems, we train the ResNet-50/152 and IR-50/152 backbone models on MS-Celeb-1M.

In our experiments, we mainly test attacks on the FaceScrub dataset. Specifically, for each identity we use 1/10 of the images as probe images and the rest as gallery images, and for 100 randomly selected identities we apply LowKey to the corresponding gallery images.
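The split can be sketched as follows (function and variable names are ours, not from the paper's code):

```python
import random

def split_probe_gallery(images_by_identity, probe_frac=0.1, seed=0):
    # per identity, hold out ~1/10 of the images as probes;
    # the remainder forms the gallery
    rng = random.Random(seed)
    probes, gallery = {}, {}
    for ident, imgs in images_by_identity.items():
        shuffled = list(imgs)
        rng.shuffle(shuffled)
        k = max(1, round(len(shuffled) * probe_frac))
        probes[ident] = shuffled[:k]
        gallery[ident] = shuffled[k:]
    return probes, gallery
```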

### Effects on commercial black box APIs

Under the aforementioned experimental setup, we test the effect of LowKey on two commercial face recognition APIs (Amazon Rekognition and Microsoft Azure Face). The results are as follows.

For both Amazon Rekognition and Microsoft Azure Face, LowKey was shown to be highly effective, both relative to clean data and compared with the existing method Fawkes.

The effectiveness of the LowKey tool depends on several characteristics that are described below.

1. The attack must be effective against unknown models.
2. The perturbed image must be acceptable to the user.
3. LowKey must be fast enough.
4. It must remain effective after saving in PNG or JPEG format.
5. It must be effective for images of any size.

In the following, we investigate these characteristics.

#### 1. Is the attack effective against unknown models?

For each attacker/defender, the results for the various models are as follows.

As shown in the table, attacks crafted with IR architectures are more effective against IR-based face recognition systems, and attacks crafted with ResNet are more effective against ResNet-based systems.

We then found that by using an ensemble of different models, we can generate attacks that are effective against all models.

#### 2. Are the images acceptable to the user?

The following are examples of attack generation for the cases with and without Gaussian smoothing.

The upper row corresponds to the case without Gaussian smoothing and the lower row corresponds to the case with it.

Smoothing produces a smoother image, and as shown in the table below, it also further improves the performance of the attack.
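As a rough illustration of the smoothing term (the kernel size and sigma below are arbitrary choices for demonstration, not the paper's values), a separable Gaussian blur can be implemented as:

```python
import numpy as np

def gaussian_kernel(size=7, sigma=3.0):
    # 1-D Gaussian kernel, normalized to sum to 1
    ax = np.arange(size) - size // 2
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def smooth(img, sigma=3.0):
    # apply the kernel along rows then columns (separable filter),
    # mirroring the role of G(.) in the LowKey objective
    k = gaussian_kernel(sigma=sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)

noise = np.random.default_rng(1).standard_normal((32, 32))
smoothed = smooth(noise)  # lower-variance, blurred version of the input
```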

In addition, comparisons with existing methods and original images are shown below.

The first row is the original image, the second row is Fawkes (an existing method), the third row is LowKey (small attack), and the fourth row is the full LowKey attack. Overall, LowKey generates images without noticeably unnatural artifacts.

#### 3. Is LowKey fast enough?

We compare the execution time with Fawkes as a baseline.

On a single NVIDIA GeForce RTX 2080 TI GPU, Fawkes takes an average of 54 seconds per image, while LowKey takes an average of 32 seconds. Therefore, LowKey is significantly faster than existing methods.

#### 4. Is it still effective after saving in PNG or JPEG format?

After converting images to PNG/JPEG format, experiments on the commercial APIs show that the recognition accuracy of Microsoft Azure Face drops to 0.1% for PNG and 0.2% for JPEG.

Amazon Rekognition shows 2.4% for PNG and 3.8% for JPEG, indicating that the attack still works effectively, although compression slightly reduces its strength.

#### 5. Is it effective for images of any size?

An example of LowKey generation for a large image is shown below.

The upper row shows the original images and the lower row shows the results generated by LowKey. Even for large images, LowKey generates results without noticeably unnatural artifacts.

## Summary

The paper presented in this article introduces LowKey, a tool that protects users from facial recognition systems, and demonstrates its effectiveness against commercial black-box APIs.

However, it cannot protect users with 100% certainty, and future facial recognition systems designed to be especially robust to perturbations could overcome even these techniques.

It should be noted that LowKey does not completely eliminate the risk of users' personal information being exposed, but it is a very important technology for preventing the misuse of facial recognition systems.
