Human Eyes Solve The Mystery Of Images That Deceive AI

Study 02/10/2024

3 main points
✔️ Human experimentation is essential for evaluating attacks that modify images without restrictions, but such research is currently lacking.
✔️ The paper proposes SCOOTER, a new framework for human evaluation experiments, and presents a systematic experimental procedure and a method for calculating the required number of participants.
✔️ Rapid advances in AI technology increase the threat of natural-looking image attacks. The framework is intended to support further research in this area.

How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples
written by Dren Fazlija, Arkadij Orlov, Johanna Schrader, Monty-Maximilian Zühlke, Michael Rohs, Daniel Kudenko
(Submitted on 19 Apr 2024)
Comments: 3 pages, 3 figures, AAAI 2024 Spring Symposium on User-Aligned Assessment of Adaptive AI Systems
Subjects: Artificial Intelligence (cs.AI)

The images used in this article are from the paper, the introductory slides, or were created based on them.

Summary

As machine learning models permeate our lives, adversarial examples threaten the security of AI systems. In the image domain, images that have been altered so subtly that humans cannot perceive the change can still deceive state-of-the-art machine learning models. The modifications are minute and normally invisible to a human observer, yet they are enough to change a model's output. Attacks using such examples can cause machine learning models to make erroneous predictions and behave incorrectly.

Traditionally, such attacks have been relatively easy to defend against because the image modifications were restricted. However, recent research claims that it is possible to generate adversarial examples with unrestricted modifications while maintaining a natural look and feel. Attackers can take advantage of this freedom to launch attacks that go beyond what conventional defenses assume.

Are these "unrestricted adversarial examples" really imperceptible to humans? Rigorous human evaluation experiments are essential to assess their quality. The paper proposes SCOOTER, a human evaluation framework specifically designed for image-based attacks, and provides a pathway for researchers to address this important issue.

Related Research

The most relevant prior study is the evaluation protocol for text-to-image generation models by Otani et al. (2023). That study provides domain-specific questions, a user interface, recommendations for experimental design, and templates for reporting results. However, it does not adequately cover the details of experimental design that matter to inexperienced researchers, and it does not explain how to assure data quality. For example, standard methods such as attention checks and instructional manipulation checks are not included. In addition, disclosing eligibility requirements to participants is not recommended, because it increases the risk that participants misrepresent themselves in order to qualify.

Another related work is the basic framework for collecting human image quality ratings by Zhou et al. (2019). The protocols HYPEtime and HYPE∞, which emerged from this framework, are widely used in subjective image quality assessment tasks, but they share the weaknesses of Otani et al. (2023).

While building on the findings of these previous studies, this study aims to rigorously define the details of the experimental design to support inexperienced researchers. Specifically, it implements measures to assure data quality, such as attention checks, instructional manipulation checks, and keeping participant eligibility requirements undisclosed.

Proposed Method (SCOOTER)

This paper proposes SCOOTER (Systemizing Confusion Over Observations To Evaluate Realness), a framework for human evaluation of unrestricted adversarial examples. SCOOTER consists of the following elements:

1. Modularly designed web application: a Flask-based web app that allows for easy integration of images.
2. Research protocol: a detailed guide to each step of the online study.
3. Online leaderboard: allows comparison of state-of-the-art attack methods against different target models.
4. Image database: collects generated adversarial examples for further analysis.

The core of the proposed methodology is a 13-minute online study. The flow of the study is as follows:

1. Color vision test (Figure 1): To exclude participants with color vision deficiency, participants are asked to judge five Ishihara-type images.

2. Comprehension check (Figure 2): Six image pairs are presented, and only participants who correctly identify the modified image in at least five of the pairs can proceed to the main study.

3. Main study (Figure 3): Participants use a slider to rate how modified each of 106 images appears, on a continuous scale from -100 (unmodified) to +100 (modified). Of these images, 50 are unmodified, 50 are adversarial examples, and 6 are attention-check images.
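The screening and trial structure described above can be sketched in code. This is a minimal illustration, not the SCOOTER implementation: the function names (`passes_comprehension_check`, `build_trial_list`, `clamp_rating`) are hypothetical, and shuffling the trial order is an assumption. Only the counts (six pairs with a pass mark of five, 50 + 50 + 6 images, a slider from -100 to +100) come from the description above.

```python
import random

COMPREHENSION_PAIRS = 6      # image pairs shown in the comprehension check
COMPREHENSION_PASS_MARK = 5  # minimum number of correct judgments to proceed
SLIDER_MIN, SLIDER_MAX = -100, 100  # -100 = unmodified, +100 = modified


def passes_comprehension_check(answers, solutions):
    """Return True if at least five of the six pairs were judged correctly."""
    if len(answers) != COMPREHENSION_PAIRS or len(solutions) != COMPREHENSION_PAIRS:
        raise ValueError("expected exactly six answers and six solutions")
    correct = sum(a == s for a, s in zip(answers, solutions))
    return correct >= COMPREHENSION_PASS_MARK


def build_trial_list(unmodified, adversarial, attention_checks, seed=None):
    """Combine 50 + 50 + 6 images into one list of 106 labeled trials."""
    sizes = (len(unmodified), len(adversarial), len(attention_checks))
    if sizes != (50, 50, 6):
        raise ValueError("expected 50 unmodified, 50 adversarial, 6 attention-check images")
    trials = (
        [(img, "unmodified") for img in unmodified]
        + [(img, "adversarial") for img in adversarial]
        + [(img, "attention_check") for img in attention_checks]
    )
    # Assumption: trials are presented in random order.
    random.Random(seed).shuffle(trials)
    return trials


def clamp_rating(value):
    """Keep a slider value inside the allowed range of -100 to +100."""
    return max(SLIDER_MIN, min(SLIDER_MAX, value))
```

Keeping the image label next to each trial makes it straightforward to separate attention-check responses from study ratings during later analysis.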

The authors also propose a method for empirically estimating the number of participants needed for a statistically meaningful study. To determine an adequate sample size, they plan to collect data from 690 participants for each of three attack methods.
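The article does not spell out the estimation procedure. One common empirical approach, shown here purely as an illustration and not as the authors' method, is to resample a collected pool of participants at increasing sizes and pick the smallest size at which the bootstrap spread of the mean rating falls below a chosen tolerance. The function names, the tolerance, the candidate sizes, and the synthetic data are all assumptions.

```python
import random
import statistics


def bootstrap_spread(scores, n, draws=200, rng=None):
    """Standard deviation of the mean across `draws` resamples of size n."""
    rng = rng or random.Random(0)
    means = [statistics.fmean(rng.choices(scores, k=n)) for _ in range(draws)]
    return statistics.stdev(means)


def smallest_sufficient_n(scores, candidate_sizes, tolerance, draws=200, seed=0):
    """Return the smallest candidate size whose bootstrap spread is within
    `tolerance`, or None if no candidate is stable enough."""
    rng = random.Random(seed)
    for n in sorted(candidate_sizes):
        if bootstrap_spread(scores, n, draws, rng) <= tolerance:
            return n
    return None


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic per-participant mean ratings on the -100..+100 slider scale
    # (illustrative data only, not results from the paper).
    pool = [max(-100.0, min(100.0, rng.gauss(10, 40))) for _ in range(690)]
    print(smallest_sufficient_n(pool, range(50, 691, 40), tolerance=2.5))
```

In practice the tolerance would be chosen from the smallest difference between attack methods that the study needs to detect.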

In summary, SCOOTER is a comprehensive framework to support human evaluation experiments on unrestricted adversarial examples. The proposed research protocol and the method for estimating the number of participants are intended to improve the quality of research in this area.

Experimental Plan

This paper focuses on proposing SCOOTER, a framework for human evaluation of unrestricted adversarial examples, and no experiments have yet been conducted with it in practice.

The description of the proposed method includes an experimental design for empirically estimating the required number of participants. Specifically, the plan is to collect data from 690 participants for each of three attack methods, using the adversarially trained ResNet-50 model (Salman et al. 2020) as the target, to determine a sufficient sample size. This design is meant to ensure the quality of studies that use SCOOTER.

However, this plan is only in the proposal stage and no actual experiments have been conducted. This paper is significant in that it provides a framework for addressing the important issue of human evaluation of unrestricted adversarial samples. It is hoped that future empirical experiments using SCOOTER will be conducted and the results reported.

Conclusion

Unrestricted image-based adversarial attacks are expected to play an important role in the near future given the rapid advances in AI image generation. The proposed framework SCOOTER serves as a toolbox to support and raise awareness of high-quality research in this area.

In the future, it is hoped that studies using SCOOTER will demonstrate its effectiveness and encourage more research on unrestricted adversarial examples. In addition, exploring the relationship with AI image generation techniques could lead to countermeasures against realistic threats.

Sasayama