PARA: A Large Dataset For Predicting Aesthetic Evaluation Of Personalized Images
3 main points
✔️ We have constructed a large dataset that captures the subjective aesthetic evaluation of individuals. The dataset is unique in that it measures not only the objective properties of the image, but also the subjective response of the viewer, and includes this in the dataset.
✔️ Analysis of the dataset reveals that aesthetic evaluation is a strong reflection of subjective response.
✔️ Through the construction of a prediction model using the dataset, it became clear that the use of subjective data can improve the performance of predicting an individual's aesthetic evaluation.
Personalized Image Aesthetics Assessment with Rich Attributes
written by Yuzhe Yang, Liwu Xu, Leida Li, Nan Qie, Yaqian Li, Peng Zhang, Yandong Guo
(Submitted on 31 Mar 2022)
Comments: Accepted to CVPR2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)
The images used in this article are from the paper, the introductory slides, or were created based on them.
The aesthetic evaluation of an image depends not only on the objective properties of the image and subject, but also on the subjective properties of the viewer. Several datasets on individual aesthetic evaluation have been proposed in the past, but their annotations are limited. In this study, a dataset called the Personalized image Aesthetics database with Rich Attributes (PARA) was created using a total of 31220 images.
Analysis of the obtained data revealed that subjective attribute data (labels such as emotional responses) are reflected in aesthetic evaluation. We also used the subjective information to train an individualized model of aesthetic evaluation of images, and investigated the influence of both the objective attributes of the image and the subjective attributes of the viewer on aesthetic evaluation.
The proposed dataset is available here.
Image aesthetics assessment (IAA) is a framework for computerized evaluation of the aesthetic value of a photograph. There are two categories: generic image aesthetics assessment (GIAA) and personalized image aesthetics assessment (PIAA). This study deals with the latter.
Previously, datasets such as FLICKR-AES and AADB have been proposed for individual aesthetic evaluation, but they suffer from a limited diversity of annotations. In this study, a new dataset was constructed to address this problem.
The three contributions are as follows:
A comprehensive and large dataset of individuals' subjective aesthetic evaluation was constructed. A total of 31220 images were used, and the dataset includes not only objective attributes of the images, but also items related to the viewer's subjective response (e.g., content preference, difficulty in judgment, etc.).
Analysis of the dataset showed that the above subjective response items are reflected in the aesthetic evaluation.
The use of data from the above subjective response items also improves the performance of the model for predicting aesthetic evaluations.
How to create a dataset
After collecting images, including Creative Commons licensed images, and annotating the image scenes, we sample 28,000 images to maintain content diversity. We then add about 3,000 images from the existing aesthetic evaluation dataset to balance the distribution of aesthetic scores. The ten types of image scenes used were portrait, animal, plant, landscape, building, still life, nightscape, food, indoor, and other. Subjects are asked to label the various attributes of the images obtained in this way.
In addition to labels for the objective attributes of the image, such as brightness, color, composition, and content, the participants are asked to label the subjective attributes of the image. The labels for subjective attributes include the emotions felt (eight types: amusement, excitement, contentment, awe, disgust, sadness, fear, and neutral), difficulty in judging the aesthetics, preference for what is shown (content), and willingness to share the image on social media (four types).
We also obtain personal information about the subjects, such as age, gender, education, personality traits, and experience with art and photography. For personality traits, we use the Big-Five personality traits that are often used in the context of psychology. Specifically, the Big-Five personality traits are Openness (O), Conscientiousness (C), Extraversion (E), Agreeableness (A), and Neuroticism (N).
In addition to these, subjects are asked to rate both the aesthetics and the image quality of each image on a scale of 1 to 5.
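As a rough illustration of what a single annotation could look like as a record, the sketch below gathers a few of the fields described above into a Python dataclass. The class name, field names, and validation helper are hypothetical and are not taken from the released PARA files; only the 1-to-5 rating scale and the eight emotion labels come from the description above.

```python
from dataclasses import dataclass

# The eight emotion labels listed in the article.
EMOTIONS = {"amusement", "excitement", "contentment", "awe",
            "disgust", "sadness", "fear", "neutral"}


@dataclass
class Annotation:
    """One subject's labels for one image (hypothetical field names)."""
    image_id: str
    subject_id: str
    aesthetic_score: float   # rated on a scale of 1 to 5
    quality_score: float     # rated on a scale of 1 to 5
    emotion: str             # expected to be one of EMOTIONS

    def problems(self):
        """Return a list of validation problems (empty if the record looks fine)."""
        found = []
        for name in ("aesthetic_score", "quality_score"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                found.append(f"{name} out of range: {value}")
        if self.emotion not in EMOTIONS:
            found.append(f"unknown emotion: {self.emotion}")
        return found
```

Calling `problems()` on each record before analysis is one simple way to catch out-of-range scores or misspelled emotion labels early.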
The annotation results for a total of 31220 collected images were analyzed.
Distribution of each attribute
The distributions of the attributes studied for aesthetic evaluation are broadly similar, but there are still slight differences. This indicates that the attributes are correlated with one another, but that each also carries useful information on its own.
Distribution of aesthetic evaluation scores
The variance of the scores across subjects is shown for each range of aesthetic evaluation score values. The higher the score, the smaller the variance, indicating that there is a commonality in aesthetic evaluation, i.e., what is considered beautiful, among individuals. On the other hand, the lower aesthetics scores have larger variance, indicating the need to take individual differences into account.
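The kind of analysis described above can be sketched as follows: group images by their mean score, then average the across-subject variance within each score range. This is a minimal sketch under assumptions, not the authors' analysis code; the function name, the bin width, and the toy ratings are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, pvariance


def variance_by_score_range(ratings, bin_width=1.0):
    """Group images by mean score and average the per-image variance per bin.

    ratings: dict mapping image id -> list of scores from different subjects.
    Returns a dict mapping the lower edge of each bin -> mean variance.
    """
    bins = defaultdict(list)
    for scores in ratings.values():
        avg = mean(scores)
        lower_edge = (avg // bin_width) * bin_width
        bins[lower_edge].append(pvariance(scores))
    return {edge: mean(vs) for edge, vs in sorted(bins.items())}


# Toy ratings (made up for illustration, not PARA data).
toy = {
    "a": [1, 3, 2, 4],   # mean 2.5, subjects disagree
    "b": [2, 4, 1, 3],   # mean 2.5, subjects disagree
    "c": [4, 5, 4, 5],   # mean 4.5, subjects mostly agree
}
result = variance_by_score_range(toy)
```

In this toy example the lower score range has the larger average variance, which mirrors the tendency reported for the dataset.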
Pearson correlation coefficients between aesthetic evaluation score and each attribute
The table shows that there is a high correlation between the aesthetic evaluation score and the image quality score. There also seems to be a high correlation between the preference for what is depicted (content) and whether or not people feel inclined to share the photo on social media. This indicates that people are more inclined to share photos that they like. On the other hand, the correlation between the two attributes is about 0.5, indicating that there are some commonalities and some differences.
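For readers who want to reproduce this kind of table on their own data, the Pearson correlation coefficient between the aesthetic score and each attribute can be computed as in the sketch below. The attribute names and values are toy numbers invented for illustration, not values from PARA.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy per-image values (made up for illustration, not PARA data).
aesthetic = [1.5, 2.0, 3.0, 4.0, 4.5]
attributes = {
    "quality": [1.0, 2.5, 3.0, 3.5, 5.0],
    "content_preference": [3.0, 1.0, 4.0, 2.0, 5.0],
}
correlations = {name: pearson(aesthetic, vals)
                for name, vals in attributes.items()}
```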
Correlations between subjects' personality traits and each attribute's aesthetic evaluation score
This figure shows which attributes are highly correlated with aesthetic evaluation scores for subjects with different personality traits. It can be seen that subjects with a strong neuroticism tendency (N) react quite differently from the other subjects. This suggests that people with strong N tend to overreact to external stimuli and have stronger emotional reactions than others. It also shows that subjects with strong extraversion (E) tend to focus more on what they see (content) when making aesthetic judgments.
Relationship between Emotional and Aesthetic Evaluation Scores
The eight emotions (amusement, excitement, contentment, awe, disgust, sadness, fear, and neutral) were divided into three groups: positive (amusement, excitement, contentment), neutral (awe, neutral), and negative (disgust, sadness, fear). It was found that images with an aesthetic evaluation score of 2.0 or less (left of the black dotted line l1) often evoked negative emotions, while images with an aesthetic evaluation score of 4.0 or more (right of the black dotted line l2) often evoked positive emotions.
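The grouping and thresholds described above can be expressed as a small counting routine. This is only a sketch: the function name, the band names "low", "mid", and "high", and the toy samples are assumptions for illustration, while the emotion groups and the 2.0 and 4.0 thresholds follow the description above.

```python
from collections import Counter

# Grouping of the eight emotions as described in the article.
VALENCE = {
    "amusement": "positive", "excitement": "positive", "contentment": "positive",
    "awe": "neutral", "neutral": "neutral",
    "disgust": "negative", "sadness": "negative", "fear": "negative",
}


def valence_counts(samples, low=2.0, high=4.0):
    """Count emotion groups for low-, mid-, and high-scoring samples.

    samples: iterable of (aesthetic_score, emotion) pairs.
    low / high: thresholds corresponding to the dotted lines l1 and l2.
    """
    counts = {"low": Counter(), "mid": Counter(), "high": Counter()}
    for score, emotion in samples:
        if score <= low:
            band = "low"
        elif score >= high:
            band = "high"
        else:
            band = "mid"
        counts[band][VALENCE[emotion]] += 1
    return counts


# Toy samples (made up for illustration, not PARA data).
toy = [(1.5, "disgust"), (2.0, "sadness"), (1.8, "neutral"),
       (3.0, "awe"), (4.2, "amusement"), (4.8, "contentment")]
counts = valence_counts(toy)
```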
Building predictive models using datasets
Finally, to demonstrate the usefulness of the collected dataset, this study investigated whether the use of subjective attribute data improves the model's aesthetic evaluation prediction performance.
The model is built in two steps. First, a general-purpose model for predicting the aesthetic evaluation of images common to many people (GIAA model) is trained. It is then fine-tuned on data from a specific subject to become a model for predicting that individual's aesthetic evaluation (PIAA model).
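The two-step idea of training a generic model and then fine-tuning it on one person's ratings can be illustrated with a deliberately tiny example. The sketch below uses a linear model trained by plain gradient descent as a stand-in for the neural networks in the paper; the model, the hyperparameters, and the toy data are all assumptions for illustration and do not reproduce the authors' architecture or training setup.

```python
def predict(weights, features):
    """Linear score prediction: bias plus weighted sum of features."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


def mse(weights, data):
    """Mean squared error of the model on (features, score) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def train(weights, data, lr=0.1, steps=2000):
    """Run plain gradient descent on the MSE, starting from `weights`."""
    weights = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            grad[0] += 2 * err / len(data)
            for i, f in enumerate(x, start=1):
                grad[i] += 2 * err * f / len(data)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Toy data (made up): one feature per image, scores on a 1-5 scale.
generic_data = [([0.0], 2.0), ([0.5], 3.0), ([1.0], 4.0)]   # averaged ratings
personal_data = [([0.0], 1.0), ([0.5], 3.0), ([1.0], 5.0)]  # one subject

giaa = train([0.0, 0.0], generic_data)   # step 1: generic model
piaa = train(giaa, personal_data)        # step 2: fine-tune on one subject
```

Starting the second call from the generic weights, rather than from zeros, is what makes it fine-tuning. On the subject's own ratings, `mse(piaa, personal_data)` is lower than `mse(giaa, personal_data)`.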
In the experiment, we compared the case in which the subjects' personal information (personality traits, art experience, and photography experience), one of the features of the proposed dataset, was used (Conditional PIAA group) with the case in which it was not (Unconditional PIAA group).
The following table shows the results of the evaluation with the Pearson correlation coefficient (PLCC) and Spearman correlation coefficient (SROCC) between the predictions and the correct answers.
In the table, the rows of the Conditional PIAA group and the Unconditional PIAA group correspond to differences in the neural network structure used and in the subjects' personal information used for conditioning, and the columns correspond to differences in the number of times fine tuning was performed on the individual data ("without finetune" indicates 0 times).
The results show that the performance of predicting an individual's aesthetic evaluation can be improved by fine tuning using that individual's specific data. We also see that performance can be improved by using information on subjective attributes.
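The two evaluation metrics mentioned above, PLCC and SROCC, can be computed as in the following sketch. PLCC is the Pearson correlation of the raw values, and SROCC is the Pearson correlation of their ranks, with tied values sharing an average rank. The helper names and toy numbers are assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson linear correlation coefficient (PLCC)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the tie
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def srocc(xs, ys):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy predictions and ground-truth scores (made up for illustration).
truth = [1.0, 2.0, 3.0, 4.0, 5.0]
preds = [1.1, 1.9, 2.5, 3.0, 9.0]   # correct order, but not linear in truth
plcc_value = pearson(preds, truth)
srocc_value = srocc(preds, truth)
```

In this toy case the predictions preserve the ordering of the ground truth, so SROCC is 1, while PLCC is lower because the relationship is not linear. Reporting both metrics separates ranking quality from linear agreement.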
In this article, we introduced a study that proposed a new dataset specialized for individual aesthetics to address the problem of predicting the aesthetic evaluation of images. Prediction of aesthetic evaluation of images is a technology with various possible applications, such as image recommendation and automatic selection of photos. In particular, personalized services are increasingly in demand, and the research presented here is an important step in this direction.