I Created The Art Of Abstract Expressionism With Evolutionary Strategy Algorithms.
3 main points
✔️ Combines evolution strategies (ES) with the CLIP model to create computer art.
✔️ Captures the process of creating the art and expresses abstract concepts diversely and accurately.
✔️ In comparison with gradient-based methods, the art style was found to be strongly dependent on the optimization algorithm.
Modern Evolution Strategies for Creativity: Fitting Concrete Images and Abstract Concepts
written by Yingtao Tian, David Ha
(Submitted on 18 Sep 2021)
Comments: Published on arXiv.
Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
The images used in this article are from the paper, the introductory slides, or were created based on them.
Introduction
This article introduces a paper that tackles computer art with a method that combines an evolution strategy (ES) and CLIP.
In the early 20th century, a modern revolution in art took place. Art with an abstract perspective, which abandoned the traditional drawing of objects based on perspective, developed rapidly. Among them, Picasso and other famous artists proposed geometric art expression. In addition, Mondrian tried to express the world through pure and simple combinations of shapes, and his influence was later echoed in Abstract Expressionism and Minimalist art, making a significant contribution to the art world.
On the other hand, the idea of minimalist art was also explored in the realm of computer art, where the idea of algorithmic complexity was used to try to represent the complexity of the world. Genetic algorithms are one such example and are characterized by the ability to capture the creative process of the artist as the images evolve iteratively.
In this research, the authors create art from simple triangles using Evolution Strategies (ES) and show that, using CLIP, which OpenAI released in January 2021, a variety of abstract expressionist art can be produced from human language instructions. The source code has also been released so that computer artists can use it easily.
As an example, Figure 1. shows abstract art created by the proposed method. In particular, 4. 'Walt Disney World' and 6. 'A picture of Tokyo' capture the features of their subjects well.
Modern Evolution Strategies for Creativity
The objective of the proposed method (Figure 2.) is to place semi-transparent triangles using an evolution strategy (ES). A triangle is represented by 10 parameters: the coordinates of its three vertices (x1, y1, x2, y2, x3, y3), its color (r, g, b), and its transparency (a). A fit score is calculated to show how well the generated image fits the text or the target image. The ES algorithm then samples several candidate parameter sets and moves toward those with a high fit score. In this study, PGPE with an optimizer called ClipUp is used as the ES algorithm.
If N triangles are used to create the art, there are 10N parameters. The number N is a hyperparameter that is fixed in advance, and the 10N triangle parameters are what the algorithm updates. A triangle whose transparency (a) is 0 is invisible, so the algorithm is effectively free to draw fewer than N triangles.
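The parameterization above can be illustrated with a minimal sketch in plain Python. The flat parameter layout, the value range of 0 to 1, the white background, and the function names `random_params` and `render` are assumptions made for illustration; they are not taken from the authors' released code.

```python
import random

PARAMS_PER_TRIANGLE = 10  # x1, y1, x2, y2, x3, y3, r, g, b, a


def random_params(n_triangles, seed=0):
    """Flat list of 10 * N values in [0, 1] (layout is an assumption)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_triangles * PARAMS_PER_TRIANGLE)]


def _edge(ax, ay, bx, by, px, py):
    # Signed area: its sign tells on which side of edge AB the point P lies.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def _inside(tri, px, py):
    x1, y1, x2, y2, x3, y3 = tri
    if _edge(x1, y1, x2, y2, x3, y3) == 0:
        return False  # degenerate (zero-area) triangle covers nothing
    d1 = _edge(x1, y1, x2, y2, px, py)
    d2 = _edge(x2, y2, x3, y3, px, py)
    d3 = _edge(x3, y3, x1, y1, px, py)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def render(params, size=16, background=(1.0, 1.0, 1.0)):
    """Alpha-composite the triangles, in order, onto a size x size RGB canvas."""
    canvas = [[list(background) for _ in range(size)] for _ in range(size)]
    for i in range(0, len(params), PARAMS_PER_TRIANGLE):
        *tri, r, g, b, a = params[i:i + PARAMS_PER_TRIANGLE]
        if a <= 0.0:
            continue  # a fully transparent triangle contributes nothing
        for row in range(size):
            for col in range(size):
                # Sample each pixel at its center, in unit coordinates.
                px, py = (col + 0.5) / size, (row + 0.5) / size
                if _inside(tri, px, py):
                    pixel = canvas[row][col]
                    for c, value in enumerate((r, g, b)):
                        pixel[c] = (1.0 - a) * pixel[c] + a * value
    return canvas
```

With this layout, `random_params(50)` yields the 500 values that 50 triangles require, and a triangle with a = 0 leaves the canvas unchanged, which is how the optimizer can effectively "switch off" triangles.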
As shown in Figure 2., the ES algorithm only needs the value of the fit score, so we are free to choose what counts as a good fit. In this study, two targets are considered: concrete images and abstract concepts. To fit a concrete image, the pixel-wise L2 loss between the generated image and the target image can be used as the fit score. To fit an abstract concept, the fit score between the generated image and the target concept can be calculated in a latent space. In this study, the image encoder and the text encoder of the CLIP model are used to project the image and the text, respectively, into the latent space, and the cosine similarity between the two is used as the fit score. It is particularly worth mentioning that the ES algorithm performs black-box optimization, so neither the rendering nor the fit score calculation needs to be differentiable.
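The two fit scores described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: images are assumed to be nested lists of RGB values, the L2 score is negated so that higher is better, and the embeddings passed to `cosine_fitness` are assumed to have already been produced by an image encoder and a text encoder such as CLIP's, which are not included here.

```python
import math


def l2_fitness(image, target):
    """Negative mean squared pixel error, so a higher value means a better fit."""
    total, count = 0.0, 0
    for row_a, row_b in zip(image, target):
        for px_a, px_b in zip(row_a, row_b):
            for a, b in zip(px_a, px_b):
                total += (a - b) ** 2
                count += 1
    return -total / count


def cosine_fitness(image_embedding, text_embedding):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(image_embedding, text_embedding))
    norm_a = math.sqrt(sum(a * a for a in image_embedding))
    norm_b = math.sqrt(sum(b * b for b in text_embedding))
    return dot / (norm_a * norm_b)
```

Because the ES only ever calls these functions and reads the returned number, either one can be swapped in without touching the optimizer, and neither has to be differentiable.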
Fitting Concrete Target Image
Here we look at the result of fitting a specific image. Figure 3. shows the famous "Mona Lisa" fitted with 50 triangles over 10,000 update steps. The result is a unique style of art that tries to express the fine texture and the background with triangles. In the evolution process shown on the right, you can also see how the shapes and colors of the triangles are finely adjusted.
Number of triangles and parameters
The PGPE algorithm used in the proposed method is efficient, and the number of parameters increases only linearly with the number of triangles. Figure 4. also shows that the proposed method can fit a variety of target images.
Choice of ES algorithm
In this study, PGPE with ClipUp is compared with a traditional evolutionary algorithm, and Figure 5. shows that the proposed method gives better results for the same number of iterations and parameters. In the quantitative evaluation, the baseline does not outperform the proposed method even when it is given more iterations (56 more in the reported comparison).
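To give a feel for how PGPE with ClipUp searches, here is a heavily simplified sketch in plain Python. It keeps only the core ideas: symmetric sampling around a mean parameter vector, a gradient estimate built from fitness differences, and a ClipUp-style update (normalized gradient, momentum, clipped velocity). The fixed `sigma`, the hyperparameter values, and the function names are assumptions for illustration; full PGPE also adapts the standard deviation, and this is not the authors' code.

```python
import math
import random


def clipup_step(velocity, grad, step_size=0.005, max_speed=0.01, momentum=0.9):
    """ClipUp-style update: normalized gradient, momentum, clipped velocity."""
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0  # avoid division by zero
    velocity = [momentum * v + step_size * g / norm
                for v, g in zip(velocity, grad)]
    speed = math.sqrt(sum(v * v for v in velocity))
    if speed > max_speed:
        velocity = [v * max_speed / speed for v in velocity]
    return velocity


def pgpe_maximize(fitness, mean, sigma=0.1, pairs=10, steps=300, seed=0):
    """Maximize a black-box fitness using symmetric samples around the mean."""
    rng = random.Random(seed)
    mean = list(mean)
    velocity = [0.0] * len(mean)
    for _ in range(steps):
        grad = [0.0] * len(mean)
        for _ in range(pairs):
            # Evaluate a mirrored pair of candidates: mean + eps and mean - eps.
            eps = [rng.gauss(0.0, sigma) for _ in mean]
            f_plus = fitness([m + e for m, e in zip(mean, eps)])
            f_minus = fitness([m - e for m, e in zip(mean, eps)])
            for i, e in enumerate(eps):
                grad[i] += e * (f_plus - f_minus) / (2.0 * pairs)
        velocity = clipup_step(velocity, grad)
        mean = [m + v for m, v in zip(mean, velocity)]
    return mean


# Toy usage: recover a hidden target vector from fitness values alone.
target = [0.2, 0.8, 0.5]


def toy_fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))


start = [0.5, 0.5, 0.5]
result = pgpe_maximize(toy_fitness, start)
```

Note that the optimizer never inspects how `toy_fitness` is computed. Any function that maps a parameter list to a score would work in its place, which is what makes the approach usable with non-differentiable rendering.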
Comparison with gradient-based optimization methods
The proposed ES-based method is compared with a gradient-based method that uses the differentiable renderer nvdiffrast. Figure 6. shows that the proposed and gradient-based methods produce comparable images, but the proposed method has a slightly higher fit score. It is interesting to note that the two methods yield different styles of art. The proposed method uses large triangles for the background and smaller triangles for the details, whereas the gradient-based method tends to introduce textures not found in the target. This may be because the proposed method focuses on triangle placement, while the gradient-based method focuses on the composition of transparent colors.
Fitting Abstract Concept with CLIP
Next, we look at the results of fitting an abstract concept expressed in language. Because there is a great deal of freedom in what can be drawn, this is a far more difficult and interesting problem than fitting the concrete images introduced in the previous section.
Fitting an abstract concept took 2,000 steps to converge, and examples of the results are shown in Figure 7. The method handles not only single words and phrases but also longer sentences, and it produces creative art that humans can interpret. In the evolutionary process, the first three examples capture the characteristics of humans and the Disneyland castle, and the last captures the characteristics of Google's headquarters in Silicon Valley. The Google headquarters example in particular captures complex features well; if you are curious, search for "Google Silicon Valley" and compare the two.
Number of triangles and parameters
Figure 8. shows the results of fitting with different numbers of triangles. Because the results are abstract, they are difficult to evaluate, but all of them appear to fit the concept. The number of triangles acts as a budget: it determines how much can be spent on expressing the features of the concept. However, fitting 'A picture of Tokyo' with 200 triangles does not work well. Using too many triangles makes the task more difficult, and this is left as an issue for future research.
There is a large degree of freedom in fitting an abstract concept. Figure 9. shows the results of four experiments, each with 50 triangles and 2,000 steps. The authors claim that the results differ from one another while remaining within the range of human interpretability, which is a property required for computer-assisted art production.
Comparison with gradient-based optimization methods
Finally, the proposed method was compared with gradient-based methods. A lot of good work has already been done on art production using CLIP, including CLIPDraw and StyleGAN-based approaches. However, the gradient dynamics of the renderer and of CLIP are very different, so optimizing them together is not easy, and each study has needed its own workarounds. For this comparison, the gradient-based method uses the same differentiable renderer as in the previous section, nvdiffrast, which allows the loss to propagate back to the triangle parameters, as shown in Figure 2.
Both methods can fit the concept (Figure 10.). The proposed method with ES expresses the boundaries of shapes and objects more clearly. Interestingly, the proposed method produces an art style that is closer to Abstract Expressionism. Much like the difference between post-impressionism and impressionism, the proposed method differs from the gradient-based method in its use of bolder colors and shapes. Because these results depend strongly on the optimization algorithm, the authors argue that the choice of algorithm determines the art style.
How did you find it? This article introduced minimalist art generated by a method that combines ES and CLIP as a computer art algorithm. In their experiments, the authors verified that geometric abstractions can be generated to match both target images and concepts expressed in human language. The authors argue that artists can create unique works through such combinations of algorithms, and the easy-to-use source code is published to support this, so it is worth trying. Finally, through this article, I hope to share that AI still has a lot of potential in diverse domains, including art.