[ArtCoder] Generate Art-like QR Code With Style Conversion!

Style_Transfer 10/11/2021

3 main points
✔️ Using Neural Style Transfer to generate art-like QR codes
✔️ Proposed an end-to-end QR code generation method based on three loss functions
✔️ Successfully generated robust QR codes that can be read by real applications

An End-to-end Method for Producing Scanning-robust Stylized QR Codes
written by Hao Su, Jianwei Niu, Xuefeng Liu, Qingfeng Li, Ji Wan, Mingliang Xu, Tao Ren
(Submitted on 16 Nov 2020)
Comments: Published on arxiv.
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)


The images used in this article are from the paper, the introductory slides, or were created based on them.

Introduction

In this article, we introduce research on generating art-like QR codes using Neural Style Transfer.

As in general image style transfer, the proposed method (ArtCoder) generates a QR code from a source image (Content), a target image (Style), and a message to embed, as follows.

If you scan this image with a smartphone or another device, you will see the string "Thank you for reviewing our paper."

The proposed method (ArtCoder)

pipeline

The pipeline of the proposed method is as follows.

The proposed method is modeled as a function $\Psi$ that generates a QR code $Q=\Psi(I_s, I_c, M)$ from a style image $I_s$, a content image $I_c$, and a message $M$.

In this case, the objective function (loss function) $L_{total}$ of $Q=\Psi(I_s,I_c,M)$ is defined by the following equation which combines three loss functions.

$L_{total}=\lambda_1L_{style}(I_s,Q)+\lambda_2L_{content}(I_c,Q)+\lambda_3L_{code}(M,Q)$

The $\lambda_1,\lambda_2,\lambda_3$ are parameters that indicate the weights of the three functions.
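As a minimal sketch, the weighted sum can be written as a plain Python function. The function and argument names are hypothetical; the default weights are the values reported in the experiment setup of this article.

```python
def total_loss(style_loss, content_loss, code_loss,
               lambda_style=1e15, lambda_content=1e7, lambda_code=1e20):
    """Weighted sum of the three loss terms that make up L_total."""
    return (lambda_style * style_loss
            + lambda_content * content_loss
            + lambda_code * code_loss)
```

Raising one weight relative to the others makes the optimization favor that term, for example a larger `lambda_code` favors readability over appearance.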

In addition, style loss $L_{style}$, content loss $L_{content}$, and code loss $L_{code}$ are used to preserve the style, content, and message content of QR Code, respectively. Let's look at each loss in turn below.

Style loss $L_{style}$ and content loss $L_{content}$

The style loss $L_{style}$ and content loss $L_{content}$ are used to ensure that the generated QR code retains the style of the style image and the content of the content image. Specifically, following existing work on style transfer (1, 2), they are defined by the following equations.

$L_{style}(I_s,Q)=\frac{1}{C_sH_sW_s}||G[f_s(I_s)]-G[f_s(Q)]||^2_2$

$L_{content}(I_c,Q)=\frac{1}{C_cH_cW_c}||f_c(I_c)-f_c(Q)||^2_2$

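where $G$ denotes the Gram matrix, $f_s$ ($f_c$) denotes the feature map extracted from the $s$-th ($c$-th) layer of the pre-trained VGG-19, and $C$, $H$, and $W$ are the number of channels, height, and width of the corresponding feature map.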
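To make the two definitions concrete, here is a small pure-Python sketch that evaluates them on feature maps given as nested lists of shape C x H x W. The feature maps would normally come from VGG-19; here they are plain inputs, all function names are hypothetical, and the norm is assumed to be the sum of squared element-wise differences, normalized by the size of the feature map of $Q$.

```python
def flatten(feature):
    """Reshape a C x H x W nested list into C rows of H*W values."""
    return [[v for row in channel for v in row] for channel in feature]


def gram(feature):
    """Gram matrix: inner products between every pair of channels."""
    rows = flatten(feature)
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in rows] for ri in rows]


def size(feature):
    """Number of elements C * H * W in a feature map."""
    return len(feature) * len(feature[0]) * len(feature[0][0])


def style_loss(feat_style, feat_q):
    """Squared distance between Gram matrices, divided by C*H*W."""
    gs, gq = gram(feat_style), gram(feat_q)
    sq = sum((x - y) ** 2 for rs, rq in zip(gs, gq) for x, y in zip(rs, rq))
    return sq / size(feat_q)


def content_loss(feat_content, feat_q):
    """Squared distance between the feature maps, divided by C*H*W."""
    fc, fq = flatten(feat_content), flatten(feat_q)
    sq = sum((x - y) ** 2 for rc, rq in zip(fc, fq) for x, y in zip(rc, rq))
    return sq / size(feat_q)
```

Because the Gram matrix only records channel correlations, the style loss ignores where a pattern appears in the image, whereas the content loss compares the feature maps position by position.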

About code loss $L_{code}$.

The code loss $L_{code}$ is used to control the content of the generated QR code using the SS Layer (Sampling-Simulation layer), a virtual QR code reader that simulates the sampling process of the QR code reader.

About Sampling-Simulation layer

In Google ZXing, which is used to decode QR codes, the QR code reader samples and decodes the center pixel of each module (the black and white squares in the QR code).

When a QR code reader scans a QR code, the probability $g_{M_k(i,j)}$ that the pixel at coordinate $(i,j)$, measured from the center of module $M_k$, is sampled is assumed to follow the equation below.

$g_{M_k(i,j)}=\frac{1}{2\pi\sigma^2}e^{-\frac{i^2+j^2}{2\sigma^2}}$
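The equation is a two-dimensional Gaussian centered on the module, so pixels near the center are the most likely to be sampled. A minimal sketch that tabulates these weights for one module of $a \times a$ pixels follows; the function name and the choice of pixel-center coordinates are assumptions made for illustration.

```python
import math


def gaussian_kernel(a, sigma):
    """Return an a x a table of g(i, j), with (i, j) measured from the module center."""
    half = (a - 1) / 2.0  # offset of the module center in pixel units
    return [[math.exp(-((r - half) ** 2 + (c - half) ** 2) / (2 * sigma ** 2))
             / (2 * math.pi * sigma ** 2)
             for c in range(a)]
            for r in range(a)]
```

A small `sigma` concentrates almost all of the weight on the center pixel, which approaches the center-pixel sampling described for ZXing; a larger `sigma` spreads the weight across the module.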

In the Sampling-Simulation layer, based on the aforementioned equation, the sampling process when the generated QR Code is read by an actual QR Code reader is simulated to improve the robustness of the QR Code reading.

Specifically, for a QR code of $m \times m$ modules (each of $a \times a$ pixels), we perform a convolution with kernel size $a$, stride $a$, and padding 0, and output an $m \times m$ feature map $F=l_{ss}(Q)$.

Based on this feature map $F=l_{ss}(Q)$ and $g_{M_k(i,j)}$, the bit $F_{M_k}$ corresponding to the module $M_k$ is given by

$F_{M_k}=\sum_{(i,j) \in M_k} g_{M_k(i,j)} \cdot Q_{M_k(i,j)}$

The $F_{M_k}$ is considered to simulate whether each module is decoded as 0 or 1 when the QR code is actually read by a QR code reader.
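The strided convolution amounts to one weighted sum per module. The sketch below applies a given $a \times a$ weight table to a grayscale image stored as a nested list; the function name is hypothetical, and the image side is assumed to be an exact multiple of $a$.

```python
def sampling_simulation(image, kernel):
    """Weighted sum over each a x a module (kernel size a, stride a, padding 0)."""
    a = len(kernel)
    m = len(image) // a  # number of modules per side
    feature = []
    for mr in range(m):
        row = []
        for mc in range(m):
            total = 0.0
            for i in range(a):
                for j in range(a):
                    total += kernel[i][j] * image[mr * a + i][mc * a + j]
            row.append(total)
        feature.append(row)
    return feature
```

For example, a 4 x 4 image with a 2 x 2 kernel yields a 2 x 2 feature map, one value per module.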

Code loss $L_{code}$

The code loss $L_{code}$ is obtained as the sum of the subcode losses $L^{M_k}_{code}$ corresponding to each module $M_k \in Q$ of QR code $Q$.

$L_{code}=\sum_{M_k \in Q}L^{M_k}_{code}$

where $L^{M_k}_{code}$ is given by the following equation.

$L^{M_k}_{code}=K_{M_k} \cdot ||\textit{M}_{M_k}-F_{M_k}||$

where $\textit{M}$ is an $m \times m$ matrix representing the target QR code, that is, whether each module should be 0 or 1. Also, $K$ is the activation map computed by the competition mechanism described below.
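Put together, the code loss is a masked sum of per-module errors. In this minimal sketch the three inputs are $m \times m$ nested lists, the function name is hypothetical, and the norm of a single module value is taken to be its absolute value, with target and feature values assumed to be on the same 0 to 1 scale.

```python
def code_loss(target, feature, activation):
    """Sum over all modules of K * |M - F|."""
    return sum(k * abs(t - f)
               for t_row, f_row, k_row in zip(target, feature, activation)
               for t, f, k in zip(t_row, f_row, k_row))
```

Modules whose activation is 0 contribute nothing, so only the activated modules are pulled toward their target bits.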

Competition mechanism

By controlling the activation map $K$, the competition mechanism decides whether to prioritize the visual quality of the generated QR code ($L_{style}, L_{content}$) or whether the QR code can be accurately read by a QR code reader ($L_{code}$).

The pipeline for this competition mechanism is shown in the figure below.

Specifically, the activation map $K$ is 0 if the module $M_k$ is correct and 1 if it is wrong when the QR code $Q$ is read by the virtual QR code reader $R_{QR}$.

By adopting such a competition mechanism, we prioritize and optimize $L_{code}$ for wrong modules and $L_{style}, L_{content}$ for correct modules, so that both image quality and robustness of QR code reading are properly maintained.

About the virtual QR code reader $R_{QR}$.

When a normal QR code reader reads QR code $Q$, QR code $Q$ is converted to grayscale, and binarization is performed as follows according to the value of each module.

Here, $T$ is the threshold for determining whether a module is black or white.

On the other hand, when reading a QR code $Q$ with a virtual QR code reader, binarization is performed for each module $M_k$ as follows.

Here, $T_b$ and $T_w$ are the thresholds for determining whether a module is black or white, respectively: $T_b$ is used when the module $M_k$ should be black ($\textit{M}_{M_k}=0$), and $T_w$ is used when it should be white ($\textit{M}_{M_k}=1$).

In other words, by using different thresholds according to the ideal value of each module ($\textit{M}_{M_k}$), we can discriminate the value of each module more strictly than in actual QR code reading.

At this time, we introduce $\eta=\frac{|T-T_b|}{T}=\frac{|T_w-T|}{(255-T)}$ as a parameter to indicate the robustness of QR code reading. By setting this parameter $\eta$, we can make a trade-off between the quality of the image and the robustness of the QR code reading.
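Assuming $T_b \le T \le T_w$, the definition of $\eta$ can be solved for the two thresholds: $T_b = T(1-\eta)$ and $T_w = T + \eta(255-T)$. The sketch below uses them to build the activation map $K$. The function names, the default threshold of 128, and the exact comparison rule (a black module counts as correct only below $T_b$, a white one only above $T_w$) are assumptions for illustration, not taken from the paper.

```python
def strict_thresholds(threshold, eta):
    """Solve eta = (T - T_b) / T = (T_w - T) / (255 - T) for T_b and T_w."""
    t_b = threshold * (1.0 - eta)
    t_w = threshold + eta * (255.0 - threshold)
    return t_b, t_w


def activation_map(target, sampled, threshold=128.0, eta=0.6):
    """K is 0 where the virtual reader decodes a module correctly, 1 otherwise."""
    t_b, t_w = strict_thresholds(threshold, eta)
    k = []
    for t_row, s_row in zip(target, sampled):
        k_row = []
        for bit, value in zip(t_row, s_row):
            # Assumed rule: black targets must be darker than T_b,
            # white targets must be brighter than T_w.
            correct = value < t_b if bit == 0 else value > t_w
            k_row.append(0 if correct else 1)
        k.append(k_row)
    return k
```

With $\eta = 0$ both thresholds collapse to $T$ and the virtual reader behaves like a normal one; as $\eta$ grows, the band between $T_b$ and $T_w$ widens and more modules are flagged as wrong.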

Based on the above, the loss function is optimized according to the pipeline described above, and the generated QR code is updated iteratively.

experimental results

Experiment setup

Regarding the datasets used in our experiments, the content image dataset consists of 100 512x512 images (portraits, cartoons, landscapes, animals, logos, etc.), and the style image dataset consists of 30 images representing various styles.

Experiments are performed on an NVIDIA Tesla V100 GPU with the hyperparameters set to $\lambda_1=10^{15}$, $\lambda_2=10^7$, $\lambda_3=10^{20}$, a learning rate of 0.001, and robustness parameter $\eta=0.6$. The number of iterations for QR code generation is $10^4$.

About the quality of the generated QR Code

Comparison with existing methods

To begin with, the results of comparison with existing NST (Neural Style Transfer) techniques or QR code generation methods are as follows.

Comparing Ours with the others, the generated results are comparable in quality to those of the existing NST methods, and they do not contain the large amount of point-like noise seen in existing art-like QR code generation methods.

For the weight parameter $\lambda$.

Here are the results of varying the $\lambda_2$ for content loss among the weight parameters.

As shown in the figure, it can be seen that the generated image can be controlled by varying the weight parameter.

On the Robustness of QR Code Reading

In the following experiment, we verify the robustness of the generated QR Code to be read by a real application.

Qualitative Analysis of Robustness

The following image shows a case in which a part of a generated QR Code is enlarged.

If you look at the center of each module after binarization (blue and red circles in Binary results), you can see that the black and white regions are well separated for each module, and the result after sampling is the same as the ideal result.

Therefore, the generated QR code can be robustly read by common QR code readers.

For the parameter $\eta$, which indicates the robustness

The results for varying the robustness parameter $\eta$ are as follows.

Among the figures, (a) shows the generated image, (b) shows a magnified part of it, (c) shows the error module, and (d) shows the magnitude of each loss.

In general, increasing $\eta$ increases the code loss and improves the robustness but decreases the image quality. On the other hand, when $\eta$ is small, the code loss converges to zero quickly, and the image quality is improved but the robustness is decreased.

About the success probability of reading

For the generated image, the following results are obtained when the image is displayed on the screen in three sizes (3cm x 3cm, 5cm x 5cm, and 7cm x 7cm) and scanned at a distance of 20cm.

This table shows the average number of successful scans for 50 scans of 30 QR codes using each mobile device (successful decoding within 3 seconds is considered successful).

Overall, the success rate is at least 96%, indicating that the proposed method is robust enough for real applications. (Even scans counted as failures appear to have been decoded eventually; they simply took longer than 3 seconds.)

The effect of distance and angle on reading

The results of the comparison with the existing method when changing the distance and angle and $\eta$ are shown in the following figure.

As a result of the comparison, when $\eta>0.6$, the robustness of the proposed method is equal to or slightly inferior to the existing methods, and it is robust enough to be used in practice.

summary

In this article, we have presented work on generating art-like QR codes using Neural Style Transfer. The proposed method (ArtCoder) is capable of generating QR codes with excellent quality and robustness that can be read by real-world applications.

If the generation speed is sufficiently improved in future research, we may see the day when we encounter eye-catching QR codes with an artistic style in the real world!