The Key To The Colorization Task Is Object Recognition!

Image Recognition 09/11/2020

3 main points

✔️ A new learning framework for instance-aware colorization
✔️ Improved accuracy by extracting image features at the instance and full image levels and optimizing feature fusion
✔️ State-of-the-art performance compared to conventional methods

Instance-aware Image Colorization
written by Jheng-Wei Su, Hung-Kuo Chu, Jia-Bin Huang
(Submitted on 21 May 2020)
Comments: Accepted at CVPR2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Introduction

The task of converting a grayscale image into a color image has seen considerable success with the development of machine learning. Colorizing old black-and-white photographs to bring back the colors of their time has been a hot topic. In practice, however, this task is an ill-posed problem, as it involves predicting the two missing channels from a single-channel grayscale image. Furthermore, it is a multimodal problem: an object can plausibly take several colors, and even a piece of clothing could be red, green, or blue.
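To make the ill-posedness concrete, here is a minimal sketch showing how many distinct colors collapse to the same gray level. It assumes the common Rec. 601 luma weights for the RGB-to-grayscale conversion, which this article does not specify, and the helper names `to_gray` and `colors_per_gray_level` are made up for illustration.

```python
from collections import defaultdict


def to_gray(r, g, b):
    """Convert an RGB color to a gray level using Rec. 601 luma weights (an assumption)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def colors_per_gray_level(step=17):
    """Group a coarse grid of RGB colors by the gray level they collapse to."""
    buckets = defaultdict(list)
    levels = range(0, 256, step)  # 16 values per channel when step=17
    for r in levels:
        for g in levels:
            for b in levels:
                buckets[to_gray(r, g, b)].append((r, g, b))
    return buckets


buckets = colors_per_gray_level()
# Pick the gray level that the largest number of distinct colors map to.
gray, candidates = max(buckets.items(), key=lambda kv: len(kv[1]))
print(f"gray level {gray} is produced by {len(candidates)} different colors")
```

Since 4096 sampled colors are squeezed into at most 256 gray levels, at least one gray level must correspond to 16 or more different colors. A colorization model has to choose among such candidates using context alone.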

Previous method

Conventional methods include user-guided colorization techniques, which use reference images as guidance, and learning-based techniques, which learn colorization from large datasets thanks to advances in deep learning. The latter is now the main approach to colorization because it works end-to-end, although its results inevitably depend on the training data.

However, it has been reported that the performance of these existing methods is significantly worse for images with multiple objects. (Figure 1)

Figure 1: Problems with conventional methods when an image contains multiple objects

In the image above, you can see that the conventional methods do not colorize well when there are multiple objects. It is particularly noteworthy that the proposed method colorizes even the car in the back of the image well, just as it does the oranges in the lower image.

Proposed method

To address the above problem, the authors reasoned that colorization accuracy could be improved by separating the "object" from the "background". In other words, colorization is performed at the instance level.

There are two reasons why instance-level colorization is expected to increase accuracy:

Unlike existing methods that learn to colorize an entire image, learning to colorize individual instances is an easier task because the network does not have to deal with complex, irrelevant background regions.
By taking local objects as input, the network of the proposed method can learn object-level color representations, which avoids color confusion with the background.
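As a rough illustration of "using local objects as input", the sketch below crops an instance out of a grayscale image given a bounding box. The function name `crop_instance`, the `(x0, y0, x1, y1)` box format, and the toy image are all assumptions made for this example; in a real pipeline the boxes would come from an object detector.

```python
def crop_instance(image, box):
    """Crop a 2D grayscale image (a list of rows) to a bounding box.

    box = (x0, y0, x1, y1) in pixel coordinates, with x1 and y1 exclusive.
    Coordinates outside the image are clamped to the image border.
    """
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


# Toy 4x6 "image" whose pixel value encodes its position (10 * row + column).
image = [[10 * y + x for x in range(6)] for y in range(4)]

# Hypothetical detection: an object covering columns 1-3 and rows 1-2.
crop = crop_instance(image, (1, 1, 4, 3))
print(crop)
```

The cropped patch contains only the object and a little of its surroundings, so a network trained on such patches never sees most of the background.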

Overview

The proposed method consists of three parts (Figure 2):

A pre-trained object detection model that generates cropped instance images
Two backbone networks, trained end-to-end, for instance-level and full-image colorization, respectively
A module that selectively fuses the features extracted from the layers of the two networks
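The third part, selective feature fusion, can be sketched as a per-pixel weighted blend. This is a simplified illustration and not the authors' implementation: it assumes single-channel feature maps, assumes each map comes with a weight logit per pixel, and assumes an instance contributes only inside its bounding box. The function name `fuse` and the toy values are made up for this example.

```python
import math


def fuse(full_feat, full_w, instances):
    """Blend a full-image feature map with instance feature maps.

    full_feat, full_w: 2D lists (H x W) of feature values and weight logits.
    instances: list of (feat, w, (x0, y0)), where feat and w are 2D lists for
    one cropped instance and (x0, y0) is the top-left corner of its box.
    """
    height, width = len(full_feat), len(full_feat[0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect every (feature, logit) pair that covers this pixel.
            feats, logits = [full_feat[y][x]], [full_w[y][x]]
            for feat, w, (x0, y0) in instances:
                iy, ix = y - y0, x - x0
                if 0 <= iy < len(feat) and 0 <= ix < len(feat[0]):
                    feats.append(feat[iy][ix])
                    logits.append(w[iy][ix])
            # Softmax over the logits, then a weighted sum of the features.
            peak = max(logits)
            exps = [math.exp(v - peak) for v in logits]
            fused[y][x] = sum(f * e for f, e in zip(feats, exps)) / sum(exps)
    return fused


full_feat = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
full_w = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# One hypothetical 1x1 instance placed at column 1, row 0, with equal weight.
fused = fuse(full_feat, full_w, [([[3.0]], [[0.0]], (1, 0))])
print(fused)
```

Where the instance overlaps the image, the output is a mix of both features; elsewhere the full-image feature passes through unchanged. Raising the instance's weight logit moves the blend toward the instance feature, which is the sense in which the fusion is "selective".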

