Don't CNNs Know Location?

Image Recognition 28/10/2020

3 main points
✔️ CNNs encode location information
✔️ Zero padding allows CNNs to learn location information
✔️ The location information is stored in deeper layers
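
To make the zero-padding point concrete, here is a minimal sketch in Python with NumPy. It is not the authors' experiment. The helper `conv2d`, the 5x5 all-ones image, and the 3x3 all-ones kernel are assumptions chosen only for illustration. The sketch applies the same filter to a constant image with and without zero padding. Without padding every output value is identical, so nothing reveals where a pixel sits. With zero padding the values near the border differ from those in the interior, which gives the network a cue about position.

```python
import numpy as np

def conv2d(image, kernel, pad):
    """Plain 2D cross-correlation (stride 1) with optional zero padding.

    Hypothetical helper written for this illustration only.
    """
    if pad > 0:
        image = np.pad(image, pad, mode="constant", constant_values=0)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Sum of the element-wise product over the current window.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A constant image has no texture, so any variation in the output
# can only come from how the border is handled.
image = np.ones((5, 5))
kernel = np.ones((3, 3))

no_pad = conv2d(image, kernel, pad=0)    # 3x3 output, no padding
zero_pad = conv2d(image, kernel, pad=1)  # 5x5 output, zero padding

print(no_pad)
print(zero_pad)
```

In the zero-padded output, corner, edge, and interior positions take different values even though the input is uniform. Deeper layers can propagate this border signal inward, which is consistent with the third point above.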

How Much Position Information Do Convolutional Neural Networks Encode?
written by Md Amirul Islam, Sen Jia, Neil D. B. Bruce
(Submitted on 22 Jan 2020)
Comments: Accepted to ICLR2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Introduction

CNNs have been successful in many computer vision tasks such as classification and detection. They are mainly used as feature extractors for objects. Because they are trained with local filters, they are generally assumed to carry no location information. This limitation has been pointed out before, and models designed to capture location information, such as capsule networks, have been proposed. So many of you probably assume that CNNs hold very little location information.

However, this study may change that implicit idea.
See Figure 1.

Figure 1: Prominent areas of the image.

Look at the top row. In the leftmost image, the salient region falls on the man in the middle. Similarly, most of the images have a salient region around the center. The bottom row shows each image cropped along the blue line drawn in the top row. Notice something strange here. Cropping alone does not change the texture or other content, so the salient region should not change either. However, the salient region of the cropped image clearly shifts to the left, toward the center region. (I think this happens because the network learned from the training images that the salient region tends to lie at the center of the image.) This contradicts the implicit belief that CNNs have no location information.

From this result, the authors hypothesize that CNNs may be encoding (acquiring) location information in a way that humans simply do not notice. This article is about the location information in CNNs.

This paper was accepted to ICLR 2020, and every reviewer gave it a perfect score of 8 points (only 34 out of 2594 papers received a perfect score). If you are interested in the scores of the other papers, see here.

