Everything You Need To Know About Transformer In Computer Vision! Part 3/5 (Segmentation, Image Generation, Low-Level Vision Tasks)

Transformer 26/01/2021

3 main points
✔️Explains the applications of Transformer in computer vision
✔️Explains examples of research in segmentation, image generation, and low-level vision tasks
✔️Of 37 models in total, 9 are described in this article

Transformers in Vision: A Survey
written by Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, Mubarak Shah
(Submitted on 4 Jan 2021)
Comments: 24 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Introduction

The Transformer has shown high performance not only in natural language processing but also in many other areas. Among them, research applying the Transformer to computer vision, the field that deals with visual information, has become very popular.

In view of this interest, this series provides an extensive and detailed description of the Transformer in computer vision.

This article presents examples of applications of the Transformer in segmentation, image generation, and low-level vision tasks.

Two models for segmentation, four for image generation, and three for low-level vision tasks are described.

See Parts 2, 4, and 5 for examples of research on other tasks, and Part 1 for a general description of transformers in computer vision.

Overall Structure (Table of Contents)

1. About Transformer in Computer Vision (Part 1)

2. Concrete Examples of Transformer in Computer Vision (Parts 2-5)
2.1 Transformers for Image Recognition (Part 2)
2.2 Transformers for Object Detection (Part 2)
2.3 Transformers for Segmentation
・Axial-attention for Panoptic Segmentation
・CMSA (Cross-modal Self-Attention)
2.4 Transformers for Image Generation
・iGPT (Image GPT)
・Image Transformer
・High-resolution Image Synthesis
・SceneFormer
2.5 Transformers for Low-level Vision
・TTSR (Texture Transformer Network for Image Super-Resolution)
・IPT (Image Processing Transformer)
・ColTran (Colorization Transformer)
2.6 Transformers for Multi-modal Tasks (Part 4)
2.7 Video Understanding (Part 5)
2.8 Transformers in Low-shot Learning (Part 5)
2.9 Transformers for Clustering (Part 5)
2.10 Transformers for 3D Analysis (Part 5)