It's Out At Last! Truly DETR! An Innovative Paradigm For Object Detection

Main points
✔️ Applying the Transformer to object detection
✔️ End-to-end models to reduce manual design
✔️ Redefining object detection as a direct set prediction problem

End-to-End Object Detection with Transformers
written by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
(Submitted on 26 May 2020 (v1), last revised 28 May 2020 (this version, v3))

Comments: Published by arXiv
Subjects: Computer Vision and Pattern Recognition (cs.CV)



Object detection has become a commodity: models such as YOLO and SSD, used in a wide range of scenarios, have been quantized and now run on computers as small as a Raspberry Pi.
Into this field comes a new paradigm built on the Transformer, the architecture that has taken natural language processing by storm.
DETR = DEtection TRansformer
Structurally, it is a CNN backbone with a Transformer connected to its output.
Structure of DETR from paper Fig. 1
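
To make the "CNN followed by a Transformer" structure concrete, here is a minimal sketch that only traces tensor shapes through such a pipeline; it performs no real computation. The function name `trace_shapes` and the `DetrShapes` container are hypothetical. The default values (a backbone stride of 32, 2048 CNN output channels, a Transformer width of 256, and 100 object queries) are assumptions taken from commonly cited DETR settings, not from this article.

```python
import math
from dataclasses import dataclass


@dataclass
class DetrShapes:
    feature_map: tuple     # (channels, h, w) produced by the CNN backbone
    encoder_input: tuple   # (h * w, d_model): feature map flattened into a sequence
    decoder_output: tuple  # (num_queries, d_model): one slot per candidate object


def trace_shapes(height, width, stride=32, cnn_channels=2048,
                 d_model=256, num_queries=100):
    """Follow shapes through a CNN -> Transformer pipeline (illustrative only)."""
    # The CNN downsamples the image by `stride` in each spatial dimension.
    h = math.ceil(height / stride)
    w = math.ceil(width / stride)
    return DetrShapes(
        feature_map=(cnn_channels, h, w),
        # A Transformer consumes a sequence, so the 2D grid becomes h * w tokens,
        # each projected from cnn_channels down to d_model.
        encoder_input=(h * w, d_model),
        # The decoder emits a fixed number of outputs regardless of image size.
        decoder_output=(num_queries, d_model),
    )


shapes = trace_shapes(800, 1216)
print(shapes)
```

Under these assumptions, an 800 x 1216 image becomes a 25 x 38 feature map, that is, a sequence of 950 tokens for the Transformer, while the number of output slots stays fixed at 100.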
At first glance it looks like a simple connection of a CNN and a Transformer, but this is an excellent paper with solid experimental comparisons and extensive discussion. It is rich in know-how on both object detection and Transformers, and there is a lot to learn from it.
In addition, the implementation is available on GitHub together with trained models, so you can try it out right away. With more than 3.1k stars already (as of June 2020), this newcomer is attracting a lot of attention.
Let's take a look at that DETR, and the key points worth noting.

Point 1: End-to-End Philosophy

One of the most commonly cited reasons for the growth of deep learning is end-to-end learning, which automates feature extraction (MathWorks, "Three Things You Need to Know About This").


This "end-to-end philosophy" has already led to major breakthroughs in machine translation and speech recognition. It is significant that it now also proves effective for object detection, a field where hand-designed components still largely determine performance.
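
The "direct set prediction" idea listed in the main points can be illustrated with a toy example: predictions and ground-truth boxes are treated as unordered sets and paired one-to-one so that the total cost is minimal. The sketch below is a simplified stand-in, not the paper's implementation. It assumes boxes in `(cx, cy, w, h)` format, uses only an L1 box distance as the cost, and searches by brute force, whereas DETR uses the Hungarian algorithm with a cost that also involves class predictions. The names `box_l1` and `match_predictions` are hypothetical.

```python
from itertools import permutations


def box_l1(a, b):
    """L1 distance between two boxes given as (cx, cy, w, h)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_predictions(pred_boxes, gt_boxes):
    """Return (assignment, cost) for the cheapest one-to-one pairing.

    assignment[i] is the index of the prediction matched to ground-truth
    box i. Brute force over permutations, so only practical for a handful
    of boxes; assumes len(pred_boxes) >= len(gt_boxes).
    """
    best_cost, best_assignment = float("inf"), None
    for perm in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(box_l1(pred_boxes[p], gt_boxes[i]) for i, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assignment = cost, perm
    return best_assignment, best_cost


# Three predictions, two ground-truth objects (made-up illustrative values).
preds = [(0.50, 0.50, 0.20, 0.20), (0.10, 0.10, 0.05, 0.05), (0.80, 0.30, 0.30, 0.40)]
gts = [(0.82, 0.31, 0.28, 0.40), (0.48, 0.52, 0.20, 0.22)]

assignment, cost = match_predictions(preds, gts)
print(assignment, round(cost, 2))
```

Here the first ground-truth box is paired with prediction 2 and the second with prediction 0; prediction 1 stays unmatched and would be trained to output "no object". Because each object is claimed by exactly one prediction, duplicate detections are discouraged during training rather than filtered out afterwards by a hand-designed step.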

Examples of human design elements in object detection
