ECCV2020_Papers Accepted List

Article 31/08/2020

The 2020 European Conference on Computer Vision (ECCV 2020), which took place August 24-27, 2020, is a conference in the field of image analysis. The accepted papers are listed below.

Quaternion Equivariant Capsule Networks for 3D Point Clouds

[supplementary material]

DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares

[supplementary material]

NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural Architecture Search

[supplementary material]

Describing Textures using Natural Language

[supplementary material]

Empowering Relational Network by Self-Attention Augmented Conditional Random Fields for Group Activity Recognition

[supplementary material]

AiR: Attention with Reasoning Capability

[supplementary material]

Self6D: Self-Supervised Monocular 6D Object Pose Estimation

[supplementary material]

Invertible Image Rescaling

[supplementary material]

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation

[supplementary material]

House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation

[supplementary material]

Crowdsampling the Plenoptic Function

[supplementary material]

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

End-to-End Object Detection with Transformers

[supplementary material]

DeepSFM: Structure From Motion Via Deep Bundle Adjustment

[supplementary material]

Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry

[supplementary material]

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

[supplementary material]

Conditional Convolutions for Instance Segmentation

[supplementary material]

MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution

[supplementary material]

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

[supplementary material]

Privacy Preserving Structure-from-Motion

[supplementary material]

Rewriting a Deep Generative Model

[supplementary material]

Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets

[supplementary material]

Long-term Human Motion Prediction with Scene Context

[supplementary material]

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

[supplementary material]

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes

[supplementary material]

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

[supplementary material]

Learning and Aggregating Deep Local Descriptors for Instance-level Recognition

A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem

[supplementary material]

Learn to Recover Visible Color for Video Surveillance in a Day

[supplementary material]

Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images

[supplementary material]

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation

[supplementary material]

BorderDet: Border Feature for Dense Object Detection

[supplementary material]

Regularization with Latent Space Virtual Adversarial Training

[supplementary material]

Du²Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels

[supplementary material]

Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning

Targeted Attack for Deep Hashing based Retrieval

[supplementary material]

Gradient Centralization: A New Optimization Technique for Deep Neural Networks

[supplementary material]

Content-Aware Unsupervised Deep Homography Estimation

[supplementary material]

Multi-View Optimization of Local Feature Geometry

[supplementary material]

The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization

[supplementary material]

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video

[supplementary material]

Learning Stereo from Single Images

[supplementary material]

Prototype Rectification for Few-Shot Learning

[supplementary material]

Learning Feature Descriptors using Camera Pose Supervision

[supplementary material]

Semantic Flow for Fast and Accurate Scene Parsing

[supplementary material]

Appearance Consensus Driven Self-Supervised Human Mesh Recovery

[supplementary material]

Diffraction Line Imaging

[supplementary material]

Aligning and Projecting Images to Class-conditional Generative Networks

[supplementary material]

Suppress and Balance: A Simple Gated Network for Salient Object Detection

[supplementary material]

Visual Memorability for Robotic Interestingness via Unsupervised Online Learning

[supplementary material]

Post-Training Piecewise Linear Quantization for Deep Neural Networks

[supplementary material]

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

[supplementary material]

In-Home Daily-Life Captioning Using Radio Signals

[supplementary material]

Self-Challenging Improves Cross-Domain Generalization

[supplementary material]

A Competence-aware Curriculum for Visual Concepts Learning via Question Answering

[supplementary material]

Multitask Learning Strengthens Adversarial Robustness

[supplementary material]

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search

[supplementary material]

Improving Deep Video Compression by Resolution-adaptive Flow Coding

[supplementary material]

Motion Capture from Internet Videos

[supplementary material]

Appearance-Preserving 3D Convolution for Video-based Person Re-identification

Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization

[supplementary material]

Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation

[supplementary material]

Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures

[supplementary material]

Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling

[supplementary material]

Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction

[supplementary material]

Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network

[supplementary material]

Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation

[supplementary material]

CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image

[supplementary material]

Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs

[supplementary material]

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

[supplementary material]

Domain-invariant Stereo Matching Networks

[supplementary material]

DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling

[supplementary material]

Content Adaptive and Error Propagation Aware Deep Video Compression

[supplementary material]

Towards Streaming Perception

[supplementary material]

Towards Automated Testing and Robustification by Semantic Adversarial Data Generation

[supplementary material]

Adversarial Generative Grammars for Human Activity Prediction

[supplementary material]

GDumb: A Simple Approach that Questions Our Progress in Continual Learning

[supplementary material]

Learning Lane Graph Representations for Motion Forecasting

[supplementary material]

What Matters in Unsupervised Optical Flow

[supplementary material]

Synthesis and Completion of Facades from Satellite Imagery

[supplementary material]

Mapillary Planet-Scale Depth Dataset

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction

[supplementary material]

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters

[supplementary material]

EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning

[supplementary material]

Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation

[supplementary material]

Cross-Domain Cascaded Deep Translation

[supplementary material]

“Look Ma, no landmarks!” – Unsupervised, Model-based Dense Face Alignment

[supplementary material]

Online Invariance Selection for Local Feature Descriptors

[supplementary material]

Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations

[supplementary material]

TextCaps: a Dataset for Image Captioning with Reading Comprehension

[supplementary material]

It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction

[supplementary material]

Learning What to Learn for Video Object Segmentation

[supplementary material]

SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing

[supplementary material]

LIMP: Learning Latent Shape Representations with Metric Preservation Priors

[supplementary material]

Unsupervised Sketch to Photo Synthesis

[supplementary material]

A Simple Way to Make Neural Networks Robust Against Diverse Image Corruptions

[supplementary material]

SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification

[supplementary material]

Hierarchical Face Aging through Disentangled Latent Characteristics

[supplementary material]

Hybrid Models for Open Set Recognition

TopoGAN: A Topology-Aware Generative Adversarial Network

[supplementary material]

Learning to Localize Actions from Moments

[supplementary material]

ForkGAN: Seeing into the Rainy Night

[supplementary material]

TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning

[supplementary material]

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval

[supplementary material]

TSIT: A Simple and Versatile Framework for Image-to-Image Translation

[supplementary material]

ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices

[supplementary material]

HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation

[supplementary material]

Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve

[supplementary material]

A Unified Framework of Surrogate Loss by Refactoring and Interpolation

[supplementary material]

Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images

[supplementary material]

Memory-augmented Dense Predictive Coding for Video Representation Learning

[supplementary material]

PointMixup: Augmentation for Point Clouds

[supplementary material]

Identity-Guided Human Semantic Parsing for Person Re-Identification

Learning Gradient Fields for Shape Generation

[supplementary material]

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder

[supplementary material]

Corner Proposal Network for Anchor-free, Two-stage Object Detection

PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click

Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing

[supplementary material]

Learning Delicate Local Representations for Multi-Person Pose Estimation

Learning to Plan with Uncertain Topological Maps

[supplementary material]

Neural Design Network: Graphic Layout Generation with Constraints

[supplementary material]

Learning Open Set Network with Discriminative Reciprocal Points

[supplementary material]

Convolutional Occupancy Networks

[supplementary material]

Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry

[supplementary material]

TIDE: A General Toolbox for Identifying Object Detection Errors

[supplementary material]

PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding

[supplementary material]

DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation

[supplementary material]

Circumventing Outliers of AutoAugment with Knowledge Distillation

S2DNet: Learning Image Features for Accurate Sparse-to-Dense Matching

[supplementary material]

RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving

[supplementary material]

Video Object Segmentation with Episodic Graph Memory Networks

[supplementary material]

Rethinking Bottleneck Structure for Efficient Mobile Network Design

[supplementary material]

Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks

[supplementary material]

Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach

REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets

[supplementary material]

Contrastive Learning for Weakly Supervised Phrase Grounding

[supplementary material]

Collaborative Learning of Gesture Recognition and 3D Hand Pose Estimation with Multi-Order Feature Analysis

[supplementary material]

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors

[supplementary material]

TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images

[supplementary material]

Semi-Siamese Training for Shallow Face Learning

[supplementary material]

GAN Slimming: All-in-One GAN Compression by A Unified Optimization Framework

[supplementary material]

Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition

Binarized Neural Network for Single Image Super Resolution

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

[supplementary material]

Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation

[supplementary material]

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

[supplementary material]

Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets

[supplementary material]

Hamiltonian Dynamics for Real-World Shape Interpolation

[supplementary material]

Learning to Scale Multilingual Representations for Vision-Language Tasks

[supplementary material]

Multi-modal Transformer for Video Retrieval

[supplementary material]

Feature Representation Matters: End-to-End Learning for Reference-based Image Super-resolution

RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera

[supplementary material]

Surface Normal Estimation of Tilted Images via Spatial Rectifier

[supplementary material]

Multimodal Shape Completion via Conditional Generative Adversarial Networks

[supplementary material]

Generative Sparse Detection Networks for 3D Single-shot Object Detection

[supplementary material]

Grounded Situation Recognition

[supplementary material]

Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos

[supplementary material]

Unpaired Learning of Deep Image Denoising

[supplementary material]

Self-supervising Fine-grained Region Similarities for Large-scale Image Localization

[supplementary material]

Rotationally-Temporally Consistent Novel View Synthesis of Human Performance Video

[supplementary material]

Side-Aware Boundary Localization for More Precise Object Detection

[supplementary material]

SF-Net: Single-Frame Supervision for Temporal Action Localization

[supplementary material]

Negative Margin Matters: Understanding Margin in Few-shot Classification

[supplementary material]

Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References

[supplementary material]

Tracking Objects as Points

[supplementary material]

CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

[supplementary material]

Transporting Labels via Hierarchical Optimal Transport for Semi-Supervised Learning

MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning

[supplementary material]

Learning to Factorize and Relight a City

[supplementary material]

Region Graph Embedding Network for Zero-Shot Learning

[supplementary material]

GRAB: A Dataset of Whole-Body Human Grasping of Objects

[supplementary material]

DEMEA: Deep Mesh Autoencoders for Non-Rigidly Deforming Objects

[supplementary material]

RANSAC-Flow: Generic Two-stage Image Alignment

[supplementary material]

Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds

[supplementary material]

Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images

[supplementary material]

Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking

[supplementary material]

Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application

[supplementary material]

MovieNet: A Holistic Dataset for Movie Understanding

[supplementary material]

Short-Term and Long-Term Context Aggregation Network for Video Inpainting

[supplementary material]

DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization

[supplementary material]

Face Super-Resolution Guided by 3D Facial Priors

[supplementary material]

Label Propagation with Augmented Anchors: A Simple Semi-Supervised Learning baseline for Unsupervised Domain Adaptation

[supplementary material]

Are Labels Necessary for Neural Architecture Search?

[supplementary material]

BLSM: A Bone-Level Skinned Model of the Human Mesh

[supplementary material]

Associative Alignment for Few-shot Image Classification

[supplementary material]

Cyclic Functional Mapping: Self-supervised Correspondence between Non-isometric Deformable Shapes

View-Invariant Probabilistic Embedding for Human Pose

[supplementary material]

Contact and Human Dynamics from Monocular Video

[supplementary material]

PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation

[supplementary material]

Points2Surf: Learning Implicit Surfaces from Point Clouds

[supplementary material]

Few-Shot Scene-Adaptive Anomaly Detection

[supplementary material]

Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting

[supplementary material]

Entropy Minimisation Framework for Event-based Vision Model Estimation

[supplementary material]

Reconstructing NBA Players

[supplementary material]

PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments

TENet: Triple Excitation Network for Video Salient Object Detection

Deep Feedback Inverse Problem Solver

[supplementary material]

Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification

Hallucinating Visual Instances in Total Absentia

[supplementary material]

Weakly-supervised 3D Shape Completion in the Wild

[supplementary material]

DTVNet: Dynamic Time-lapse Video Generation via Single Still Image

[supplementary material]

CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss

[supplementary material]

Collaborative Video Object Segmentation by Foreground-Background Integration

[supplementary material]

Adaptive Margin Diversity Regularizer for handling Data Imbalance in Zero-Shot SBIR

ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation

[supplementary material]

Calibration-free Structure-from-Motion with Calibrated Radial Trifocal Tensors

[supplementary material]

Occupancy Anticipation for Efficient Exploration and Navigation

[supplementary material]

Unified Image and Video Saliency Modeling

[supplementary material]

TAO: A Large-Scale Benchmark for Tracking Any Object

[supplementary material]

A Generalization of Otsu’s Method and Minimum Error Thresholding

[supplementary material]

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

[supplementary material]

Big Transfer (BiT): General Visual Representation Learning

[supplementary material]

VisualCOMET: Reasoning about the Dynamic Context of a Still Image

[supplementary material]

Few-shot Action Recognition with Permutation-invariant Attention

[supplementary material]

Character Grounding and Re-Identification in Story of Videos and Text Descriptions

AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling

[supplementary material]

Learning Visual Context by Comparison

[supplementary material]

Large Scale Holistic Video Understanding

[supplementary material]

Indirect Local Attacks for Context-aware Semantic Segmentation Networks

[supplementary material]

Predicting Visual Overlap of Images Through Interpretable Non-Metric Box Embeddings

[supplementary material]

Connecting Vision and Language with Localized Narratives

[supplementary material]

Adversarial T-shirt! Evading Person Detectors in A Physical World

[supplementary material]

Bounding-box Channels for Visual Relationship Detection

Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion

[supplementary material]

SRFlow: Learning the Super-Resolution Space with Normalizing Flow

[supplementary material]

DeepGMR: Learning Latent Gaussian Mixture Models for Registration

[supplementary material]

Active Perception using Light Curtains for Autonomous Driving

[supplementary material]

Invertible Neural BRDF for Object Inverse Rendering

Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network

[supplementary material]

Practical Deep Raw Image Denoising on Mobile Devices

[supplementary material]

SoundSpaces: Audio-Visual Navigation in 3D Environments

[supplementary material]

Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization

[supplementary material]

Erasing Appearance Preservation in Optimization-based Smoothing

[supplementary material]

Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler

[supplementary material]

Guided Deep Decoder: Unsupervised Image Pair Fusion

[supplementary material]

Filter Style Transfer between Photos

[supplementary material]

JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image

[supplementary material]

Dynamic Group Convolution for Accelerating Convolutional Neural Networks

[supplementary material]

RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering

Object-Contextual Representations for Semantic Segmentation

[supplementary material]

Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring

[supplementary material]

Joint Semantic Instance Segmentation on Graphs with the Semantic Mutex Watershed

[supplementary material]

Photon-Efficient 3D Imaging with A Non-Local Neural Network

[supplementary material]

GeLaTO: Generative Latent Textured Objects

[supplementary material]

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

[supplementary material]

Directional Temporal Modeling for Action Recognition

[supplementary material]

Shonan Rotation Averaging: Global Optimality by Surfing SO(p)^n

[supplementary material]

Semantic Curiosity for Active Visual Learning

[supplementary material]

Multi-Temporal Recurrent Neural Networks For Progressive Non-Uniform Single Image Deblurring With Incremental Temporal Training

[supplementary material]

ProgressFace: Scale-Aware Progressive Learning for Face Detection

[supplementary material]

Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference

[supplementary material]

CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos

Modeling the Effects of Windshield Refraction for Camera Calibration

[supplementary material]

Unsupervised Domain Adaptation for Semantic Segmentation of NIR Images through Generative Latent Search

[supplementary material]

PROFIT: A Novel Training Method for sub-4-bit MobileNet Models

[supplementary material]

Visual Relation Grounding in Videos

[supplementary material]

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows

[supplementary material]

Controlling Style and Semantics in Weakly-Supervised Image Generation

[supplementary material]

Jointly learning visual motion and confidence from local patches in event cameras

[supplementary material]

SODA: Story Oriented Dense Video Captioning Evaluation Framework

[supplementary material]

Sketch-Guided Object Localization in Natural Images

[supplementary material]

A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses

[supplementary material]

Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

[supplementary material]

The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement

[supplementary material]

STAR: Sparse Trained Articulated Human Body Regressor

[supplementary material]

Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer

[supplementary material]

Collaboration by Competition: Self-coordinated Knowledge Amalgamation for Multi-talent Student Learning

[supplementary material]

Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians

[supplementary material]

Learning 3D Part Assembly from a Single Image

[supplementary material]

PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions

[supplementary material]

Highly Efficient Salient Object Detection with 100K Parameters

[supplementary material]

HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing

Lifespan Age Transformation Synthesis

[supplementary material]

Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation

[supplementary material]

Simulating Content Consistent Vehicle Datasets with Attribute Descent

Multiview Detection with Feature Perspective Transformation

[supplementary material]

Learning Object Relation Graph and Tentative Policy for Visual Navigation

[supplementary material]

Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition

Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning

[supplementary material]

Inducing Optimal Attribute Representations for Conditional GANs

[supplementary material]

AR-Net: Adaptive Frame Resolution for Efficient Action Recognition

[supplementary material]

Image-to-Voxel Model Translation for 3D Scene Reconstruction and Segmentation

[supplementary material]

Consistency Guided Scene Flow Estimation

[supplementary material]

Autoregressive Unsupervised Image Segmentation

[supplementary material]

Controllable Image Synthesis via SegVAE

[supplementary material]

Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search

[supplementary material]

Efficient Non-Line-of-Sight Imaging from Transient Sinograms

[supplementary material]

Texture Hallucination for Large-Factor Painting Super-Resolution

[supplementary material]

Learning Progressive Joint Propagation for Human Motion Prediction

[supplementary material]

Image Stitching and Rectification for Hand-Held Cameras

[supplementary material]

ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds

[supplementary material]

The Group Loss for Deep Metric Learning

[supplementary material]

Learning Object Depth from Camera Motion and Video Object Segmentation

[supplementary material]

OnlineAugment: Online Data Augmentation with Less Domain Knowledge

[supplementary material]

Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction

[supplementary material]

Intra-class Feature Variation Distillation for Semantic Segmentation

Temporal Distinct Representation Learning for Action Recognition

Representative Graph Neural Network

[supplementary material]

Deformation-Aware 3D Model Embedding and Retrieval

[supplementary material]

Atlas: End-to-End 3D Scene Reconstruction from Posed Images

[supplementary material]

Multiple Class Novelty Detection Under Data Distribution Shift

[supplementary material]

Colorization of Depth Map via Disentanglement

[supplementary material]

Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes

[supplementary material]

GeoGraph: Graph-based multi-view object detection with geometric cues end-to-end

Localizing the Common Action Among a Few Videos

[supplementary material]

TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification

[supplementary material]

Traffic Accident Benchmark for Causality Recognition

Face Anti-Spoofing with Human Material Perception

[supplementary material]

How Can I See My Future? FvTraj: Using First-person View for Pedestrian Trajectory Prediction

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

NASA: Neural Articulated Shape Approximation

[supplementary material]

Towards Unique and Informative Captioning of Images

[supplementary material]

When Does Self-supervision Improve Few-shot Learning?

[supplementary material]

Two-branch Recurrent Network for Isolating Deepfakes in Videos

Incremental Few-Shot Meta-Learning via Indirect Discriminant Alignment

[supplementary material]

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

[supplementary material]

Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation

Global Distance-distributions Separation for Unsupervised Person Re-identification

[supplementary material]

I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image

[supplementary material]

Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose

[supplementary material]

ALRe: Outlier Detection for Guided Refinement

Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations

Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition

[supplementary material]

Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection

[supplementary material]

Curriculum DeepSDF

Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance

[supplementary material]

Improved Adversarial Training via Learned Optimizer

[supplementary material]

Component Divide-and-Conquer for Real-World Image Super-Resolution

[supplementary material]

Enabling Deep Residual Networks for Weakly Supervised Object Detection

[supplementary material]

Deep near-light photometric stereo for spatially varying reflectances

[supplementary material]

Learning Visual Representations with Caption Annotations

[supplementary material]

Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier

[supplementary material]

Regression of Instance Boundary by Aggregated CNN and GCN

[supplementary material]

Social Adaptive Module for Weakly-supervised Group Activity Recognition

RGB-D Salient Object Detection with Cross-Modality Modulation and Selection

[supplementary material]

RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval

[supplementary material]

Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection

[supplementary material]

Faster Person Re-Identification

Quantization Guided JPEG Artifact Correction

[supplementary material]

3PointTM: Faster Measurement of High-Dimensional Transmission Matrices

Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer

[supplementary material]

Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction

[supplementary material]

World-Consistent Video-to-Video Synthesis

[supplementary material]

Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation

GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild

[supplementary material]

Event-based Asynchronous Sparse Convolutional Networks

[supplementary material]

AtlantaNet: Inferring the 3D Indoor Layout from a Single 360° Image beyond the Manhattan World Assumption

[supplementary material]

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

[supplementary material]

REMIND Your Neural Network to Prevent Catastrophic Forgetting

[supplementary material]

Image Classification in the Dark using Quanta Image Sensors

[supplementary material]

n-Reference Transfer Learning for Saliency Prediction

[supplementary material]

Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection

[supplementary material]

Bottom-Up Temporal Action Localization with Mutual Regularization

[supplementary material]

On Modulating the Gradient for Meta-Learning

[supplementary material]

Domain-Specific Mappings for Generative Adversarial Style Transfer

[supplementary material]

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning

DHP: Differentiable Meta Pruning via HyperNetworks

[supplementary material]

Deep Transferring Quantization

[supplementary material]

Deep Credible Metric Learning for Unsupervised Domain Adaptation Person Re-identification

Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification?

Arbitrary-Oriented Object Detection with Circular Smooth Label

[supplementary material]

Learning Event-Driven Video Deblurring and Interpolation

[supplementary material]

Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference

[supplementary material]

Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation

[supplementary material]

CSCL: Critical Semantic-Consistent Learning for Unsupervised Domain Adaptation

[supplementary material]

Prototype Mixture Models for Few-shot Semantic Segmentation

[supplementary material]

Webly Supervised Image Classification with Self-Contained Confidence

[supplementary material]

Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization

[supplementary material]

Monocular 3D Object Detection via Feature Domain Adaptation

[supplementary material]

AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation

[supplementary material]

VPN: Learning Video-Pose Embedding for Activities of Daily Living

[supplementary material]

Soft Anchor-Point Object Detection

[supplementary material]

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid

[supplementary material]

Soft Expert Reward Learning for Vision-and-Language Navigation

Part-aware Prototype Network for Few-shot Semantic Segmentation

[supplementary material]

Learning from Extrinsic and Intrinsic Supervisions for Domain Generalization

[supplementary material]

Joint Learning of Social Groups, Individuals Action and Sub-group Activities in Videos

[supplementary material]

Whole-Body Human Pose Estimation in the Wild

[supplementary material]

Relative Pose Estimation of Calibrated Cameras with Known SE(3) Invariants

[supplementary material]

Sequential Convolution and Runge-Kutta Residual Architecture for Image Compressed Sensing

[supplementary material]

Deep Hough Transform for Semantic Line Detection

[supplementary material]

Structured Landmark Detection via Topology-Adapting Deep Graph Learning

[supplementary material]

3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning

[supplementary material]

Learning to Balance Specificity and Invariance for In and Out of Domain Generalization

[supplementary material]

Contrastive Learning for Unpaired Image-to-Image Translation

[supplementary material]

DLow: Diversifying Latent Flows for Diverse Human Motion Prediction

[supplementary material]

GRNet: Gridding Residual Network for Dense Point Cloud Completion

[supplementary material]

Gait Lateral Network: Learning Discriminative and Compact Representations for Gait Recognition

Blind Face Restoration via Deep Multi-scale Component Dictionaries

[supplementary material]

Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods

[supplementary material]

Inequality-Constrained and Robust 3D Face Model Fitting

[supplementary material]

Gabor Layers Enhance Network Robustness

[supplementary material]

Conditional Image Repainting via Semantic Bridge and Piecewise Value Function

[supplementary material]

Learnable Cost Volume Using the Cayley Representation

[supplementary material]

HALO: Hardware-Aware Learning to Optimize

[supplementary material]

Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling

[supplementary material]

BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition

Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision

[supplementary material]

Domain Adaptive Semantic Segmentation Using Weak Labels

[supplementary material]

Knowledge Distillation Meets Self-Supervision

[supplementary material]

Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions

[supplementary material]

Reconstructing the Noise Variance Manifold for Image Denoising

[supplementary material]

Occlusion-Aware Depth Estimation with Adaptive Normal Constraints

[supplementary material]

VisualEchoes: Spatial Image Representation Learning through Echolocation

[supplementary material]

Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval

[supplementary material]

Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation

[supplementary material]

Spatially Aware Multimodal Transformers for TextVQA

[supplementary material]

Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector

[supplementary material]

URIE: Universal Image Enhancement for Visual Recognition in the Wild

[supplementary material]

Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation

[supplementary material]

SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning

Unpaired Image-to-Image Translation using Adversarial Consistency Loss

[supplementary material]

Discriminability Distillation in Group Representation Learning

[supplementary material]

Monocular Expressive Body Regression through Body-Driven Attention

[supplementary material]

Dual Adversarial Network: Toward Real-world Noise Removal and Noise Generation

[supplementary material]

Linguistic Structure Guided Context Modeling for Referring Image Segmentation

[supplementary material]

Federated Visual Classification with Real-World Data Distribution

[supplementary material]

Robust Re-Identification by Multiple Views Knowledge Distillation

[supplementary material]

Defocus Deblurring Using Dual-Pixel Data

[supplementary material]

RhyRNN: Rhythmic RNN for Recognizing Events in Long and Complex Videos

Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping

[supplementary material]

Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

[supplementary material]

Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

[supplementary material]

Learning to Learn with Variational Information Bottleneck for Domain Generalization

[supplementary material]

Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis

[supplementary material]

Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks

[supplementary material]

Layered Neighborhood Expansion for Incremental Multiple Graph Matching

SCAN: Learning to Classify Images without Labels

[supplementary material]

Graph convolutional networks for learning with few clean and many noisy labels

[supplementary material]

Object-and-Action Aware Model for Visual Language Navigation

A Comprehensive Study of Weight Sharing in Graph Networks for 3D Human Pose Estimation

[supplementary material]

MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution

[supplementary material]

Efficient Semantic Video Segmentation with Per-frame Inference

[supplementary material]

Increasing the Robustness of Semantic Segmentation Models with Painting-by-Numbers

[supplementary material]

Deep Spiking Neural Network: Energy Efficiency Through Time based Coding

InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling

[supplementary material]

Utilizing Patch-level Category Activation Patterns for Multiple Class Novelty Detection

[supplementary material]

People as Scene Probes

[supplementary material]

Mapping in a Cycle: Sinkhorn Regularized Unsupervised Learning for Point Cloud Shapes

[supplementary material]

Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions

[supplementary material]

TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video

[supplementary material]

Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost

[supplementary material]

Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation

Modeling 3D Shapes by Reinforcement Learning

[supplementary material]

LST-Net: Learning a Convolutional Neural Network with a Learnable Sparse Transform

[supplementary material]

Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision

[supplementary material]

CN: Channel Normalization For Point Cloud Recognition

Rethinking the Defocus Blur Detection Problem and A Real-Time Deep DBD Model

AutoMix: Mixup Networks for Sample Interpolation via Cooperative Barycenter Learning

Scene Text Image Super-resolution in the wild

[supplementary material]

Coupling Explicit and Implicit Surface Representations for Generative 3D Modeling

[supplementary material]

Learning Disentangled Representations with Latent Variation Predictability

[supplementary material]

Deep Space-Time Video Upsampling Networks

[supplementary material]

Large-Scale Few-Shot Learning via Multi-Modal Knowledge Discovery

[supplementary material]

Fast Video Object Segmentation using the Global Context Module

[supplementary material]

Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos

[supplementary material]

Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification

[supplementary material]

MessyTable: Instance Association in Multiple Camera Views

[supplementary material]

A Unified Framework for Shot Type Classification Based on Subject Centric Lens

[supplementary material]

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues

[supplementary material]

HTML: A Parametric Hand Texture Model for 3D Hand Reconstruction and Personalization

[supplementary material]

CycAs: Self-supervised Cycle Association for Learning Re-identifiable Descriptions

Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

Towards Real-Time Multi-Object Tracking

A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation

Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering Loss

[supplementary material]

STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos

[supplementary material]

Hierarchical Style-based Networks for Motion Synthesis

[supplementary material]

Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop

[supplementary material]

Learning to Count in the Crowd from Limited Labeled Data

[supplementary material]

SPOT: Selective Point Cloud Voting for Better Proposal in Point Cloud Object Detection

Explainable Face Recognition

[supplementary material]

From Shadow Segmentation to Shadow Removal

[supplementary material]

Diverse and Admissible Trajectory Prediction through Multimodal Context Understanding

[supplementary material]

CONFIG: Controllable Neural Face Image Generation

[supplementary material]

Single View Metrology in the Wild

[supplementary material]

Procedure Planning in Instructional Videos

[supplementary material]

Funnel Activation for Visual Recognition

GIQA: Generated Image Quality Assessment

Adversarial Continual Learning

[supplementary material]

Adapting Object Detectors with Conditional Domain Normalization

[supplementary material]

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction

Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction

[supplementary material]

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

[supplementary material]

Self-supervised Bayesian Deep Learning for Image Recovery with Applications to Compressive Sensing

[supplementary material]

Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement

Semi-supervised Learning with a Teacher-student Network for Generalized Attribute Prediction

[supplementary material]

Unsupervised Domain Adaptation with Noise Resistible Mutual-Training for Person Re-identification

DPDist: Comparing Point Clouds Using Deep Point Cloud Distance

[supplementary material]

Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation

DataMix: Efficient Privacy-Preserving Edge-Cloud Inference

[supplementary material]

Neural Re-Rendering of Humans from a Single Image

[supplementary material]

Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation

[supplementary material]

PIPAL: a Large-Scale Image Quality Assessment Dataset for Perceptual Image Restoration

Why do These Match? Explaining the Behavior of Image Similarity Models

[supplementary material]

CooGAN: A Memory-Efficient Framework for High-Resolution Facial Attribute Editing

[supplementary material]

Progressive Transformers for End-to-End Sign Language Production

[supplementary material]

Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting

[supplementary material]

Making Affine Correspondences Work in Camera Geometry Computation

[supplementary material]

Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web Faces

[supplementary material]

Foley Music: Learning to Generate Music from Videos

[supplementary material]

Contrastive Multiview Coding

[supplementary material]

Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses

[supplementary material]

Generative Low-bitwidth Data Free Quantization

[supplementary material]

Local Correlation Consistency for Knowledge Distillation

[supplementary material]

Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild

[supplementary material]

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation

[supplementary material]

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations

[supplementary material]

Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues

Weakly-Supervised Cell Tracking via Backward-and-Forward Propagation

[supplementary material]

SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation

[supplementary material]

Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization

[supplementary material]

AMLN: Adversarial-based Mutual Learning Network for Online Knowledge Distillation

Online Multi-modal Person Search in Videos

Single Image Super-Resolution via a Holistic Attention Network

[supplementary material]

Can You Read Me Now? Content Aware Rectification using Angle Supervision

[supplementary material]

Momentum Batch Normalization for Deep Learning with Small Batch Size

[supplementary material]

AdvPC: Transferable Adversarial Perturbations on 3D Point Clouds

[supplementary material]

Edge-aware Graph Representation Learning and Reasoning for Face Parsing

BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network

[supplementary material]

G-LBM: Generative Low-dimensional Background Model Estimation from Video Sequences

[supplementary material]

H3DNet: 3D Object Detection Using Hybrid Geometric Primitives

[supplementary material]

Expressive Telepresence via Modular Codec Avatars

[supplementary material]

Cascade Graph Neural Networks for RGB-D Salient Object Detection

FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret

[supplementary material]

Generating Videos of Zero-Shot Compositions of Actions and Objects

[supplementary material]

ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language

[supplementary material]

Renovating Parsing R-CNN for Accurate Multiple Human Parsing

Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning

Gradient-Induced Co-Saliency Detection

[supplementary material]

Nighttime Defogging Using High-Low Frequency Decomposition and Grayscale-Color Networks

SegFix: Model-Agnostic Boundary Refinement for Segmentation

[supplementary material]

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

[supplementary material]

Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars

[supplementary material]

Neural Geometric Parser for Single Image Camera Calibration

[supplementary material]

Learning Flow-based Feature Warping for Face Frontalization with Illumination Inconsistent Supervision

[supplementary material]

Learning Architectures for Binary Networks

[supplementary material]

Semantic View Synthesis

[supplementary material]

An Analysis of Sketched IRLS for Accelerated Sparse Residual Regression

Relative Pose from Deep Learned Depth and a Single Affine Correspondence

[supplementary material]

Video Super-Resolution with Recurrent Structure-Detail Network

Shape Adaptor: A Learnable Resizing Module

[supplementary material]

Shuffle and Attend: Video Domain Adaptation

[supplementary material]

DRG: Dual Relation Graph for Human-Object Interaction Detection

[supplementary material]

Flow-edge Guided Video Completion

[supplementary material]

End-to-End Trainable Deep Active Contour Models for Automated Image Segmentation: Delineating Buildings in Aerial Imagery

[supplementary material]

Towards End-to-end Video-based Eye-Tracking

[supplementary material]

Generating Handwriting via Decoupled Style Descriptors

[supplementary material]

LEED: Label-Free Expression Editing via Disentanglement

[supplementary material]

Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards

[supplementary material]

Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder

[supplementary material]

Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation

[supplementary material]

Class-Incremental Domain Adaptation

[supplementary material]

Anti-Bandit Neural Architecture Search for Model Defense

Wavelet-Based Dual-Branch Network for Image Demoiréing

[supplementary material]

Low Light Video Enhancement using Synthetic Data Produced with an Intermediate Domain Mapping

[supplementary material]

Non-Local Spatial Propagation Network for Depth Completion

[supplementary material]

DanbooRegion: An Illustration Region Dataset

[supplementary material]

Event Enhanced High-Quality Image Recovery

[supplementary material]

PackDet: Packed Long-Head Object Detector

[supplementary material]

A Generic Graph-based Neural Architecture Encoding Scheme for Predictor-based NAS

[supplementary material]

Learning Semantic Neural Tree for Human Parsing

[supplementary material]

Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation

[supplementary material]

Burst Denoising via Temporally Shifted Wavelet Transforms

[supplementary material]

JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction

[supplementary material]

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation

[supplementary material]

Rethinking Pseudo-LiDAR Representation

[supplementary material]

Deep Multi Depth Panoramas for View Synthesis

[supplementary material]

MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection

[supplementary material]

ContactPose: A Dataset of Grasps with Object Contact and Hand Pose

[supplementary material]

API-Net: Robust Generative Classifier via a Single Discriminator

[supplementary material]

Bias-based Universal Adversarial Patch Attack for Automatic Check-out

[supplementary material]

Imbalanced Continual Learning with Partitioning Reservoir Sampling

[supplementary material]

Guided Collaborative Training for Pixel-wise Semi-Supervised Learning

[supplementary material]

Stacking Networks Dynamically for Image Restoration Based on the Plug-and-Play Framework

[supplementary material]

Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight

Spatial Attention Pyramid Network for Unsupervised Domain Adaptation

[supplementary material]

GSIR: Generalizable 3D Shape Interpretation and Reconstruction

Weakly Supervised 3D Object Detection from Lidar Point Cloud

[supplementary material]

Two-phase Pseudo Label Densification for Self-training based Domain Adaptation

[supplementary material]

Adaptive Offline Quintuplet Loss for Image-Text Matching

[supplementary material]

Learning Object Placement by Inpainting for Compositional Data Augmentation

[supplementary material]

Deep Vectorization of Technical Drawings

[supplementary material]

CAD-Deform: Deformable Fitting of CAD Models to 3D Scans

[supplementary material]

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points

[supplementary material]

Multi-Agent Embodied Question Answering in Interactive Environments

Conditional Sequential Modulation for Efficient Global Image Retouching

[supplementary material]

Segmenting Transparent Objects in the Wild

[supplementary material]

Length-Controllable Image Captioning

[supplementary material]

Few-Shot Semantic Segmentation with Democratic Attention Networks

[supplementary material]

Defocus Blur Detection via Depth Distillation

[supplementary material]

Motion Guided 3D Pose Estimation from Videos

[supplementary material]

Reflection Separation via Multi-bounce Polarization State Tracing

[supplementary material]

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing

[supplementary material]

Learning with Noisy Class Labels for Instance Segmentation

Deep Image Clustering with Category-Style Representation

[supplementary material]

Self-supervised Motion Representation via Scattering Local Motion Cues

Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets

[supplementary material]

BMBC: Bilateral Motion Estimation with Bilateral Cost Volume for Video Interpolation

[supplementary material]

Hard negative examples are hard, but useful

[supplementary material]

ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions

Video Object Detection via Object-level Temporal Aggregation

[supplementary material]

Object Detection with a Unified Label Space from Multiple Datasets

[supplementary material]

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

[supplementary material]

Comprehensive Image Captioning via Scene Graph Decomposition

Symbiotic Adversarial Learning for Attribute-based Person Search

[supplementary material]

Amplifying Key Cues for Human-Object-Interaction Detection

[supplementary material]

Rethinking Few-shot Image Classification: A Good Embedding is All You Need?

[supplementary material]

Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization

[supplementary material]

Action Localization through Continual Predictive Learning

[supplementary material]

Generative View-Correlation Adaptation for Semi-Supervised Multi-View Learning

READ: Reciprocal Attention Discriminator for Image-to-Video Re-Identification

[supplementary material]

3D Human Shape Reconstruction from a Polarization Image

[supplementary material]

The Devil is in the Details: Self-Supervised Attention for Vehicle Re-Identification

Improving One-stage Visual Grounding by Recursive Sub-query Construction

[supplementary material]

Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed Video

[supplementary material]

Example-Guided Image Synthesis using Masked Spatial-Channel Attention and Self-Supervision

[supplementary material]

Content-Consistent Matching for Domain Adaptive Semantic Segmentation

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

[supplementary material]

History Repeats Itself: Human Motion Prediction via Motion Attention

[supplementary material]

Unsupervised Video Object Segmentation with Joint Hotspot Tracking

[supplementary material]

SRNet: Improving Generalization in 3D Human Pose Estimation with a Split-and-Recombine Approach

[supplementary material]

CAFE-GAN: Arbitrary Face Attribute Editing with Complementary Attention Feature

[supplementary material]

MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object Detection

Latent Topic-aware Multi-Label Classification

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

Attract, Perturb, and Explore: Learning a Feature Alignment Network for Semi-supervised Domain Adaptation

[supplementary material]

Curriculum Manager for Source Selection in Multi-Source Domain Adaptation

Powering One-shot Topological NAS with Stabilized Share-parameter Proxy

Classes Matter: A Fine-grained Adversarial Approach to Cross-domain Semantic Segmentation

[supplementary material]

Boundary-preserving Mask R-CNN

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

[supplementary material]

MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation

[supplementary material]

Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling

[supplementary material]

The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation

[supplementary material]

What is Learned in Deep Uncalibrated Photometric Stereo?

[supplementary material]

Prior-based Domain Adaptive Object Detection for Hazy and Rainy Conditions

[supplementary material]

Adversarial Ranking Attack and Defense

[supplementary material]

ReDro: Efficiently Learning Large-sized SPD Visual Representation

[supplementary material]

Graph-Based Social Relation Reasoning

[supplementary material]

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection

[supplementary material]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

[supplementary material]

Asynchronous Interaction Aggregation for Action Detection

[supplementary material]

Shape and Viewpoint without Keypoints

[supplementary material]

Learning Attentive and Hierarchical Representations for 3D Shape Recognition

TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search

[supplementary material]

Associative3D: Volumetric Reconstruction from Sparse Views

PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit

[supplementary material]

Memory Selection Network for Video Propagation

[supplementary material]

Disentangled Non-local Neural Networks

[supplementary material]

URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark

[supplementary material]

Generalizing Person Re-Identification by Camera-Aware Invariance Learning and Cross-Domain Mixup

[supplementary material]

Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks

[supplementary material]

Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training

[supplementary material]

Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip

[supplementary material]

Knowledge Transfer via Dense Cross-Layer Mutual-Distillation

[supplementary material]

Matching Guided Distillation

[supplementary material]

Clustering Driven Deep Autoencoder for Video Anomaly Detection

Learning to Compose Hypercolumns for Visual Correspondence

[supplementary material]

Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction

[supplementary material]

Object-based Illumination Estimation with Rendering-aware Neural Networks

[supplementary material]

Progressive Point Cloud Deconvolution Generation Network

[supplementary material]

SSCGAN: Facial Attribute Editing via Style Skip Connections

Negative Pseudo Labeling using Class Proportion for Semantic Segmentation in Pathology

Learn to Propagate Reliably on Noisy Affinity Graphs

[supplementary material]

Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search

[supplementary material]

TANet: Towards Fully Automatic Tooth Arrangement

[supplementary material]

UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction Detection

[supplementary material]

GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision

[supplementary material]

Resolution Switchable Networks for Runtime Efficient Image Recognition

[supplementary material]

SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation

[supplementary material]

Learning to Detect Open Classes for Universal Domain Adaptation

[supplementary material]

Visual Compositional Learning for Human-Object Interaction Detection

[supplementary material]

Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches

[supplementary material]

Rethinking Class Activation Mapping for Weakly Supervised Object Localization

[supplementary material]

OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features

[supplementary material]

Interpretable Neural Network Decoupling

[supplementary material]

Omni-sourced Webly-supervised Learning for Video Recognition

[supplementary material]

CurveLane-NAS: Unifying Lane-Sensitive Architecture Search and Adaptive Point Blending

[supplementary material]

Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation

[supplementary material]

Estimating People Flows to Better Count Them in Crowded Scenes

[supplementary material]

Generate to Adapt: Resolution Adaption Network for Surveillance Face Recognition

[supplementary material]

Learning Feature Embeddings for Discriminant Model based Tracking

[supplementary material]

WeightNet: Revisiting the Design Space of Weight Networks

Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift

[supplementary material]

Learning Where to Focus for Efficient Video Object Detection

[supplementary material]

Learning Object Permanence from Video

[supplementary material]

Adaptive Text Recognition through Visual Matching

[supplementary material]

Actions as Moving Points

[supplementary material]

Learning to Exploit Multiple Vision Modalities by Using Grafted Networks

[supplementary material]

Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild

[supplementary material]

3D Fluid Flow Reconstruction Using Compact Light Field PIV

[supplementary material]

Contextual Diversity for Active Learning

[supplementary material]

Temporal Aggregate Representations for Long-Range Video Understanding

[supplementary material]

Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition

[supplementary material]

General 3D Room Layout from a Single View by Render-and-Compare

[supplementary material]

Neural Dense Non-Rigid Structure from Motion with Latent Space Constraints

[supplementary material]

Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability

[supplementary material]

Yet Another Intermediate-Level Attack

Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction

[supplementary material]

Early Exit Or Not: Resource-Efficient Blind Quality Enhancement for Compressed Images

[supplementary material]

PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations

[supplementary material]

How does Lipschitz Regularization Influence GAN Training?

[supplementary material]

Infrastructure-based Multi-Camera Calibration using Radial Projections

[supplementary material]

MotionSqueeze: Neural Motion Feature Learning for Video Understanding

[supplementary material]

Polarized Optical-Flow Gyroscope

[supplementary material]

Online Meta-Learning for Multi-Source and Semi-Supervised Domain Adaptation

[supplementary material]

An Ensemble of Epoch-wise Empirical Bayes for Few-shot Learning

[supplementary material]

On the Effectiveness of Image Rotation for Open Set Domain Adaptation

[supplementary material]

Combining Task Predictors via Enhancing Joint Predictability

[supplementary material]

Multi-Scale Positive Sample Refinement for Few-Shot Object Detection

[supplementary material]

Single-Image Depth Prediction Makes Feature Matching Easier

[supplementary material]

Deep Reinforced Attention Learning for Quality-Aware Visual Recognition

[supplementary material]

CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization

[supplementary material]

Learning Joint Spatial-Temporal Transformations for Video Inpainting

[supplementary material]

Single Path One-Shot Neural Architecture Search with Uniform Sampling

[supplementary material]

Learning to Generate Novel Domains for Domain Generalization

[supplementary material]

Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections

[supplementary material]

Impact of base dataset design on few-shot image classification

[supplementary material]

Invertible Zero-Shot Recognition Flows

[supplementary material]

GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes

[supplementary material]

Location Sensitive Image Retrieval and Tagging

[supplementary material]

Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image

[supplementary material]

Guessing State Tracking for Visual Dialogue

Memory-Efficient Incremental Learning Through Feature Adaptation

[supplementary material]

Neural Voice Puppetry: Audio-driven Facial Reenactment

[supplementary material]

One-Shot Unsupervised Cross-Domain Detection

[supplementary material]

Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks

[supplementary material]

Probabilistic Future Prediction for Video Scene Understanding

[supplementary material]

Suppressing Mislabeled Data via Grouping and Self-Attention

Class-wise Dynamic Graph Convolution for Semantic Segmentation

[supplementary material]

Character-Preserving Coherent Story Visualization

[supplementary material]

GINet: Graph Interaction Network for Scene Parsing

[supplementary material]

Tensor Low-Rank Reconstruction for Semantic Segmentation

[supplementary material]

Attentive Normalization

[supplementary material]

Count- and Similarity-aware R-CNN for Pedestrian Detection

[supplementary material]

TRADI: Tracking Deep Neural network Weight Distributions

[supplementary material]

Spatiotemporal Attacks for Embodied Agents

[supplementary material]

Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model without Manual Annotation

Unselfie: Translating Selfies to Neutral-pose Portraits in the Wild

[supplementary material]

Design and Interpretation of Universal Adversarial Patches in Face Detection

[supplementary material]

Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild

[supplementary material]

Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints

[supplementary material]

Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification

Contextual Heterogeneous Graph Network for Human-Object Interaction Detection

[supplementary material]

Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning

[supplementary material]

A Closest Point Proposal for MCMC-based Probabilistic Surface Registration

Interactive Video Object Segmentation Using Global and Local Transfer Modules

[supplementary material]

End-to-end Interpretable Learning of Non-blind Image Deblurring

[supplementary material]

Employing Multi-Estimations for Weakly-Supervised Semantic Segmentation

Learning Noise-Aware Encoder-Decoder from Noisy Labels by Alternating Back-Propagation for Saliency Detection

[supplementary material]

Rethinking Image Deraining via Rain Streaks and Vapors

[supplementary material]

Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes

Is Sharing of Egocentric Video Giving Away Your Biometric Signature?

[supplementary material]

Captioning Images Taken by People Who Are Blind

[supplementary material]

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

[supplementary material]

Conditional Entropy Coding for Efficient Video Compression

[supplementary material]

Differentiable Feature Aggregation Search for Knowledge Distillation

Attention Guided Anomaly Localization in Images

[supplementary material]

Self-supervised Video Representation Learning by Pace Prediction

[supplementary material]

Full-Body Awareness from Partial Observations

[supplementary material]

Reinforced Axial Refinement Network for Monocular 3D Object Detection

Self-Supervised Multi-Task Procedure Learning from Instructional Videos

[supplementary material]

CosyPose: Consistent multi-view multi-object 6D pose estimation

[supplementary material]

In-Domain GAN Inversion for Real Image Editing

[supplementary material]

Key Frame Proposal Network for Efficient Pose Estimation in Videos

[supplementary material]

Exchangeable Deep Neural Networks for Set-to-Set Matching and Learning

[supplementary material]

Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs

[supplementary material]

Cross-Modal Weighting Network for RGB-D Salient Object Detection

Open-set Adversarial Defense

[supplementary material]

Deep Image Compression using Decoder Side Information

[supplementary material]

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation

[supplementary material]

A Generic Visualization Approach for Convolutional Neural Networks

[supplementary material]

Interactive Annotation of 3D Object Geometry using 2D Scribbles

[supplementary material]

Hierarchical Kinematic Human Mesh Recovery

[supplementary material]

Multi-Loss Rebalancing Algorithm for Monocular Depth Estimation

[supplementary material]

3D Bird Reconstruction: a Dataset, Model, and Shape Recovery from a Single View

[supplementary material]

We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos

[supplementary material]

Joint Optimization for Multi-Person Shape Models from Markerless 3D-Scans

[supplementary material]

Accurate RGB-D Salient Object Detection via Collaborative Learning

Finding Your (3D) Center: 3D Object Detection Using a Learned Loss

[supplementary material]

Collaborative Training between Region Proposal Localization and Classification for Domain Adaptive Object Detection

Two Stream Active Query Suggestion for Active Learning in Connectomics

[supplementary material]

Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images

[supplementary material]

6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference

[supplementary material]

Modeling Artistic Workflows for Image Generation and Editing

[supplementary material]

A Large-scale Annotated Mechanical Components Benchmark for Classification and Retrieval Tasks with Deep Neural Networks

[supplementary material]

Hidden Footprints: Learning Contextual Walkability from 3D Human Trails

[supplementary material]

Self-Supervised Learning of Audio-Visual Objects from Video

[supplementary material]

GAN-based Garment Generation Using Sewing Pattern Images

[supplementary material]

Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach

[supplementary material]

An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds

[supplementary material]

Monotonicity Prior for Cloud Tomography

[supplementary material]

Learning Trailer Moments in Full-Length Movies with Co-Contrastive Attention

[supplementary material]

Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

[supplementary material]

Learning to Generate Grounded Visual Captions without Localization Supervision

[supplementary material]

Neural Hair Rendering

[supplementary material]

JNR: Joint-based Neural Rig Representation for Compact 3D Face Modeling

[supplementary material]

On Disentangling Spoof Trace for Generic Face Anti-Spoofing

[supplementary material]

Streaming Object Detection for 3-D Point Clouds

[supplementary material]

NAS-DIP: Learning Deep Image Prior with Neural Architecture Search

[supplementary material]

Learning to Learn in a Semi-Supervised Fashion

[supplementary material]

FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning

[supplementary material]

RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects

[supplementary material]

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation

[supplementary material]

Learning to Separate: Detecting Heavily-Occluded Objects in Urban Scenes

Towards causal benchmarking of bias in face analysis algorithms

[supplementary material]

Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation

[supplementary material]

Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions

[supplementary material]

Transformation Consistency Regularization – A Semi-Supervised Paradigm for Image-to-Image Translation

[supplementary material]

LIRA: Lifelong Image Restoration from Unknown Blended Distortions

[supplementary material]

HDNet: Human Depth Estimation for Multi-Person Camera-Space Localization

SOLO: Segmenting Objects by Locations

[supplementary material]

Learning to See in the Dark with Events

[supplementary material]

Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data

[supplementary material]

Context-Gated Convolution

[supplementary material]

Polynomial Regression Network for Variable-Number Lane Detection

[supplementary material]

Structural Deep Metric Learning for Room Layout Estimation

Adaptive Task Sampling for Meta-Learning

[supplementary material]

Deep Complementary Joint Model for Complex Scene Registration and Few-shot Segmentation on Medical Images

Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems

[supplementary material]

High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling

[supplementary material]

Online Ensemble Model Compression using Knowledge Distillation

Deep Learning-based Pupil Center Detection for Fast and Accurate Eye Tracking System

[supplementary material]

Efficient Residue Number System Based Winograd Convolution

[supplementary material]

Robust Tracking against Adversarial Attacks

[supplementary material]

Single-Shot Neural Relighting and SVBRDF Estimation

[supplementary material]

Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement

[supplementary material]

Angle-based Search Space Shrinking for Neural Architecture Search

[supplementary material]

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

[supplementary material]

Towards Fast, Accurate and Stable 3D Dense Face Alignment

[supplementary material]

Iterative Feature Transformation for Fast and Versatile Universal Style Transfer

[supplementary material]

CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search

[supplementary material]

Toward Faster and Simpler Matrix Normalization via Rank-1 Update

[supplementary material]

Accurate Polarimetric BRDF for Real Polarization Scene Rendering

[supplementary material]

Lensless Imaging with Focusing Sparse URA Masks in Long-Wave Infrared and its Application for Human Detection

[supplementary material]

Topology-Preserving Class-Incremental Learning

Inter-Image Communication for Weakly Supervised Localization

UFO²: A Unified Framework towards Omni-supervised Object Detection

[supplementary material]

iCaps: An Interpretable Classifier via Disentangled Capsule Networks

[supplementary material]

Detecting Natural Disasters, Damage, and Incidents in the Wild

[supplementary material]

[supplementary material]

Acquiring Dynamic Light Fields through Coded Aperture Camera

[supplementary material]

Gait Recognition from a Single Image using a Phase-Aware Gait Cycle Reconstruction Network

[supplementary material]

Informative Sample Mining Network for Multi-Domain Image-to-Image Translation

[supplementary material]

Spherical Feature Transform for Deep Metric Learning

[supplementary material]

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering

[supplementary material]

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes

[supplementary material]

Representation Sharing for Fast Object Detector Search and Beyond

[supplementary material]

Peeking into occluded joints: A novel framework for crowd pose estimation

[supplementary material]

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition

[supplementary material]

Deep Hashing with Active Pairwise Supervision

[supplementary material]

Graph Edit Distance Reward: Learning to Edit Scene Graph

Malleable 2.5D Convolution: Learning Receptive Fields along the Depth-axis for RGB-D Scene Parsing

[supplementary material]

Feature-metric Loss for Self-supervised Learning of Depth and Egomotion

Propagating Over Phrase Relations for One-Stage Visual Grounding

Adversarial Semantic Data Augmentation for Human Pose Estimation

Free View Synthesis

[supplementary material]

Face Anti-Spoofing via Disentangled Representation Learning

[supplementary material]

Prime-Aware Adaptive Distillation

Meta-Learning with Network Pruning

[supplementary material]

Spiral Generative Network for Image Extrapolation

[supplementary material]

SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches

[supplementary material]

Few-shot Compositional Font Generation with Dual Memory

[supplementary material]

PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling

[supplementary material]

Handcrafted Outlier Detection Revisited

[supplementary material]

The Average Mixing Kernel Signature

[supplementary material]

BCNet: Learning Body and Cloth Shape from A Single Image

[supplementary material]

Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos

[supplementary material]

Interactive Multi-Dimension Modulation with Dynamic Controllable Residual Learning for Image Restoration

[supplementary material]

Polysemy Deciphering Network for Human-Object Interaction Detection

[supplementary material]

PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning

[supplementary material]

Learning Graph-Convolutional Representations for Point Cloud Denoising

Semantic Line Detection Using Mirror Attention and Comparative Ranking and Matching

[supplementary material]

A Differentiable Recurrent Surface for Asynchronous Event-Based Data

[supplementary material]

Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches

[supplementary material]

LiteFlowNet3: Resolving Correspondence Ambiguity for More Accurate Optical Flow Estimation

[supplementary material]

Microscopy Image Restoration with Deep Wiener-Kolmogorov Filters

[supplementary material]

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language

[supplementary material]

JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds

[supplementary material]

Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior

[supplementary material]

An Inference Algorithm for Multi-Label MRF-MAP Problems with Clique Size 100

[supplementary material]

Dual Refinement Underwater Object Detection Network

Multiple Sound Sources Localization from Coarse to Fine

[supplementary material]

Task-Aware Quantization Network for JPEG Image Compression

[supplementary material]

Energy-Based Models for Deep Probabilistic Regression

[supplementary material]

CLOTH3D: Clothed 3D Humans

[supplementary material]

Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images

[supplementary material]

CLNet: A Compact Latent Network for Fast Adjusting Siamese Trackers

Occlusion-Aware Siamese Network for Human Pose Estimation

Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model

[supplementary material]

NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image

[supplementary material]

Model-based occlusion disentanglement for image-to-image translation

[supplementary material]

Rotation-robust Intersection over Union for 3D Object Detection

[supplementary material]

New Threats against Object Detector with Non-local Block

[supplementary material]

Self-Supervised CycleGAN for Object-Preserving Image-to-Image Domain Adaptation

[supplementary material]

On the Usage of the Trifocal Tensor in Motion Segmentation

[supplementary material]

3D-Rotation-Equivariant Quaternion Neural Networks

[supplementary material]

InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image

[supplementary material]

Active Crowd Counting with Limited Supervision

[supplementary material]

Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance

[supplementary material]

Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language

[supplementary material]

Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On

[supplementary material]

NODIS: Neural Ordinary Differential Scene Understanding

[supplementary material]

AssembleNet++: Assembling Modality Representations via Attention Connections

[supplementary material]

Learning Propagation Rules for Attribution Map Generation

[supplementary material]

Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference

[supplementary material]

Learning Predictive Models from Observation and Interaction

[supplementary material]

Unifying Deep Local and Global Features for Image Search

[supplementary material]

Human Body Model Fitting by Learned Gradient Descent

[supplementary material]

DDGCN: A Dynamic Directed Graph Convolutional Network for Action Recognition

[supplementary material]

Learning latent representations across multiple data domains using Lifelong VAEGAN

[supplementary material]

DVI: Depth Guided Video Inpainting for Autonomous Driving

[supplementary material]

Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation

[supplementary material]

APRICOT: A Dataset of Physical Adversarial Attacks on Object Detection

[supplementary material]

Visual Question Answering on Image Sets

[supplementary material]

Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots

[supplementary material]

Placepedia: Comprehensive Place Understanding with Multi-Faceted Annotations

[supplementary material]

DELTAS: Depth Estimation by Learning Triangulation And densification of Sparse points

[supplementary material]

Dynamic Low-light Imaging with Quanta Image Sensors

[supplementary material]

Disambiguating Monocular Depth Estimation with a Single Transient

[supplementary material]

DSDNet: Deep Structured self-Driving Network

[supplementary material]

QuEST: Quantized Embedding Space for Transferring Knowledge

[supplementary material]

EGDCL: An Adaptive Curriculum Learning Framework for Unbiased Glaucoma Diagnosis

Backpropagated Gradient Representations for Anomaly Detection

[supplementary material]

Dense RepPoints: Representing Visual Objects with Dense Point Sets

[supplementary material]

On Dropping Clusters to Regularize Graph Convolutional Neural Networks

[supplementary material]

Adaptive Video Highlight Detection by Learning from User History

Improving 3D Object Detection through Progressive Population Based Augmentation

[supplementary material]

DR-KFS: A Differentiable Visual Similarity Metric for 3D Shape Reconstruction

[supplementary material]

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

[supplementary material]

Adversarial Learning for Zero-shot Domain Adaptation

YOLO in the Dark - Domain Adaptation Method for Merging Multiple Models -

Identity-Aware Multi-Sentence Video Description

[supplementary material]

VQA-LOL: Visual Question Answering under the Lens of Logic

[supplementary material]

Piggyback GAN: Efficient Lifelong Learning for Image Conditioned Generation

TRRNet: Tiered Relation Reasoning for Compositional Visual Question Answering

Mining Inter-Video Proposal Relations for Video Object Detection

[supplementary material]

TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval

[supplementary material]

Minimum Class Confusion for Versatile Domain Adaptation

[supplementary material]

Large Batch Optimization for Object Detection: Training COCO in 12 Minutes

Towards Practical and Efficient High-Resolution HDR Deghosting with CNN

[supplementary material]

Monocular Differentiable Rendering for Self-Supervised 3D Object Detection

[supplementary material]

Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation

[supplementary material]

Dynamic and Static Context-aware LSTM for Multi-agent Motion Prediction

[supplementary material]

Image-based table recognition: data, model, and evaluation

[supplementary material]

Group Activity Prediction with Sequential Relational Anticipation Model

PiP: Planning-informed Trajectory Prediction for Autonomous Driving

[supplementary material]

PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer

[supplementary material]

Hierarchical Context Embedding for Region-based Object Detection

[supplementary material]

Attention-Driven Dynamic Graph Convolutional Network for Multi-Label Image Recognition

[supplementary material]

Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection

[supplementary material]

Sparse-to-Dense Depth Completion Revisited: Sampling Strategy and Graph Construction

[supplementary material]

MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation

[supplementary material]

Detecting Human-Object Interactions with Action Co-occurrence Priors

[supplementary material]

Learning Connectivity of Neural Networks from a Topological Perspective

JSTASR: Joint Size and Transparency-Aware Snow Removal Algorithm Based on Modified Partial Convolution and Veiling Effect Removal

[supplementary material]

Ocean: Object-aware Anchor-free Tracking

[supplementary material]

Object Tracking using Spatio-Temporal Networks for Future Prediction Location

Pillar-based Object Detection for Autonomous Driving

[supplementary material]

Sparse Adversarial Attack via Perturbation Factorization

[supplementary material]

3D Scene Reconstruction from a Single Viewport

[supplementary material]

Learning to Optimize Domain Specific Normalization for Domain Generalization

[supplementary material]

Self-supervised Outdoor Scene Relighting

[supplementary material]

Privacy Preserving Visual SLAM

[supplementary material]

Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning

[supplementary material]

Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval

Globally Optimal and Efficient Vanishing Point Estimation in Atlanta World

[supplementary material]

StyleGAN2 Distillation for Feed-forward Image Manipulation

[supplementary material]

Self-Prediction for Joint Instance and Semantic Segmentation of Point Clouds

Learning Disentangled Representations via Mutual Information Estimation

[supplementary material]

Challenge-Aware RGBT Tracking

Fully Trainable and Interpretable Non-Local Sparse Models for Image Restoration

[supplementary material]

AutoSimulate: (Quickly) Learning Synthetic Data Generation

[supplementary material]

LatticeNet: Towards Lightweight Image Super-resolution with Lattice Block

Learning from Scale-Invariant Examples for Domain Adaptation in Semantic Segmentation

[supplementary material]

Active Visual Information Gathering for Vision-Language Navigation

[supplementary material]

Deep Hough-Transform Line Priors

[supplementary material]

Unsupervised Shape and Pose Disentanglement for 3D Meshes

[supplementary material]

CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection

[supplementary material]

Inclusive GAN: Improving Data and Minority Coverage in Generative Models

[supplementary material]

SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects

[supplementary material]

Dive Deeper Into Box for Object Detection

[supplementary material]

PG-Net: Pixel to Global Matching Network for Visual Tracking

[supplementary material]

Why Are Deep Representations Good Perceptual Quality Features?

[supplementary material]

Geometric Estimation via Robust Subspace Recovery

Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification

[supplementary material]

Human Correspondence Consensus for 3D Object Semantic Understanding

[supplementary material]

Learning Memory Augmented Cascading Network for Compressed Sensing of Images

Least squares surface reconstruction on arbitrary domains

[supplementary material]

Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery

Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting

[supplementary material]

DADA: Differentiable Automatic Data Augmentation

SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans

[supplementary material]

Kinship Identification through Joint Learning using Kinship Verification Ensembles

[supplementary material]

Kernelized Memory Network for Video Object Segmentation

[supplementary material]

A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection

Splitting vs. Merging: Mining Object Regions with Discrepancy and Intersection Loss for Weakly Supervised Semantic Segmentation

Temporal Keypoint Matching and Refinement Network for Pose Estimation and Tracking

[supplementary material]

Neural Point-Based Graphics

[supplementary material]

FHDe²Net: Full High Definition Demoireing Network

[supplementary material]

Learning Structural Similarity of User Interface Layouts using Graph Networks

[supplementary material]

NAS-Count: Counting-by-Density with Neural Architecture Search

[supplementary material]

Towards Generalization Across Depth for Monocular 3D Object Detection

[supplementary material]

Margin-Mix: Semi-Supervised Learning for Face Expression Recognition

[supplementary material]

Principal Feature Visualisation in Convolutional Neural Networks

[supplementary material]

Progressive Refinement Network for Occluded Pedestrian Detection

[supplementary material]

Monocular Real-Time Volumetric Performance Capture

[supplementary material]

The Mapillary Traffic Sign Dataset for Detection and Classification on a Global Scale

[supplementary material]

Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction

[supplementary material]

Disentangling Multiple Features in Video Sequences using Gaussian Processes in Variational Autoencoders

[supplementary material]

SEN: A Novel Feature Normalization Dissimilarity Measure for Prototypical Few-Shot Learning Networks

[supplementary material]

Kinematic 3D Object Detection in Monocular Video

[supplementary material]

Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents

[supplementary material]

SACA Net: Cybersickness Assessment of Individual Viewers for VR Content via Graph-based Symptom Relation Embedding

[supplementary material]

End-to-End Low Cost Compressive Spectral Imaging with Spatial-Spectral Self-Attention

[supplementary material]

Know Your Surroundings: Exploiting Scene Information for Object Tracking

[supplementary material]

Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases

[supplementary material]

Anatomy-Aware Siamese Network: Exploiting Semantic Asymmetry for Accurate Pelvic Fracture Detection in X-ray Images

DeepLandscape: Adversarial Modeling of Landscape Videos

[supplementary material]

GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images

[supplementary material]

Spatial-Angular Interaction for Light Field Image Super-Resolution

[supplementary material]

BATS: Binary ArchitecTure Search

[supplementary material]

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

[supplementary material]

Look here! A parametric learning based approach to redirect visual attention

[supplementary material]

Variational Diffusion Autoencoders with Random Walk Sampling

[supplementary material]

Adaptive Variance Based Label Distribution Learning For Facial Age Estimation

Connecting the Dots: Detecting Adversarial Perturbations Using Context Inconsistency

[supplementary material]

Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations

[supplementary material]

VarSR: Variational Super-Resolution Network for Very Low Resolution Images

[supplementary material]

Co-Heterogeneous and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion Segmentation

[supplementary material]

Towards Recognizing Unseen Categories in Unseen Domains

[supplementary material]

Square Attack: a query-efficient black-box adversarial attack via random search

[supplementary material]

You Are Here: Geolocation by Embedding Maps and Images

[supplementary material]

Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation

[supplementary material]

From Image to Stability: Learning Dynamics from Human Pose

[supplementary material]

LevelSet R-CNN: A Deep Variational Method for Instance Segmentation

[supplementary material]

Efficient Scale-Permuted Backbone with Learned Resource Distribution

[supplementary material]

Reducing Distributional Uncertainty by Mutual Information Maximisation and Transferable Feature Learning

[supplementary material]

Bridging Knowledge Graphs to Generate Scene Graphs

[supplementary material]

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

[supplementary material]

Learning Visual Commonsense for Robust Scene Graph Generation

[supplementary material]

MPCC: Matching Priors and Conditionals for Clustering

[supplementary material]

PointAR: Efficient Lighting Estimation for Mobile Augmented Reality

Discrete Point Flow Networks for Efficient Point Cloud Generation

[supplementary material]

Accelerating Deep Learning with Millions of Classes

[supplementary material]

Password-conditioned Anonymization and Deanonymization with Face Identity Transformers

[supplementary material]

Inertial Safety from Structured Light

[supplementary material]

PointTriNet: Learned Triangulation of 3D Point Sets

[supplementary material]

Toward Unsupervised, Multi-Object Discovery in Large-Scale Image Collections

[supplementary material]

Deep Novel View Synthesis from Colored 3D Point Clouds

[supplementary material]

Consensus-Aware Visual-Semantic Embedding for Image-Text Matching

Spatial Hierarchy Aware Residual Pyramid Network for Time-of-Flight Depth Denoising

[supplementary material]

Sat2Graph: Road Graph Extraction through Graph-Tensor Encoding

Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition

Polarimetric Multi-View Inverse Rendering

[supplementary material]

SideInfNet: A Deep Neural Network for Semi-Automatic Semantic Segmentation with Side Information

[supplementary material]

Improving Face Recognition by Clustering Unlabeled Faces in the Wild

[supplementary material]

NeuRoRA: Neural Robust Rotation Averaging

[supplementary material]

SG-VAE: Scene Grammar Variational Autoencoder to generate new indoor scenes

[supplementary material]

Unsupervised Learning of Optical Flow with Deep Feature Similarity

Blended Grammar Network for Human Parsing

P²Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation

[supplementary material]

Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs

[supplementary material]

Adaptive Mixture Regression Network with Local Counting Map for Crowd Counting

[supplementary material]

BIRNAT: Bidirectional Recurrent Neural Networks with Adversarial Training for Video Snapshot Compressive Imaging

[supplementary material]

Ultra Fast Structure-aware Deep Lane Detection

[supplementary material]

Cross-Identity Motion Transfer for Arbitrary Objects through Pose-Attentive Video Reassembling

[supplementary material]

Domain Adaptive Object Detection via Asymmetric Tri-way Faster-RCNN

Exclusivity-Consistency Regularized Knowledge Distillation for Face Recognition

Learning Camera-Aware Noise Models

[supplementary material]

Towards Precise Completion of Deformable Shapes

[supplementary material]

Iterative Distance-Aware Similarity Matrix Convolution with Mutual-Supervised Point Elimination for Efficient Point Cloud Registration

[supplementary material]

Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization

[supplementary material]

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation

[supplementary material]

TPFN: Applying Outer Product along Time to Multimodal Sentiment Analysis Fusion on Incomplete Data

[supplementary material]

ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis

[supplementary material]

Learning with Privileged Information for Efficient Image Super-Resolution

[supplementary material]

Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification

Autoencoder-based Graph Construction for Semi-supervised Learning

[supplementary material]

Virtual Multi-view Fusion for 3D Semantic Segmentation

[supplementary material]

Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition

[supplementary material]

Deep Shape from Polarization

[supplementary material]

A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning

Mind the Discriminability: Asymmetric Adversarial Domain Adaptation

[supplementary material]

SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates

[supplementary material]

Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking

[supplementary material]

Deep FusionNet for Point Cloud Semantic Segmentation

[supplementary material]

Deep Material Recognition in Light-Fields via Disentanglement of Spatial and Angular Information

[supplementary material]

Dual Adversarial Network for Deep Active Learning

Fully Convolutional Networks for Continuous Sign Language Recognition

[supplementary material]

Self-adapting confidence estimation for stereo

[supplementary material]

Deep Surface Normal Estimation on the 2-Sphere with Confidence Guided Semantic Attention

[supplementary material]

AutoSTR: Efficient Backbone Search for Scene Text Recognition

Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification

[supplementary material]

Adversarial Training with Bi-directional Likelihood Regularization for Visual Classification

[supplementary material]

Faster AutoAugment: Learning Augmentation Strategies Using Backpropagation

Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation

[supplementary material]

Boundary-Aware Cascade Networks for Temporal Action Segmentation

[supplementary material]

Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation

[supplementary material]

Inference Graphs for CNN Interpretation

[supplementary material]

An End-to-End OCR Text Re-organization Sequence Learning for Rich-text Detail Image Comprehension

Improving Query Efficiency of Black-box Adversarial Attack

[supplementary material]

Self-similarity Student for Partial Label Histopathology Image Segmentation

[supplementary material]

BioMetricNet: deep unconstrained face verification through learning of metrics regularized onto Gaussian distributions

A Decoupled Learning Scheme for Real-world Burst Denoising from Raw Images

[supplementary material]

Global-and-Local Relative Position Embedding for Unsupervised Video Summarization

[supplementary material]

Real-World Blur Dataset for Learning and Benchmarking Deblurring Algorithms

[supplementary material]

SPARK: Spatial-aware Online Incremental Attack Against Visual Tracking

[supplementary material]

CenterNet Heatmap Propagation for Real-time Video Object Detection

[supplementary material]

Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection

[supplementary material]

SOLAR: Second-Order Loss and Attention for Image Retrieval

[supplementary material]

Fixing Localization Errors to Improve Image Classification

PatchPerPix for Instance Segmentation

[supplementary material]

Attend and Segment: Attention Guided Active Semantic Segmentation

[supplementary material]

Accelerating CNN Training by Pruning Activation Gradients

[supplementary material]

Global and Local Enhancement Networks for Paired and Unpaired Image Enhancement

[supplementary material]

Probabilistic Anchor Assignment with IoU Prediction for Object Detection

[supplementary material]

Eyeglasses 3D shape reconstruction from a single face image

[supplementary material]

Temporal Complementary Learning for Video Person Re-Identification

HoughNet: Integrating near and long-range evidence for bottom-up object detection

[supplementary material]

Graph Wasserstein Correlation Analysis for Movie Retrieval

[supplementary material]

Context-Aware RCNN: A Baseline for Action Detection in Videos

Full-Time Monocular Road Detection Using Zero-Distribution Prior of Angle of Polarization

[supplementary material]

A Flexible Recurrent Residual Pyramid Network for Video Frame Interpolation

[supplementary material]

Learning Enriched Features for Real Image Restoration and Enhancement

[supplementary material]

Detail Preserved Point Cloud Completion via Separated Feature Aggregation

[supplementary material]

LabelEnc: A New Intermediate Supervision Method for Object Detection

[supplementary material]

Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets

[supplementary material]

PAMS: Quantized Super-Resolution via Parameterized Max Scale

SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds

[supplementary material]

OID: Outlier Identifying and Discarding in Blind Image Deblurring

[supplementary material]

Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors

[supplementary material]

Enhanced Sparse Model for Blind Deblurring

[supplementary material]

SumGraph: Video Summarization via Recursive Graph Modeling

[supplementary material]

Feature Normalized Knowledge Distillation for Image Classification

A Metric Learning Reality Check

[supplementary material]

FTL: A universal framework for training low-bit DNNs via Feature Transfer

XingGAN for Person Image Generation

[supplementary material]

GATCluster: Self-Supervised Gaussian-Attention Network for Image Clustering

[supplementary material]

VCNet: A Robust Approach to Blind Image Inpainting

[supplementary material]

Learning to Predict Context-adaptive Convolution for Semantic Segmentation

EfficientFCN: Holistically-guided Decoding for Semantic Segmentation

GroSS: Group-Size Series Decomposition for Grouped Architecture Search

[supplementary material]

Efficient Adversarial Attacks for Visual Object Tracking

[supplementary material]

Globally-Optimal Event Camera Motion Estimation

[supplementary material]

Weakly-supervised Learning of Human Dynamics

[supplementary material]

Journey Towards Tiny Perceptual Super-Resolution

[supplementary material]

What makes fake images detectable? Understanding properties that generalize

[supplementary material]

Embedding Propagation: Smoother Manifold for Few-Shot Classification

[supplementary material]

Category Level Object Pose Estimation via Neural Analysis-by-Synthesis

[supplementary material]

High-Fidelity Synthesis with Disentangled Representation

[supplementary material]

PL₁P - Point-line Minimal Problems under Partial Visibility in Three Views

[supplementary material]

Prediction and Recovery for Adaptive Low-Resolution Person Re-Identification

[supplementary material]

Learning Canonical Representations for Scene Graph to Image Generation

[supplementary material]

Adversarial Robustness on In- and Out-Distribution Improves Explainability

[supplementary material]

Deformable Style Transfer

[supplementary material]

Aligning Videos in Space and Time

[supplementary material]

Neural Wireframe Renderer: Learning Wireframe to Image Translations

[supplementary material]

RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax

Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction

[supplementary material]

Determining the Relevance of Features for Deep Neural Networks

[supplementary material]

Weakly Supervised Semantic Segmentation with Boundary Exploration

GANHopper: Multi-Hop GAN for Unsupervised Image-to-Image Translation

[supplementary material]

DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild

Multi-view adaptive graph convolutions for graph classification

Instance Adaptive Self-Training for Unsupervised Domain Adaptation

[supplementary material]

Weight Decay Scheduling and Knowledge Distillation for Active Learning

HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs

Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning

[supplementary material]

Geometry Constrained Weakly Supervised Object Localization

[supplementary material]

Duality Diagram Similarity: a generic framework for initialization selection in task transfer learning

[supplementary material]

OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering

[supplementary material]

Mining self-similarity: Label super-resolution with epitomic representations

[supplementary material]

AE-OT-GAN: Training GANs from data specific latent distribution

[supplementary material]

Null-sampling for Interpretable and Fair Representations

[supplementary material]

Guiding Monocular Depth Estimation Using Depth-Attention Volume

[supplementary material]

Tracking Emerges by Looking Around Static Scenes, with Neural 3D Mapping

[supplementary material]

Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer

[supplementary material]

BézierSketch: A generative model for scalable vector sketches

[supplementary material]

Semantic Relation Preserving Knowledge Distillation for Image-to-Image Translation

[supplementary material]

Domain Adaptation Through Task Distillation

[supplementary material]

PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning

[supplementary material]

More Classifiers, Less Forgetting: A Generic Multi-classifier Paradigm for Incremental Learning

[supplementary material]

Extending and Analyzing Self-Supervised Learning Across Domains

[supplementary material]

Multi-Source Open-Set Deep Adversarial Domain Adaptation

[supplementary material]

Neural Batch Sampling with Reinforcement Learning for Semi-Supervised Anomaly Detection

[supplementary material]

LEMMA: A Multi-view Dataset for LEarning Multi-agent Multi-task Activities

[supplementary material]

Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images

Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion

[supplementary material]

Proposal-based Video Completion

[supplementary material]

HGNet: Hybrid Generative Network for Zero-shot Domain Adaptation

Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding

DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks

[supplementary material]

All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling

[supplementary material]

A Broader Study of Cross-Domain Few-Shot Learning

[supplementary material]

Practical Poisoning Attacks on Neural Networks

[supplementary material]

Unsupervised Domain Adaptation in the Dissimilarity Space for Person Re-identification

Learn distributed GAN with Temporary Discriminators

[supplementary material]

SemifreddoNets: Partially Frozen Neural Networks for Efficient Computer Vision Systems

Improving Adversarial Robustness by Enforcing Local and Global Compactness

[supplementary material]

TopoAL: An Adversarial Learning Approach for Topology-Aware Road Segmentation

[supplementary material]

Channel selection using Gumbel Softmax

[supplementary material]

Exploiting Temporal Coherence for Self-Supervised One-shot Video Re-identification

[supplementary material]

An Efficient Training Framework for Reversible Neural Architectures

Box2Seg: Attention Weighted Loss and Discriminative Feature Learning for Weakly Supervised Segmentation

[supplementary material]

FreeCam3D: Snapshot Structured Light 3D with Freely-Moving Cameras

[supplementary material]

One-Pixel Signature: Characterizing CNN Models for Backdoor Detection

Learning to Transfer Learn: Reinforcement Learning-Based Selection for Adaptive Transfer Learning

[supplementary material]

Structure-Aware Generation Network for Recipe Generation from Images

A Simple and Effective Framework for Pairwise Deep Metric Learning

[supplementary material]

Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner

[supplementary material]

A Recurrent Transformer Network for Novel View Action Synthesis

[supplementary material]

Multi-view Action Recognition using Cross-view Video Prediction

[supplementary material]

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation

SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction

[supplementary material]

Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation

[supplementary material]

Efficient Outdoor 3D Point Cloud Semantic Segmentation for Critical Road Objects and Distributed Contexts

Attributional Robustness Training using Input-Gradient Spatial Alignment

[supplementary material]

Reducing the Sim-to-Real Gap for Event Cameras

[supplementary material]

Spatial Geometric Reasoning for Room Layout Estimation via Deep Reinforcement Learning

Learning Data Augmentation Strategies for Object Detection

[supplementary material]

DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search

A Closer Look at Generalisation in RAVEN

[supplementary material]

Supervised Edge Attention Network for Accurate Image Instance Segmentation

Discriminative Partial Domain Adversarial Network

[supplementary material]

Differentiable Programming for Hyperspectral Unmixing using a Physics-based Dispersion Model

[supplementary material]

Deep Cross-species Feature Learning for Animal Face Recognition via Residual Interspecies Equivariant Network

Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes

[supplementary material]

Sound2Sight: Generating Visual Dynamics from Sound and Context

[supplementary material]

3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection

NoiseRank: Unsupervised Label Noise Reduction with Dependence Models

Fast Adaptation to Super-Resolution Networks via Meta-Learning

TP-LSD: Tri-Points Based Line Segment Detector

[supplementary material]

SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation

[supplementary material]

An Attention-driven Two-stage Clustering Method for Unsupervised Person Re-Identification

[supplementary material]

Toward Fine-grained Facial Expression Manipulation

Adaptive Object Detection with Dual Multi-Label Prediction

Table Structure Recognition using Top-Down and Bottom-Up Cues

[supplementary material]

Novel View Synthesis on Unpaired Data by Conditional Deformable Variational Auto-Encoder

[supplementary material]

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments

[supplementary material]

Boundary Content Graph Neural Network for Temporal Action Proposal Generation

[supplementary material]

Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition

[supplementary material]

VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval

Attention-Based Query Expansion Learning

Interpretable Foreground Object Search As Knowledge Distillation

Improving Knowledge Distillation via Category Structure

High Resolution Zero-Shot Domain Adaptation of Synthetically Rendered Face Images

[supplementary material]

Attentive Prototype Few-shot Learning with Capsule Network-based Embedding

Weakly Supervised Instance Segmentation by Learning Annotation Consistent Instances

[supplementary material]

DA4AD: End-to-End Deep Attention-based Visual Localization for Autonomous Driving

[supplementary material]

Visual-Relation Conscious Image Generation from Structured-Text

[supplementary material]

Patch-wise Attack for Fooling Deep Neural Network

[supplementary material]

Feature Pyramid Transformer

[supplementary material]

MABNet: A Lightweight Stereo Network Based on Multibranch Adjustable Bottleneck Module

Guided Saliency Feature Learning for Person Re-identification in Crowded Scenes

Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection

Explaining Image Classifiers using Statistical Fault Localization

Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers

[supplementary material]

Learning Video Representations by Transforming Time

[supplementary material]

Unsupervised Monocular Depth Estimation for Night-time Images using Adversarial Domain Feature Adaptation

Variational Connectionist Temporal Classification

End-to-end Dynamic Matching Network for Multi-view Multi-person 3d Pose Estimation

[supplementary material]

Orderly Disorder in Point Cloud Domain

Deep Decomposition Learning for Inverse Imaging Problems

FLOT: Scene Flow on Point Clouds guided by Optimal Transport

[supplementary material]

Accurate Reconstruction of Oriented 3D Points using Affine Correspondences

[supplementary material]

Volumetric Transformer Networks

360° Camera Alignment via Segmentation

[supplementary material]

A Novel Line Integral Transform for 2D Affine-Invariant Shape Retrieval

Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks

[supplementary material]

Guided Semantic Flow

[supplementary material]

Document Structure Extraction using Prior based High Resolution Hierarchical Semantic Segmentation

[supplementary material]

Measuring the Importance of Temporal Features in Video Saliency

[supplementary material]

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

[supplementary material]

Towards Reliable Evaluation of Algorithms for Road Network Reconstruction from Aerial Images

[supplementary material]

Online Continual Learning under Extreme Memory Constraints

[supplementary material]

Learning to Cluster under Domain Shift

[supplementary material]

Defense Against Adversarial Attacks via Controlling Gradient Leaking on Embedded Manifolds

[supplementary material]

Improving Optical Flow on a Pyramid Level

[supplementary material]

Procrustean Regression Networks: Learning 3D Structure of Non-Rigid Objects from 2D Annotations

[supplementary material]

Learning to Learn Parameterized Classification Networks for Scalable Input Images

[supplementary material]

Stereo Event-based Particle Tracking Velocimetry for 3D Fluid Flow Reconstruction

[supplementary material]

Simplicial Complex based Point Correspondence between Images warped onto Manifolds

[supplementary material]

Representation Learning on Visual-Symbolic Graphs for Video Understanding

[supplementary material]

Distance-Normalized Unified Representation for Monocular 3D Object Detection

Sequential Deformation for Accurate Scene Text Detection

Where to Explore Next? ExHistCNN for History-aware Autonomous 3D Exploration

[supplementary material]

Semi-Supervised Segmentation based on Error-Correcting Supervision

Quantum-soft QUBO Suppression for Accurate Object Detection

Label-similarity Curriculum Learning

[supplementary material]

Recurrent Image Annotation With Explicit Inter-Label Dependencies

Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution

SimPose: Effectively Learning DensePose and Surface Normals of People from Simulated Data

ByeGlassesGAN: Identity Preserving Eyeglasses Removal for Face Images

[supplementary material]

Differentiable Joint Pruning and Quantization for Hardware Efficiency

Learning to Generate Customized Dynamic 3D Facial Expressions

[supplementary material]

LandscapeAR: Large Scale Outdoor Augmented Reality by Matching Photographs with Terrain Models Using Learned Descriptors

[supplementary material]

Learning Disentangled Feature Representation for Hybrid-distorted Image Restoration

Jointly De-biasing Face Recognition and Demographic Attribute Estimation

[supplementary material]

Regularized Loss for Weakly Supervised Single Class Semantic Segmentation

[supplementary material]

Spike-FlowNet: Event-based Optical Flow Estimation with Energy-Efficient Hybrid Neural Networks

[supplementary material]

Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations

[supplementary material]

Inherent Adversarial Robustness of Deep Spiking Neural Networks: Effects of Discrete Input Encoding and Non-Linear Activations

[supplementary material]

Synthesizing Coupled 3D Face Modalities by Trunk-Branch Generative Adversarial Networks

[supplementary material]

Learning to Learn Words from Visual Scenes

[supplementary material]

On Transferability of Histological Tissue Labels in Computational Pathology

[supplementary material]

Learning Actionness via Long-range Temporal Order Verification

[supplementary material]

Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays

[supplementary material]

Character Region Attention For Text Spotting

Stable Low-rank Tensor Decomposition for Compression of Convolutional Neural Network

Dual Mixup Regularized Learning for Adversarial Domain Adaptation

Robust and On-the-fly Dataset Denoising for Image Classification

[supplementary material]

Imaging Behind Occluders Using Two-Bounce Light

[supplementary material]

Improving Object Detection with Selective Self-Supervised Self-Training

[supplementary material]

Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction

[supplementary material]

Info3D: Representation Learning on 3D Objects using Mutual Information Maximization and Contrastive Learning

[supplementary material]

Adversarial Data Augmentation via Deformation Statistics

Neural Predictor for Neural Architecture Search

[supplementary material]

Learning Permutation Invariant Representations using Memory Networks

Feature Space Augmentation for Long-Tailed Data

[supplementary material]

Laying the Foundations of Deep Long-Term Crowd Flow Prediction

Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning

Fairness by Learning Orthogonal Disentangled Representations

[supplementary material]

Self-supervision with Superpixels: Training Few-shot Medical Image Segmentation without Annotation

[supplementary material]

On Diverse Asynchronous Activity Anticipation

[supplementary material]

Representative-Discriminative Learning for Open-set Land Cover Classification of Satellite Imagery

[supplementary material]

Structure-Aware Human-Action Generation

[supplementary material]

Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition

[supplementary material]

S³Net: Semantic-Aware Self-supervised Depth Estimation with Monocular Videos and Synthetic Data

[supplementary material]

Leveraging Seen and Unseen Semantic Relationships for Generative Zero-Shot Learning

[supplementary material]

Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks

[supplementary material]

UNITER: UNiversal Image-TExt Representation Learning

[supplementary material]

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

[supplementary material]

Improving Face Recognition from Hard Samples via Distribution Distillation Loss

[supplementary material]

Extract and Merge: Superpixel Segmentation with Regional Attributes

Spatial-Adaptive Network for Single Image Denoising

[supplementary material]

Physics-based Feature Dehazing Networks

Learning Surrogates via Deep Embedding

An Asymmetric Modeling for Action Assessment

[supplementary material]

High-quality Single-model Deep Video Compression with Frame-Conv3D and Multi-frame Differential Modulation

[supplementary material]

Instance-Aware Embedding for Point Cloud Instance Segmentation

Self-Paced Deep Regression Forests with Consideration on Underrepresented Examples

Manifold Projection for Adversarial Defense on Face Recognition

[supplementary material]

Weakly Supervised Learning with Side Information for Noisy Labeled Images

Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision

[supplementary material]

SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentation for Accurate Freespace Detection

[supplementary material]

Modeling the Space of Point Landmark Constrained Diffeomorphisms

PieNet: Personalized Image Enhancement Network

[supplementary material]

Rotational Outlier Identification in Pose Graphs Using Dual Decomposition

Speech-driven Facial Animation using Cascaded GANs for Learning of Motion and Texture

[supplementary material]

Solving Phase Retrieval with a Learned Reference

Dual Grid Net: Hand Mesh Vertex Regression from Single Depth Maps

This is a list of the papers and supplementary materials accepted for ECCV 2020.

It has been compiled by the European Computer Vision Association (ECVA), a non-profit organization based in Zurich that aims to promote the dissemination of information on research on computer vision theory and practice.


加藤: AI-SCHOLAR is a commentary outlet that introduces the latest articles on AI (artificial intelligence) in an easy-to-understand manner. With Japan's scientific capabilities declining and the government continuing to cut research budgets, the role of AI is not limited to technological innovation. Communicating AI technology, its applications, and the basic science that supports it to the world is important outreach, and can greatly influence society's understanding and impression of science. AI-SCHOLAR is designed to help close the gap in understanding of AI between the general public and experts, and to contribute to the integration of AI into society. We would also like to help you turn your learning and research experiences into media content and share them with society. Anyone can explain advanced and difficult matters in difficult terms, but AI-SCHOLAR pursues "readability" and "comprehensibility" by making full use of vocabulary and design in conveying information.
