ECCV 2020 Accepted Papers List
The 2020 European Conference on Computer Vision (ECCV 2020), which took place August 24-27, 2020, is a conference in the field of image analysis.
Quaternion Equivariant Capsule Networks for 3D Point Clouds
DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural Architecture Search
Describing Textures using Natural Language
AiR: Attention with Reasoning Capability
Self6D: Self-Supervised Monocular 6D Object Pose Estimation
Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation
House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation
Crowdsampling the Plenoptic Function
VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment
End-to-End Object Detection with Transformers
DeepSFM: Structure From Motion Via Deep Bundle Adjustment
Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
Segment as Points for Efficient Online Multi-Object Tracking and Segmentation
Conditional Convolutions for Instance Segmentation
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
Privacy Preserving Structure-from-Motion
Rewriting a Deep Generative Model
Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
Long-term Human Motion Prediction with Scene Context
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images
Learning and Aggregating Deep Local Descriptors for Instance-level Recognition
A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem
Learn to Recover Visible Color for Video Surveillance in a Day
Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images
Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation
BorderDet: Border Feature for Dense Object Detection
Regularization with Latent Space Virtual Adversarial Training
Du²Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels
Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning
Targeted Attack for Deep Hashing based Retrieval
Gradient Centralization: A New Optimization Technique for Deep Neural Networks
Content-Aware Unsupervised Deep Homography Estimation
Multi-View Optimization of Local Feature Geometry
The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization
Learning Stereo from Single Images
Prototype Rectification for Few-Shot Learning
Learning Feature Descriptors using Camera Pose Supervision
Semantic Flow for Fast and Accurate Scene Parsing
Appearance Consensus Driven Self-Supervised Human Mesh Recovery
Aligning and Projecting Images to Class-conditional Generative Networks
Suppress and Balance: A Simple Gated Network for Salient Object Detection
Visual Memorability for Robotic Interestingness via Unsupervised Online Learning
Post-Training Piecewise Linear Quantization for Deep Neural Networks
Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification
In-Home Daily-Life Captioning Using Radio Signals
Self-Challenging Improves Cross-Domain Generalization
A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
Multitask Learning Strengthens Adversarial Robustness
S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search
Improving Deep Video Compression by Resolution-adaptive Flow Coding
Motion Capture from Internet Videos
Appearance-Preserving 3D Convolution for Video-based Person Re-identification
Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation
Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures
Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling
Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction
Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network
Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation
CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image
Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
Domain-invariant Stereo Matching Networks
Content Adaptive and Error Propagation Aware Deep Video Compression
Towards Automated Testing and Robustification by Semantic Adversarial Data Generation
Adversarial Generative Grammars for Human Activity Prediction
GDumb: A Simple Approach that Questions Our Progress in Continual Learning
Learning Lane Graph Representations for Motion Forecasting
What Matters in Unsupervised Optical Flow
Synthesis and Completion of Facades from Satellite Imagery
Mapillary Planet-Scale Depth Dataset
V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction
Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters
EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning
Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation
Cross-Domain Cascaded Deep Translation
“Look Ma, no landmarks!” – Unsupervised, Model-based Dense Face Alignment
Online Invariance Selection for Local Feature Descriptors
Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations
TextCaps: a Dataset for Image Captioning with Reading Comprehension
It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction
Learning What to Learn for Video Object Segmentation
SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing
LIMP: Learning Latent Shape Representations with Metric Preservation Priors
Unsupervised Sketch to Photo Synthesis
A Simple Way to Make Neural Networks Robust Against Diverse Image Corruptions
SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification
Hierarchical Face Aging through Disentangled Latent Characteristics
Hybrid Models for Open Set Recognition
TopoGAN: A Topology-Aware Generative Adversarial Network
Learning to Localize Actions from Moments
ForkGAN: Seeing into the Rainy Night
TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning
ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval
TSIT: A Simple and Versatile Framework for Image-to-Image Translation
ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices
HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve
A Unified Framework of Surrogate Loss by Refactoring and Interpolation
Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images
Memory-augmented Dense Predictive Coding for Video Representation Learning
PointMixup: Augmentation for Point Clouds
Identity-Guided Human Semantic Parsing for Person Re-Identification
Learning Gradient Fields for Shape Generation
COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder
Corner Proposal Network for Anchor-free, Two-stage Object Detection
PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Learning Delicate Local Representations for Multi-Person Pose Estimation
Learning to Plan with Uncertain Topological Maps
Neural Design Network: Graphic Layout Generation with Constraints
Learning Open Set Network with Discriminative Reciprocal Points
Convolutional Occupancy Networks
Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry
TIDE: A General Toolbox for Identifying Object Detection Errors
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation
Circumventing Outliers of AutoAugment with Knowledge Distillation
S2DNet: Learning Image Features for Accurate Sparse-to-Dense Matching
RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving
Video Object Segmentation with Episodic Graph Memory Networks
Rethinking Bottleneck Structure for Efficient Mobile Network Design
Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks
Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach
REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets
Contrastive Learning for Weakly Supervised Phrase Grounding
Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors
TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images
Semi-Siamese Training for Shallow Face Learning
GAN Slimming: All-in-One GAN Compression by A Unified Optimization Framework
Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition
Binarized Neural Network for Single Image Super Resolution
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation
Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
Hamiltonian Dynamics for Real-World Shape Interpolation
Learning to Scale Multilingual Representations for Vision-Language Tasks
Multi-modal Transformer for Video Retrieval
Feature Representation Matters: End-to-End Learning for Reference-based Image Super-resolution
RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera
Surface Normal Estimation of Tilted Images via Spatial Rectifier
Multimodal Shape Completion via Conditional Generative Adversarial Networks
Generative Sparse Detection Networks for 3D Single-shot Object Detection
Grounded Situation Recognition
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Unpaired Learning of Deep Image Denoising
Self-supervising Fine-grained Region Similarities for Large-scale Image Localization
Rotationally-Temporally Consistent Novel View Synthesis of Human Performance Video
Side-Aware Boundary Localization for More Precise Object Detection
SF-Net: Single-Frame Supervision for Temporal Action Localization
Negative Margin Matters: Understanding Margin in Few-shot Classification
Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References
CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis
Transporting Labels via Hierarchical Optimal Transport for Semi-Supervised Learning
MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning
Learning to Factorize and Relight a City
Region Graph Embedding Network for Zero-Shot Learning
GRAB: A Dataset of Whole-Body Human Grasping of Objects
DEMEA: Deep Mesh Autoencoders for Non-Rigidly Deforming Objects
RANSAC-Flow: Generic Two-stage Image Alignment
Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images
Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application
MovieNet: A Holistic Dataset for Movie Understanding
Short-Term and Long-Term Context Aggregation Network for Video Inpainting
DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization
Face Super-Resolution Guided by 3D Facial Priors
Are Labels Necessary for Neural Architecture Search?
BLSM: A Bone-Level Skinned Model of the Human Mesh
Associative Alignment for Few-shot Image Classification
Cyclic Functional Mapping: Self-supervised Correspondence between Non-isometric Deformable Shapes
View-Invariant Probabilistic Embedding for Human Pose
Contact and Human Dynamics from Monocular Video
PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation
Points2Surf: Learning Implicit Surfaces from Point Clouds
Few-Shot Scene-Adaptive Anomaly Detection
Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting
Entropy Minimisation Framework for Event-based Vision Model Estimation
PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments
TENet: Triple Excitation Network for Video Salient Object Detection
Deep Feedback Inverse Problem Solver
Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
Hallucinating Visual Instances in Total Absentia
Weakly-supervised 3D Shape Completion in the Wild
DTVNet: Dynamic Time-lapse Video Generation via Single Still Image
CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss
Collaborative Video Object Segmentation by Foreground-Background Integration
Adaptive Margin Diversity Regularizer for handling Data Imbalance in Zero-Shot SBIR
ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation
Calibration-free Structure-from-Motion with Calibrated Radial Trifocal Tensors
Occupancy Anticipation for Efficient Exploration and Navigation
Unified Image and Video Saliency Modeling
TAO: A Large-Scale Benchmark for Tracking Any Object
A Generalization of Otsu’s Method and Minimum Error Thresholding
A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks
Big Transfer (BiT): General Visual Representation Learning
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
Few-shot Action Recognition with Permutation-invariant Attention
Character Grounding and Re-Identification in Story of Videos and Text Descriptions
AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling
Learning Visual Context by Comparison
Large Scale Holistic Video Understanding
Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Predicting Visual Overlap of Images Through Interpretable Non-Metric Box Embeddings
Connecting Vision and Language with Localized Narratives
Adversarial T-shirt! Evading Person Detectors in A Physical World
Bounding-box Channels for Visual Relationship Detection
Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion
SRFlow: Learning the Super-Resolution Space with Normalizing Flow
DeepGMR: Learning Latent Gaussian Mixture Models for Registration
Active Perception using Light Curtains for Autonomous Driving
Invertible Neural BRDF for Object Inverse Rendering
Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network
Practical Deep Raw Image Denoising on Mobile Devices
SoundSpaces: Audio-Visual Navigation in 3D Environments
Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
Erasing Appearance Preservation in Optimization-based Smoothing
Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler
Guided Deep Decoder: Unsupervised Image Pair Fusion
Filter Style Transfer between Photos
Dynamic Group Convolution for Accelerating Convolutional Neural Networks
RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering
Object-Contextual Representations for Semantic Segmentation
Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring
Joint Semantic Instance Segmentation on Graphs with the Semantic Mutex Watershed
Photon-Efficient 3D Imaging with A Non-Local Neural Network
GeLaTO: Generative Latent Textured Objects
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Directional Temporal Modeling for Action Recognition
Shonan Rotation Averaging: Global Optimality by Surfing SO(p)^n
Semantic Curiosity for Active Visual Learning
ProgressFace: Scale-Aware Progressive Learning for Face Detection
CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos
Modeling the Effects of Windshield Refraction for Camera Calibration
PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
Visual Relation Grounding in Videos
Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
Controlling Style and Semantics in Weakly-Supervised Image Generation
Jointly learning visual motion and confidence from local patches in event cameras
SODA: Story Oriented Dense Video Captioning Evaluation Framework
Sketch-Guided Object Localization in Natural Images
A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement
STAR: Sparse Trained Articulated Human Body Regressor
Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer
Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians
Learning 3D Part Assembly from a Single Image
PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
Highly Efficient Salient Object Detection with 100K Parameters
HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing
Lifespan Age Transformation Synthesis
Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation
Simulating Content Consistent Vehicle Datasets with Attribute Descent
Multiview Detection with Feature Perspective Transformation
Learning Object Relation Graph and Tentative Policy for Visual Navigation
Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition
Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning
Inducing Optimal Attribute Representations for Conditional GANs
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Image-to-Voxel Model Translation for 3D Scene Reconstruction and Segmentation
Consistency Guided Scene Flow Estimation
Autoregressive Unsupervised Image Segmentation
Controllable Image Synthesis via SegVAE
Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search
Efficient Non-Line-of-Sight Imaging from Transient Sinograms
Texture Hallucination for Large-Factor Painting Super-Resolution
Learning Progressive Joint Propagation for Human Motion Prediction
Image Stitching and Rectification for Hand-Held Cameras
ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds
The Group Loss for Deep Metric Learning
Learning Object Depth from Camera Motion and Video Object Segmentation
OnlineAugment: Online Data Augmentation with Less Domain Knowledge
Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction
Intra-class Feature Variation Distillation for Semantic Segmentation
Temporal Distinct Representation Learning for Action Recognition
Representative Graph Neural Network
Deformation-Aware 3D Model Embedding and Retrieval
Atlas: End-to-End 3D Scene Reconstruction from Posed Images
Multiple Class Novelty Detection Under Data Distribution Shift
Colorization of Depth Map via Disentanglement
Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes
GeoGraph: Graph-based multi-view object detection with geometric cues end-to-end
Localizing the Common Action Among a Few Videos
TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification
Traffic Accident Benchmark for Causality Recognition
Face Anti-Spoofing with Human Material Perception
How Can I See My Future? FvTraj: Using First-person View for Pedestrian Trajectory Prediction
Multiple Expert Brainstorming for Domain Adaptive Person Re-identification
NASA: Neural Articulated Shape Approximation
Towards Unique and Informative Captioning of Images
When Does Self-supervision Improve Few-shot Learning?
Two-branch Recurrent Network for Isolating Deepfakes in Videos
Incremental Few-Shot Meta-Learning via Indirect Discriminant Alignment
BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation
Global Distance-distributions Separation for Unsupervised Person Re-identification
Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose
ALRe: Outlier Detection for Guided Refinement
Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations
Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition
Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection
Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance
Improved Adversarial Training via Learned Optimizer
Component Divide-and-Conquer for Real-World Image Super-Resolution
Enabling Deep Residual Networks for Weakly Supervised Object Detection
Deep near-light photometric stereo for spatially varying reflectances
Learning Visual Representations with Caption Annotations
Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier
Regression of Instance Boundary by Aggregated CNN and GCN
Social Adaptive Module for Weakly-supervised Group Activity Recognition
RGB-D Salient Object Detection with Cross-Modality Modulation and Selection
RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval
Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection
Faster Person Re-Identification
Quantization Guided JPEG Artifact Correction
3PointTM: Faster Measurement of High-Dimensional Transmission Matrices
Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer
Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction
World-Consistent Video-to-Video Synthesis
GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild
Event-based Asynchronous Sparse Convolutional Networks
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
REMIND Your Neural Network to Prevent Catastrophic Forgetting
Image Classification in the Dark using Quanta Image Sensors
n-Reference Transfer Learning for Saliency Prediction
Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection
Bottom-Up Temporal Action Localization with Mutual Regularization
On Modulating the Gradient for Meta-Learning
Domain-Specific Mappings for Generative Adversarial Style Transfer
DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
DHP: Differentiable Meta Pruning via HyperNetworks
Deep Transferring Quantization
Deep Credible Metric Learning for Unsupervised Domain Adaptation Person Re-identification
Arbitrary-Oriented Object Detection with Circular Smooth Label
Learning Event-Driven Video Deblurring and Interpolation
Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation
CSCL: Critical Semantic-Consistent Learning for Unsupervised Domain Adaptation
Prototype Mixture Models for Few-shot Semantic Segmentation
Webly Supervised Image Classification with Self-Contained Confidence
Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization
Monocular 3D Object Detection via Feature Domain Adaptation
VPN: Learning Video-Pose Embedding for Activities of Daily Living
Soft Anchor-Point Object Detection
Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid
Soft Expert Reward Learning for Vision-and-Language Navigation
Part-aware Prototype Network for Few-shot Semantic Segmentation
Learning from Extrinsic and Intrinsic Supervisions for Domain Generalization
Joint Learning of Social Groups, Individuals Action and Sub-group Activities in Videos
Whole-Body Human Pose Estimation in the Wild
Relative Pose Estimation of Calibrated Cameras with Known SE(3) Invariants
Sequential Convolution and Runge-Kutta Residual Architecture for Image Compressed Sensing
Deep Hough Transform for Semantic Line Detection
Structured Landmark Detection via Topology-Adapting Deep Graph Learning
3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning
Learning to Balance Specificity and Invariance for In and Out of Domain Generalization
Contrastive Learning for Unpaired Image-to-Image Translation
DLow: Diversifying Latent Flows for Diverse Human Motion Prediction
GRNet: Gridding Residual Network for Dense Point Cloud Completion
Gait Lateral Network: Learning Discriminative and Compact Representations for Gait Recognition
Blind Face Restoration via Deep Multi-scale Component Dictionaries
Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods
Inequality-Constrained and Robust 3D Face Model Fitting
Gabor Layers Enhance Network Robustness
Conditional Image Repainting via Semantic Bridge and Piecewise Value Function
Learnable Cost Volume Using the Cayley Representation
HALO: Hardware-Aware Learning to Optimize
Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition
Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision
Domain Adaptive Semantic Segmentation Using Weak Labels
Knowledge Distillation Meets Self-Supervision
Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions
Reconstructing the Noise Variance Manifold for Image Denoising
Occlusion-Aware Depth Estimation with Adaptive Normal Constraints
VisualEchoes: Spatial Image Representation Learning through Echolocation
Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval
Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation
Spatially Aware Multimodal Transformers for TextVQA
Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector
URIE: Universal Image Enhancement for Visual Recognition in the Wild
Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning
Unpaired Image-to-Image Translation using Adversarial Consistency Loss
Discriminability Distillation in Group Representation Learning
Monocular Expressive Body Regression through Body-Driven Attention
Dual Adversarial Network: Toward Real-world Noise Removal and Noise Generation
Linguistic Structure Guided Context Modeling for Referring Image Segmentation
Federated Visual Classification with Real-World Data Distribution
Robust Re-Identification by Multiple Views Knowledge Distillation
Defocus Deblurring Using Dual-Pixel Data
RhyRNN: Rhythmic RNN for Recognizing Events in Long and Complex Videos
Weighing Counts: Sequential Crowd Counting by Reinforcement Learning
Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks
Learning to Learn with Variational Information Bottleneck for Domain Generalization
Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis
Layered Neighborhood Expansion for Incremental Multiple Graph Matching
SCAN: Learning to Classify Images without Labels
Graph convolutional networks for learning with few clean and many noisy labels
Object-and-Action Aware Model for Visual Language Navigation
A Comprehensive Study of Weight Sharing in Graph Networks for 3D Human Pose Estimation
MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution
Efficient Semantic Video Segmentation with Per-frame Inference
Increasing the Robustness of Semantic Segmentation Models with Painting-by-Numbers
Deep Spiking Neural Network: Energy Efficiency Through Time based Coding
InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling
Utilizing Patch-level Category Activation Patterns for Multiple Class Novelty Detection
Mapping in a Cycle: Sinkhorn Regularized Unsupervised Learning for Point Cloud Shapes