2022.5.9 Vision papers

 

05-04-2022

CoCa: Contrastive Captioners are Image-Text Foundation Models
by Jiahui Yu et al

05-04-2022

Sequencer: Deep LSTM for Image Classification
by Yuki Tatsunami et al

05-05-2022

GANimator: Neural Motion Synthesis from a Single Sequence
by Peizhuo Li et al

05-03-2022

Better plain ViT baselines for ImageNet-1k
by Lucas Beyer et al

05-03-2022

Subspace Diffusion Generative Models
by Bowen Jing et al

05-03-2022

Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review
by Jiaxin Li et al

05-05-2022

Language Models Can See: Plugging Visual Controls in Text Generation
by Yixuan Su et al

05-03-2022

Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
by Alex Fang et al

05-05-2022

Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
by Yining Hong et al

05-04-2022

COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles
by Jiaxun Cui et al

05-03-2022

DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks
by Shih-Yang Su et al

05-03-2022

i-Code: An Integrative and Composable Multimodal Learning Framework
by Ziyi Yang et al

05-05-2022

BlobGAN: Spatially Disentangled Scene Representations
by Dave Epstein et al

05-05-2022

Dual Octree Graph Networks for Learning Adaptive Volumetric Shape Representations
by Peng-Shuai Wang et al

05-05-2022

Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects
by Bangbang Yang et al

05-04-2022

P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision
by He Zhao et al

05-05-2022

Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics
by Sizhe Li et al

05-03-2022

Toward Modeling Creative Processes for Algorithmic Painting
by Aaron Hertzmann

05-04-2022

All You May Need for VQA are Image Captions
by Soravit Changpinyo et al

05-03-2022

Visual Commonsense in Pretrained Unimodal and Multimodal Models
by Chenyu Zhang et al

05-05-2022

Holistic Approach to Measure Sample-level Adversarial Vulnerability and its Utility in Building Trustworthy Systems
by Gaurav Kumar Nayak et al

05-05-2022

Real-time Controllable Motion Transition for Characters
by Xiangjun Tang et al

05-05-2022

Neural Jacobian Fields: Learning Intrinsic Mappings of Arbitrary Meshes
by Noam Aigerman et al

05-03-2022

End-to-End Visual Editing with a Generatively Pre-Trained Artist
by Andrew Brown et al

05-03-2022

Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks
by Xiaoyu Pan et al

05-04-2022

Compound virtual screening by learning-to-rank with gradient boosting decision tree and enrichment-based cumulative gain
by Kairi Furui et al

05-03-2022

GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping
by Pan Ji et al

05-04-2022

Video Extrapolation in Space and Time
by Yunzhi Zhang et al

05-03-2022

Cross-modal Representation Learning for Zero-shot Action Recognition
by Chung-Ching Lin et al

05-04-2022

Pik-Fix: Restoring and Colorizing Old Photo
by Runsheng Xu et al

05-03-2022

Multimodal Detection of Unknown Objects on Roads for Autonomous Driving
by Daniel Bogdoll et al

05-03-2022

BioTouchPass: Handwritten Passwords for Touchscreen Biometrics
by Ruben Tolosana et al

05-06-2022

CLIP-CLOP: CLIP-Guided Collage and Photomontage
by Piotr Mirowski et al

05-03-2022

HL-Net: Heterophily Learning Network for Scene Graph Generation
by Xin Lin et al

05-05-2022

DropTrack -- automatic droplet tracking using deep learning for microfluidic applications
by Mihir Durve et al

05-04-2022

MM-Claims: A Dataset for Multimodal Claim Detection in Social Media
by Gullal S. Cheema et al

05-03-2022

A Comprehensive Survey of Image Augmentation Techniques for Deep Learning
by Mingle Xu et al

05-03-2022

Distilling Governing Laws and Source Input for Dynamical Systems from Videos
by Lele Luan et al

05-04-2022

Spot-adaptive Knowledge Distillation
by Jie Song et al

05-04-2022

SVTS: Scalable Video-to-Speech Synthesis
by Rodrigo Mira et al

05-03-2022

Multi-view Geometry: Correspondences Refinement Based on Algebraic Properties
by Trung-Kien Le et al

05-03-2022

An Empirical Analysis of the Use of Real-Time Reachability for the Safety Assurance of Autonomous Vehicles
by Patrick Musau et al

05-03-2022

A hybrid multi-object segmentation framework with model-based B-splines for microbial single cell analysis
by Karina Ruzaeva et al

05-03-2022

Simpler is Better: off-the-shelf Continual Learning Through Pretrained Backbones
by Francesco Pelosin

05-04-2022

Generalized Knowledge Distillation via Relationship Matching
by Han-Jia Ye et al

05-03-2022

Diverse Image Captioning with Grounded Style
by Franz Klein et al

05-03-2022

Outdoor Monocular Depth Estimation: A Research Review
by Pulkit Vyas et al

05-03-2022

BiOcularGAN: Bimodal Synthesis and Annotation of Ocular Images
by Darian Tomašević et al

05-05-2022

View-labels Are Indispensable: A Multifacet Complementarity Study of Multi-view Clustering
by Chuanxing Geng et al

05-05-2022

One Picture is Worth a Thousand Words: A New Wallet Recovery Process
by Hervé Chabanne et al

05-03-2022

Episodic Memory Question Answering
by Samyak Datta et al

05-05-2022

Do Different Deep Metric Learning Losses Lead to Similar Learned Features?
by Konstantin Kobs et al

05-03-2022

Compact Neural Networks via Stacking Designed Basic Units
by Weichao Lan et al

05-04-2022

Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification
by Haowei Zhu et al

05-05-2022

Neural 3D Scene Reconstruction with the Manhattan-world Assumption
by Haoyu Guo et al

05-03-2022

Copy Motion From One to Another: Fake Motion Video Generation
by Zhenguang Liu et al

05-05-2022

OCR Synthetic Benchmark Dataset for Indic Languages
by Naresh Saini et al

05-03-2022

RAFT-MSF: Self-Supervised Monocular Scene Flow using Recurrent Optimizer
by Bayram Bayramli et al

05-05-2022

Biologically inspired deep residual networks for computer vision applications
by Prathibha Varghese et al

05-05-2022

Intra and Cross-spectrum Iris Presentation Attack Detection in the NIR and Visible Domains Using Attention-based and Pixel-wise Supervised Learning
by Meiling Fang et al

05-03-2022

Automatic Segmentation of Aircraft Dents in Point Clouds
by Pasquale Lafiosca et al

05-03-2022

Cross-View Cross-Scene Multi-View Crowd Counting
by Qi Zhang et al

05-04-2022

RecipeSnap -- a lightweight image-to-recipe model
by Jianfa Chen et al

05-05-2022

Are GAN-based Morphs Threatening Face Recognition?
by Eklavya Sarkar et al

05-04-2022

EllSeg-Gen, towards Domain Generalization for head-mounted eyetracking
by Rakshit S. Kothari et al

05-04-2022

Self-supervised learning unveils morphological clusters behind lung cancer types and prognosis
by Adalberto Claudio Quiros et al

05-03-2022

3D Semantic Scene Perception using Distributed Smart Edge Sensors
by Simon Bultmann et al

05-05-2022

YOLOPose: Transformer-based Multi-Object 6D Pose Estimation using Keypoint Regression
by Arash Amini et al

05-05-2022

Declaration-based Prompt Tuning for Visual Question Answering
by Yuhang Liu et al

05-03-2022

Point Cloud Semantic Segmentation using Multi Scale Sparse Convolution Neural Network
by Yunzheng Su

05-04-2022

Self-Taught Metric Learning without Labels
by Sungyeon Kim et al

05-03-2022

Frequency-Selective Geometry Upsampling of Point Clouds
by Viktoria Heimann et al

05-04-2022

Towards Real-time Traffic Sign and Traffic Light Detection on Embedded Systems
by Oshada Jayasinghe et al

05-03-2022

Sampling-free obstacle gradients and reactive planning in Neural Radiance Fields (NeRF)
by Michael Pantic et al

05-04-2022

Compressive Ptychography using Deep Image and Generative Priors
by Semih Barutcu et al

05-04-2022

Hypercomplex Image-to-Image Translation
by Eleonora Grassucci et al

05-05-2022

What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning
by Jae Hee Lee et al

05-05-2022

MMINR: Multi-frame-to-Multi-frame Inference with Noise Resistance for Precipitation Nowcasting with Radar
by Feng Sun et al

05-03-2022

Cross-Domain Object Detection with Mean-Teacher Transformer
by Jinze Yu et al

05-04-2022

Neuroevolutionary Multi-objective approaches to Trajectory Prediction in Autonomous Vehicles
by Fergal Stapleton et al

05-04-2022

Evaluating Transferability for Covid 3D Localization Using CT SARS-CoV-2 segmentation models
by Constantine Maganaris et al

05-05-2022

Exploiting Correspondences with All-pairs Correlations for Multi-view Depth Estimation
by Kai Cheng et al

05-04-2022

Prediction of fish location by combining fisheries data and sea bottom temperature forecasting
by Matthieu Ospici et al

05-04-2022

Homography-Based Loss Function for Camera Pose Regression
by Clémentin Boittiaux et al

05-03-2022

Effect of Random Histogram Equalization on Breast Calcification Analysis Using Deep Learning
by Adarsh Bhandary Panambur et al

05-03-2022

A Bidirectional Conversion Network for Cross-Spectral Face Recognition
by Zhicheng Cao et al

05-05-2022

Hardware System Implementation for Human Detection using HOG and SVM Algorithm
by Van-Cam Nguyen et al

05-05-2022

Parametric Reshaping of Portraits in Videos
by Xiangjun Tang et al

05-05-2022

Text to artistic image generation
by Qinghe Tian et al

05-05-2022

ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation
by Arthur Moreau et al

05-04-2022

Surface Reconstruction from Point Clouds: A Survey and a Benchmark
by Zhangjin Huang et al

05-06-2022

QLEVR: A Diagnostic Dataset for Quantificational Language and Elementary Visual Reasoning
by Zechen Li et al

05-03-2022

MS Lesion Segmentation: Revisiting Weighting Mechanisms for Federated Learning
by Dongnan Liu et al

05-04-2022

Zero-Episode Few-Shot Contrastive Predictive Coding: Solving intelligence tests without prior training
by T. Barak et al

05-05-2022

Text Detection on Technical Drawings for the Digitization of Brown-field Processes
by Tobias Schlagenhauf et al

05-03-2022

Masked Generative Distillation
by Zhendong Yang et al

05-03-2022

RU-Net: Regularized Unrolling Network for Scene Graph Generation
by Xin Lin et al

05-03-2022

Multitask Network for Joint Object Detection, Semantic Segmentation and Human Pose Estimation in Vehicle Occupancy Monitoring
by Nikolas Ebert et al

05-05-2022

The Batch Artifact Scanning Protocol: A new method using computed tomography (CT) to rapidly create three-dimensional models of objects from large collections en masse
by Katrina Yezzi-Woodley et al

05-05-2022

A Deep Reinforcement Learning Framework for Rapid Diagnosis of Whole Slide Pathological Images
by Tingting Zheng et al

05-04-2022

Dual Branch Neural Network for Sea Fog Detection in Geostationary Ocean Color Imager
by Yuan Zhou et al

05-05-2022

Gait Recognition in the Wild: A Benchmark
by Zheng Zhu et al

05-05-2022

Visually plausible human-object interaction capture from wearable sensors
by Vladimir Guzov et al

05-04-2022

Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose Recognition with Synthetic Data
by Cheng-Yen Yang et al

05-05-2022

Cross-view Transformers for real-time Map-view Semantic Segmentation
by Brady Zhou et al

05-05-2022

Hybrid CNN Based Attention with Category Prior for User Image Behavior Modeling
by Xin Chen et al

05-05-2022

Activity Detection in Long Surgical Videos using Spatio-Temporal Models
by Aidean Sharghi et al

05-05-2022

Large Scale Transfer Learning for Differentially Private Image Classification
by Harsh Mehta et al

05-04-2022

A Bayesian Detect to Track System for Robust Visual Object Tracking and Semi-Supervised Model Learning
by Yan Shen et al

05-03-2022

FedMix: Mixed Supervised Federated Learning for Medical Image Segmentation
by Jeffry Wicaksana et al

05-03-2022

End2End Multi-View Feature Matching using Differentiable Pose Optimization
by Barbara Roessle et al

05-04-2022

Scene Clustering Based Pseudo-labeling Strategy for Multi-modal Aerial View Object Classification
by Jun Yu et al

05-04-2022

Dynamic Sparse R-CNN
by Qinghang Hong et al

05-04-2022

Domino Saliency Metrics: Improving Existing Channel Saliency Metrics with Structural Information
by Kaveena Persand et al

05-03-2022

Joint Image Compression and Denoising via Latent-Space Scalability
by Saeed Ranjbar Alvar et al

05-05-2022

Koopman pose predictions for temporally consistent human walking estimations
by Marc Mitjans et al

05-03-2022

Assessing Dataset Bias in Computer Vision
by Athiya Deviyani

05-03-2022

UCL-Dehaze: Towards Real-world Image Dehazing via Unsupervised Contrastive Learning
by Yongzhen Wang et al

05-05-2022

BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection
by Min Yang et al

05-04-2022

ANUBIS: Review and Benchmark Skeleton-Based Action Recognition Methods with a New Dataset
by Zhenyue Qin et al

05-03-2022

Os Dados dos Brasileiros sob Risco na Era da Inteligência Artificial?
by Raoni F. da S. Teixeira et al

05-04-2022

Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition
by Alejandro López-Cifuentes et al

05-03-2022

Detection of Propaganda Techniques in Visuo-Lingual Metaphor in Memes
by Sunil Gundapu et al

05-03-2022

Data-Consistent Non-Cartesian Deep Subspace Learning for Efficient Dynamic MR Image Reconstruction
by Zihao Chen et al

05-03-2022

Deep Multi-Scale U-Net Architecture and Noise-Robust Training Strategies for Histopathological Image Segmentation
by Nikhil Cherian Kurian et al

05-04-2022

ShoeRinsics: Shoeprint Prediction for Forensics with Intrinsic Decomposition
by Samia Shafique et al

05-04-2022

Mobile-URSONet: an Embeddable Neural Network for Onboard Spacecraft Pose Estimation
by Julien Posso et al

05-05-2022

Segmentation with Super Images: A New 2D Perspective on 3D Medical Image Analysis
by Ikboljon Sobirov et al

05-03-2022

Object Class Aware Video Anomaly Detection through Image Translation
by Mohammad Baradaran et al

05-04-2022

Self-Supervised Learning for Invariant Representations from Multi-Spectral and SAR Images
by Pallavi Jain et al

05-06-2022

Forget Less, Count Better: A Domain-Incremental Self-Distillation Learning Benchmark for Lifelong Crowd Counting
by Jiaqi Gao et al

05-04-2022

BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking
by Dorian Henning et al

05-03-2022

Splicing Detection and Localization In Satellite Imagery Using Conditional GANs
by Emily R. Bartusiak et al

05-04-2022

An Analysis of Generative Methods for Multiple Image Inpainting
by Coloma Ballester et al

05-06-2022

Incremental Data-Uploading for Full-Quantum Classification
by Maniraman Periyasamy et al

05-04-2022

TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
by Haodong Duan et al

05-03-2022

Synthesized Speech Detection Using Convolutional Transformer-Based Spectrogram Analysis
by Emily R. Bartusiak et al

05-05-2022

Generate and Edit Your Own Character in a Canonical View
by Jeong-gi Kwak et al

05-03-2022

SpineNetV2: Automated Detection, Labelling and Radiological Grading Of Clinical MR Scans
by Rhydian Windsor et al

05-03-2022

Smart City Intersections: Intelligence Nodes for Future Metropolises
by Zoran Kostić et al

05-05-2022

Building Brains: Subvolume Recombination for Data Augmentation in Large Vessel Occlusion Detection
by Florian Thamm et al

05-03-2022

Comparison of CoModGANs, LaMa and GLIDE for Art Inpainting- Completing M.C Escher's Print Gallery
by Lucia Cipolina-Kun et al

05-05-2022

Approximate Convex Decomposition for 3D Meshes with Collision-Aware Concavity and Tree Search
by Xinyue Wei et al

05-03-2022

Frequency Domain-Based Detection of Generated Audio
by Emily R. Bartusiak et al

05-03-2022

Application of belief functions to medical image segmentation: A review
by Ling Huang et al

05-03-2022

In Defense of Image Pre-Training for Spatiotemporal Recognition
by Xianhang Li et al

05-06-2022

Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism
by Binbin Yang et al

05-04-2022

Self-Supervised Super-Resolution for Multi-Exposure Push-Frame Satellites
by Ngoc Long Nguyen et al

05-04-2022

UnrealNAS: Can We Search Neural Architectures with Unreal Data?
by Zhen Dong et al

05-03-2022

License Plate Privacy in Collaborative Visual Analysis of Traffic Scenes
by Saeed Ranjbar Alvar et al

05-04-2022

DeepPortraitDrawing: Generating Human Body Images from Freehand Sketches
by Xian Wu et al

05-04-2022

SDF-based RGB-D Camera Tracking in Neural Scene Representations
by Leonard Bruns et al

05-04-2022

Generative Adversarial Network Based Synthetic Learning and a Novel Domain Relevant Loss Term for Spine Radiographs
by Ethan Schonfeld et al

05-04-2022

Understanding Transfer Learning for Chest Radiograph Clinical Report Generation with Modified Transformer Architectures
by Edward Vendrow et al

05-05-2022

Learn-to-Race Challenge 2022: Benchmarking Safe Learning and Cross-domain Generalisation in Autonomous Racing
by Jonathan Francis et al

05-05-2022

Invariant Content Synergistic Learning for Domain Generalization of Medical Image Segmentation
by Yuxin Kang et al

05-05-2022

AdaTriplet: Adaptive Gradient Triplet Loss with Automatic Margin Learning for Forensic Medical Image Matching
by Khanh Nguyen et al

05-05-2022

Multi-view Point Cloud Registration based on Evolutionary Multitasking with Bi-Channel Knowledge Sharing Mechanism
by Yue Wu et al

05-05-2022

Evaluating Context for Deep Object Detectors
by Osman Semih Kayhan et al

05-05-2022

Atlas-powered deep learning (ADL) -- application to diffusion weighted MRI
by Davood Karimi et al

05-06-2022

Quantification of Robotic Surgeries with Vision-Based Deep Learning
by Dani Kiyasseh et al

05-06-2022

Controlled Dropout for Uncertainty Estimation
by Mehedi Hasan et al

05-06-2022

Dual-Level Decoupled Transformer for Video Captioning
by Yiqi Gao et al

05-06-2022

Crop Type Identification for Smallholding Farms: Analyzing Spatial, Temporal and Spectral Resolutions in Satellite Imagery
by Depanshu Sani et al

05-05-2022

Revisiting Pretraining for Semi-Supervised Learning in the Low-Label Regime
by Xun Xu et al

05-06-2022

Weakly Supervised 3D Point Cloud Segmentation via Multi-Prototype Learning
by Yongyi Su et al

05-06-2022

BDIS: Bayesian Dense Inverse Searching Method for Real-Time Stereo Surgical Image Matching
by Jingwei Song et al

05-05-2022

CNN-Augmented Visual-Inertial SLAM with Planar Constraints
by Pan Ji et al

05-06-2022

A Fingerprint Detection Method by Fingerprint Ridge Orientation Check
by Kim JuSong et al

05-06-2022

Investigating and Explaining the Frequency Bias in Image Classification
by ZhiYu Lin et al

05-06-2022

Predicting Future Occupancy Grids in Dynamic Environment with Spatio-Temporal Learning
by Khushdeep Singh Mann et al

05-05-2022

Multi-mode Tensor Train Factorization with Spatial-spectral Regularization for Remote Sensing Images Recovery
by Gaohang Yu et al

05-06-2022

Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection
by Ziteng Cui et al

05-06-2022

Prompt Distribution Learning
by Yuning Lu et al

05-06-2022

Semantics-Guided Moving Object Segmentation with 3D LiDAR
by Shuo Gu et al

05-04-2022

InvNorm: Domain Generalization for Object Detection in Gastrointestinal Endoscopy
by Weichen Fan et al

05-05-2022

Generating Representative Samples for Few-Shot Classification
by Jingyi Xu et al

05-06-2022

All Grains, One Scheme (AGOS): Learning Multi-grain Instance Representation for Aerial Scene Classification
by Qi Bi et al

05-05-2022

Scene Graph Expansion for Semantics-Guided Image Outpainting
by Chiao-An Yang et al

05-03-2022

Improved Orientation Estimation and Detection with Hybrid Object Detection Networks for Automotive Radar
by Michael Ulrich et al

05-04-2022

GAN Inversion for Data Augmentation to Improve Colonoscopy Lesion Classification
by Mayank Golhar et al

05-04-2022

Immiscible Color Flows in Optimal Transport Networks for Image Classification
by Alessandro Lonardi et al

05-03-2022

Biometric Signature Verification Using Recurrent Neural Networks
by Ruben Tolosana et al

05-03-2022

Understanding Urban Water Consumption using Remotely Sensed Data
by Shaswat Mohanty et al

05-05-2022

FisheyeDistill: Self-Supervised Monocular Depth Estimation with Ordinal Distillation for Fisheye Cameras
by Qingan Yan et al

05-06-2022

MINI: Mining Implicit Novel Instances for Few-Shot Object Detection
by Yuhang Cao et al

05-06-2022

From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data
by Zhenghang Yuan et al

05-06-2022

A High-Accuracy Unsupervised Person Re-identification Method Using Auxiliary Information Mined from Datasets
by Hehan Teng et al

 
Craig Smith