2022.5.16 Vision papers

 

05-12-2022

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code
by Yong Dai et al

05-10-2022

NeRF-Editing: Geometry Editing of Neural Radiance Fields
by Yu-Jie Yuan et al

05-12-2022

Learned Vertex Descent: A New Direction for 3D Human Model Fitting
by Enric Corona et al

05-12-2022

3D Moments from Near-Duplicate Photos
by Qianqian Wang et al

05-10-2022

KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints
by Marko Mihajlovic et al

05-12-2022

Simple Open-Vocabulary Object Detection with Vision Transformers
by Matthias Minderer et al

05-11-2022

TDT: Teaching Detectors to Track without Fully Annotated Videos
by Shuzhi Yu et al

05-12-2022

Topologically-Aware Deformation Fields for Single-View 3D Reconstruction
by Shivam Duggal et al

05-11-2022

Learning to Retrieve Videos by Asking Questions
by Avinash Madasu et al

05-11-2022

Multi-Class 3D Object Detection with Single-Class Supervision
by Mao Ye et al

05-11-2022

Surface Representation for Point Clouds
by Haoxi Ran et al

05-11-2022

View Synthesis with Sculpted Neural Points
by Yiming Zuo et al

05-12-2022

Distinction Maximization Loss: Efficiently Improving Classification Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply Replacing the Loss and Calibrating
by David Macêdo et al

05-11-2022

Diverse Video Generation from a Single Video
by Niv Haim et al

05-11-2022

DISARM: Detecting the Victims Targeted by Harmful Memes
by Shivam Sharma et al

05-10-2022

Learning Visual Styles from Audio-Visual Associations
by Tingle Li et al

05-10-2022

Learning to Answer Visual Questions from Web Videos
by Antoine Yang et al

05-10-2022

Hyperparameter optimization of hybrid quantum neural networks for car classification
by Asel Sagingalieva et al

05-13-2022

A Unified Framework for Implicit Sinkhorn Differentiation
by Marvin Eisenberger et al

05-12-2022

What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics
by David M. Chan et al

05-10-2022

Explainable Deep Learning Methods in Medical Diagnosis: A Survey
by Cristiano Patrício et al

05-11-2022

RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization
by Xintao Wang et al

05-10-2022

Secure Federated Learning for Neuroimaging
by Dimitris Stripelis et al

05-10-2022

Metric Learning based Interactive Modulation for Real-World Super-Resolution
by Chong Mou et al

05-11-2022

Continuous wavelet transform of multiview images using wavelets based on voxel patterns
by Vladimir Saveljev

05-11-2022

HULC: 3D Human Motion Capture with Pose Manifold Sampling and Dense Contact Guidance
by Soshi Shimada et al

05-12-2022

Tensor Decompositions for Hyperspectral Data Processing in Remote Sensing: A Comprehensive Review
by Minghua Wang et al

05-10-2022

OTFPF: Optimal Transport-Based Feature Pyramid Fusion Network for Brain Age Estimation with 3D Overlapped ConvNeXt
by Yu Fu et al

05-11-2022

A Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials
by Chuqiao Li et al

05-12-2022

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations
by Negin Heravi et al

05-11-2022

Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling
by Yannan Nellie Wu et al

05-11-2022

Video-ReTime: Learning Temporally Varying Speediness for Time Remapping
by Simon Jenni et al

05-12-2022

Embodied vision for learning object representations
by Arthur Aubret et al

05-10-2022

Object Detection in Indian Food Platters using Transfer Learning with YOLOv4
by Deepanshu Pandey et al

05-11-2022

Leveraging Uncertainty for Deep Interpretable Classification and Weakly-Supervised Segmentation of Histology Images
by Soufiane Belharbi et al

05-10-2022

Identical Image Retrieval using Deep Learning
by Sayan Nath et al

05-12-2022

ELODI: Ensemble Logit Difference Inhibition for Positive-Congruent Training
by Yue Zhao et al

05-11-2022

DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision
by Erik Wallin et al

05-10-2022

Weakly-supervised segmentation of referring expressions
by Robin Strudel et al

05-12-2022

Image Segmentation with Topological Priors
by Shakir Showkat Sofi et al

05-11-2022

Contrastive Supervised Distillation for Continual Representation Learning
by Tommaso Barletti et al

05-12-2022

Pseudo-Label Guided Multi-Contrast Generalization for Non-Contrast Organ-Aware Segmentation
by Ho Hin Lee et al

05-12-2022

F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks
by Xintian Wu et al

05-10-2022

An Efficient Calculation of Quaternion Correlation of Signals and Color Images
by Artyom M. Grigoryan et al

05-10-2022

Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection
by Otavio Braga et al

05-10-2022

Using Deep Learning-based Features Extracted from CT scans to Predict Outcomes in COVID-19 Patients
by Sai Vidyaranya Nuthalapati et al

05-12-2022

Accounting for the Sequential Nature of States to Learn Features for Reinforcement Learning
by Nathan Michlo et al

05-12-2022

Infrared Invisible Clothing: Hiding from Infrared Detectors at Multiple Angles in Real World
by Xiaopei Zhu et al

05-13-2022

Local Attention Graph-based Transformer for Multi-target Genetic Alteration Prediction
by Daniel Reisenbüchler et al

05-11-2022

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition
by Otavio Braga et al

05-11-2022

Scene Consistency Representation Learning for Video Scene Segmentation
by Haoqian Wu et al

05-12-2022

Economical Precise Manipulation and Auto Eye-Hand Coordination with Binocular Visual Reinforcement Learning
by Yiwen Chen et al

05-13-2022

StyLandGAN: A StyleGAN based Landscape Image Synthesis using Depth-map
by Gunhee Lee et al

05-10-2022

The Impact of Partial Occlusion on Pedestrian Detectability
by Shane Gilroy et al

05-10-2022

Non-Isometric Shape Matching via Functional Maps on Landmark-Adapted Bases
by Mikhail Panine et al

05-12-2022

Test-time Fourier Style Calibration for Domain Generalization
by Xingchen Zhao et al

05-12-2022

SimCPSR: Simple Contrastive Learning for Paper Submission Recommendation System
by Duc H. Le et al

05-11-2022

CNN-LSTM Based Multimodal MRI and Clinical Data Fusion for Predicting Functional Outcome in Stroke Patients
by Nima Hatami et al

05-12-2022

Smooth-Reduce: Leveraging Patches for Improved Certified Robustness
by Ameya Joshi et al

05-12-2022

Delving into High-Quality Synthetic Face Occlusion Segmentation Datasets
by Kenny T. R. Voo et al

05-12-2022

Localized Vision-Language Matching for Open-vocabulary Object Detection
by Maria A. Bravo et al

05-12-2022

Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection
by Mingyu Yang et al

05-10-2022

Robust Medical Image Classification from Noisy Labeled Data with Global and Local Representation Guided Co-training
by Cheng Xue et al

05-10-2022

Reduce Information Loss in Transformers for Pluralistic Image Inpainting
by Qiankun Liu et al

05-10-2022

UNITS: Unsupervised Intermediate Training Stage for Scene Text Detection
by Youhui Guo et al

05-10-2022

MNet: Rethinking 2D/3D Networks for Anisotropic Medical Image Segmentation
by Zhangfu Dong et al

05-10-2022

Disentangling A Single MR Modality
by Lianrui Zuo et al

05-10-2022

An asynchronous event-based algorithm for periodic signals
by David El-Chai Ben-Ezra et al

05-11-2022

S3E-GNN: Sparse Spatial Scene Embedding with Graph Neural Networks for Camera Relocalization
by Ran Cheng et al

05-12-2022

Weakly-Supervised Action Detection Guided by Audio Narration
by Keren Ye et al

05-10-2022

A Closer Look at Blind Super-Resolution: Degradation Models, Baselines, and Performance Upper Bounds
by Wenlong Zhang et al

05-13-2022

Comparison of attention models and post-hoc explanation methods for embryo stage identification: a case study
by Tristan Gomez et al

05-13-2022

The Effectiveness of Temporal Dependency in Deepfake Video Detection
by Will Rowan et al

05-12-2022

Deep Decomposition and Bilinear Pooling Network for Blind Night-Time Image Quality Evaluation
by Qiuping Jiang et al

05-10-2022

Spatial Monitoring and Insect Behavioural Analysis Using Computer Vision for Precision Pollination
by Malika Nisal Ratnayake et al

05-12-2022

D3T-GAN: Data-Dependent Domain Transfer GANs for Few-shot Image Generation
by Xintian Wu et al

05-12-2022

Enhanced Single-shot Detector for Small Object Detection in Remote Sensing Images
by Pourya Shamsolmoali et al

05-11-2022

A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection
by Otavio Braga et al

05-12-2022

TaDeR: A New Task Dependency Recommendation for Project Management Platform
by Quynh Nguyen et al

05-12-2022

Group R-CNN for Weakly Semi-supervised Object Detection with Points
by Shilong Zhang et al

05-12-2022

Ray Priors through Reprojection: Improving Neural Radiance Fields for Novel View Extrapolation
by Jian Zhang et al

05-13-2022

RTMaps-based Local Dynamic Map for multi-ADAS data fusion
by Marcos Nieto et al

05-10-2022

Assessing Streamline Plausibility Through Randomized Iterative Spherical-Deconvolution Informed Tractogram Filtering
by Antonia Hain et al

05-12-2022

Dynamic Dense RGB-D SLAM using Learning-based Visual Odometry
by Shihao Shen et al

05-12-2022

Building Facade Parsing R-CNN
by Sijie Wang et al

05-10-2022

Automatic Detection of Microaneurysms in OCT Images Using Bag of Features
by Elahe Sadat Kazemi Nasab et al

05-10-2022

WG-VITON: Wearing-Guide Virtual Try-On for Top and Bottom Clothes
by Soonchan Park et al

05-13-2022

Blind Image Inpainting with Sparse Directional Filter Dictionaries for Lightweight CNNs
by Jenny Schmalfuss et al

05-12-2022

FPSRS: A Fusion Approach for Paper Submission Recommendation System
by Son T. Huynh et al

05-13-2022

Meta Balanced Network for Fair Face Recognition
by Mei Wang et al

05-10-2022

Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains
by Haiyang Yang et al

05-11-2022

Performance of a deep learning system for detection of referable diabetic retinopathy in real clinical settings
by Verónica Sánchez-Gutiérrez et al

05-12-2022

Talking Face Generation with Multilingual TTS
by Hyoung-Kyu Song et al

05-12-2022

Deep morphological recognition of kidney stones using intra-operative endoscopic digital videos
by Vincent Estrade et al

05-11-2022

Cross-domain Few-shot Meta-learning Using Stacking
by Hongyu Wang et al

05-13-2022

Knowledge Distillation Meets Open-Set Semi-Supervised Learning
by Jing Yang et al

05-10-2022

Self-supervised regression learning using domain knowledge: Applications to improving self-supervised denoising in imaging
by Il Yong Chun et al

05-12-2022

Video-based assessment of intraoperative surgical skill
by Sanchit Hira et al

05-11-2022

Invisible-to-Visible: Privacy-Aware Human Segmentation using Airborne Ultrasound via Collaborative Learning Probabilistic U-Net
by Risako Tanigawa et al

05-11-2022

AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation
by Xu Cao et al

05-10-2022

Shadow-Aware Dynamic Convolution for Shadow Removal
by Yimin Xu et al

05-12-2022

Knowledge Distillation for Multi-Target Domain Adaptation in Real-Time Person Re-Identification
by Félix Remigereau et al

05-11-2022

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results
by Yawei Li et al

05-10-2022

Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network
by Dasong Li et al

05-11-2022

Recurrent Encoder-Decoder Networks for Vessel Trajectory Prediction with Uncertainty Estimation
by Samuele Capobianco et al

05-13-2022

Modeling Semantic Composition with Syntactic Hypergraph for Video Question Answering
by Zenan Xu et al

05-11-2022

MEWS: Real-time Social Media Manipulation Detection and Analysis
by Trenton W. Ford et al

05-13-2022

FontNet: Closing the gap to font designer performance in font synthesis
by Ammar Ul Hassan Muhammad et al

05-11-2022

RustSEG -- Automated segmentation of corrosion using deep learning
by B. Burton et al

05-13-2022

Open-Eye: An Open Platform to Study Human Performance on Identifying AI-Synthesized Faces
by Hui Guo et al

05-12-2022

PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning
by Hongbin Liu et al

05-10-2022

Accelerating the Training of Video Super-Resolution
by Lijian Lin et al

05-12-2022

Fall detection using multimodal data
by Thao V. Ha et al

05-13-2022

A Survey of Left Atrial Appendage Segmentation and Analysis in 3D and 4D Medical Images
by Hrvoje Leventić et al

05-13-2022

A microstructure estimation Transformer inspired by sparse representation for diffusion MRI
by Tianshu Zheng et al

05-12-2022

Tensor-based Emotion Editing in the StyleGAN Latent Space
by René Haas et al

05-10-2022

Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in the Wild
by Fuyan Ma et al

05-13-2022

Virtual passengers for real car solutions: synthetic datasets
by Paola Natalia Canas et al

05-12-2022

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
by Xuesong Chen et al

05-11-2022

Face Detection on Mobile: Five Implementations and Analysis
by Kostiantyn Khabarlak

05-11-2022

AutoLC: Search Lightweight and Top-Performing Architecture for Remote Sensing Image Land-Cover Classification
by Chenyu Zheng et al

05-12-2022

Target Aware Network Architecture Search and Compression for Efficient Knowledge Transfer
by S. H. Shabbeer Basha et al

05-13-2022

Unsupervised Structure-Texture Separation Network for Oracle Character Recognition
by Mei Wang et al

05-11-2022

NMR: Neural Manifold Representation for Autonomous Driving
by Unnikrishnan R. Nair et al

05-11-2022

Computational behavior recognition in child and adolescent psychiatry: A statistical and machine learning analysis plan
by Nicole N. Lønfeldt et al

05-13-2022

Contrastive Domain Disentanglement for Generalizable Medical Image Segmentation
by Ran Gu et al

05-10-2022

Transformer-based Cross-Modal Recipe Embeddings with Large Batch Training
by Jing Yang et al

05-13-2022

Self-Supervised Masking for Unsupervised Anomaly Detection and Localization
by Chaoqin Huang et al

05-11-2022

An Empirical Study Of Self-supervised Learning Approaches For Object Detection With Transformers
by Gokul Karthik Kumar et al

05-11-2022

Bi-level Alignment for Cross-Domain Crowd Counting
by Shenjian Gong et al

05-12-2022

Blueprint Separable Residual Network for Efficient Image Super-Resolution
by Zheyuan Li et al

05-11-2022

Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos
by Shuo Yang et al

05-11-2022

AFFIRM: Affinity Fusion-based Framework for Iteratively Random Motion correction of multi-slice fetal brain MRI
by Wen Shi et al

05-12-2022

Teaching Independent Parts Separately (TIPS-GAN): Improving Accuracy and Stability in Unsupervised Adversarial 2D to 3D Human Pose Estimation
by Peter Hardy et al

05-11-2022

RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation
by Pingchuan Ma et al

05-13-2022

Monocular Human Digitization via Implicit Re-projection Networks
by Min-Gyu Park et al

05-10-2022

Learning Non-target Knowledge for Few-shot Semantic Segmentation
by Yuanwei Liu et al

05-11-2022

TextMatcher: Cross-Attentional Neural Network to Compare Image and Text
by Valentina Arrigoni et al

05-11-2022

READ: Large-Scale Neural Scene Rendering for Autonomous Driving
by Zhuopeng Li et al

05-13-2022

An empirical study of CTC based models for OCR of Indian languages
by Minesh Mathew et al

05-10-2022

Salient Object Detection via Bounding-box Supervision
by Mengqi He et al

05-11-2022

Multi-Label Logo Recognition and Retrieval based on Weighted Fusion of Neural Features
by Marisa Bernabeu et al

05-11-2022

An Objective Method for Pedestrian Occlusion Level Classification
by Shane Gilroy et al

05-11-2022

Deep Depth Completion: A Survey
by Junjie Hu et al

05-11-2022

Arbitrary Shape Text Detection via Boundary Transformer
by Shi-Xue Zhang et al

05-10-2022

DcnnGrasp: Towards Accurate Grasp Pattern Recognition with Adaptive Regularizer Learning
by Xiaoqin Zhang et al

05-11-2022

Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning
by Mengshun Hu et al

05-11-2022

Review on Panoramic Imaging and Its Applications in Scene Understanding
by Shaohua Gao et al

05-13-2022

FRIH: Fine-grained Region-aware Image Harmonization
by Jinlong Peng et al

05-11-2022

Deep Learning and Computer Vision Techniques for Microcirculation Analysis: A Review
by Maged Abdalla Helmy Mohamed Abdou et al

05-11-2022

ReFine: Re-randomization before Fine-tuning for Cross-domain Few-shot Learning
by Jaehoon Oh et al

05-11-2022

Revisiting Random Channel Pruning for Neural Network Compression
by Yawei Li et al

05-12-2022

Real-time Virtual-Try-On from a Single Example Image through Deep Inverse Graphics and Learned Differentiable Renderers
by Robin Kips et al

05-13-2022

Slimmable Video Codec
by Zhaocheng Liu et al

05-10-2022

Deep fusion of gray level co-occurrence matrices for lung nodule classification
by Ahmed Saihood et al

05-13-2022

KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning
by Shyamgopal Karthik et al

05-13-2022

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
by Yuchao Gu et al

05-10-2022

VesNet-RL: Simulation-based Reinforcement Learning for Real-World US Probe Navigation
by Yuan Bi et al

05-12-2022

LANTERN-RD: Enabling Deep Learning for Mitigation of the Invasive Spotted Lanternfly
by Srivatsa Kundurthy

05-13-2022

Multi-encoder Network for Parameter Reduction of a Kernel-based Interpolation Architecture
by Issa Khalifeh et al

05-13-2022

Scribble2D5: Weakly-Supervised Volumetric Image Segmentation via Scribble Annotations
by Qiuhui Chen et al

05-12-2022

Overparameterization Improves StyleGAN Inversion
by Yohan Poirier-Ginter et al

05-10-2022

Few-Shot Image Classification Benchmarks are Too Far From Reality: Build Back Better with Semantic Task Sampling
by Etienne Bennequin et al

05-10-2022

On Scale Space Radon Transform, Properties and Image Reconstruction
by Nafaa Nacereddine et al

05-10-2022

Student Collaboration Improves Self-Supervised Learning: Dual-Loss Adaptive Masked Autoencoder for Brain Cell Image Analysis
by Son T. Ly et al

 
Craig Smith