05-04-2022
|
CoCa: Contrastive Captioners are Image-Text Foundation
Models
by
Jiahui Yu
et al
|
|
|
|
05-04-2022
|
Sequencer: Deep LSTM for Image Classification
by
Yuki Tatsunami
et al
|
|
|
|
05-05-2022
|
GANimator: Neural Motion Synthesis from a Single
Sequence
by
Peizhuo Li
et al
|
|
|
|
05-03-2022
|
Better plain ViT baselines for ImageNet-1k
by
Lucas Beyer
et al
|
|
|
|
05-03-2022
|
Subspace Diffusion Generative Models
by
Bowen Jing
et al
|
|
|
|
05-03-2022
|
Deep Learning in Multimodal Remote Sensing Data Fusion:
A Comprehensive Review
by
Jiaxin Li
et al
|
|
|
|
05-05-2022
|
Language Models Can See: Plugging Visual Controls in
Text Generation
by
Yixuan Su
et al
|
|
|
|
05-03-2022
|
Data Determines Distributional Robustness in
Contrastive Language Image Pre-training (CLIP)
by
Alex Fang
et al
|
|
|
|
05-05-2022
|
Fixing Malfunctional Objects With Learned Physical
Simulation and Functional Prediction
by
Yining Hong
et al
|
|
|
|
05-04-2022
|
COOPERNAUT: End-to-End Driving with Cooperative
Perception for Networked Vehicles
by
Jiaxun Cui
et al
|
|
|
|
05-03-2022
|
DANBO: Disentangled Articulated Neural Body
Representations via Graph Neural Networks
by
Shih-Yang Su
et al
|
|
|
|
05-03-2022
|
i-Code: An Integrative and Composable Multimodal
Learning Framework
by
Ziyi Yang
et al
|
|
|
|
05-05-2022
|
BlobGAN: Spatially Disentangled Scene Representations
by
Dave Epstein
et al
|
|
|
|
05-05-2022
|
Dual Octree Graph Networks for Learning Adaptive
Volumetric Shape Representations
by
Peng-Shuai Wang
et al
|
|
|
|
05-05-2022
|
Neural Rendering in a Room: Amodal 3D Understanding and
Free-Viewpoint Rendering for the Closed Scene Composed
of Pre-Captured Objects
by
Bangbang Yang
et al
|
|
|
|
05-04-2022
|
P3IV: Probabilistic Procedure Planning from
Instructional Videos with Weak Supervision
by
He Zhao
et al
|
|
|
|
05-05-2022
|
Contact Points Discovery for Soft-Body Manipulations
with Differentiable Physics
by
Sizhe Li
et al
|
|
|
|
05-03-2022
|
Toward Modeling Creative Processes for Algorithmic
Painting
by
Aaron Hertzmann
|
|
|
|
05-04-2022
|
All You May Need for VQA are Image Captions
by
Soravit Changpinyo
et al
|
|
|
|
05-03-2022
|
Visual Commonsense in Pretrained Unimodal and
Multimodal Models
by
Chenyu Zhang
et al
|
|
|
|
05-05-2022
|
Holistic Approach to Measure Sample-level Adversarial
Vulnerability and its Utility in Building Trustworthy
Systems
by
Gaurav Kumar Nayak
et al
|
|
|
|
05-05-2022
|
Real-time Controllable Motion Transition for Characters
by
Xiangjun Tang
et al
|
|
|
|
05-05-2022
|
Neural Jacobian Fields: Learning Intrinsic Mappings of
Arbitrary Meshes
by
Noam Aigerman
et al
|
|
|
|
05-03-2022
|
End-to-End Visual Editing with a Generatively
Pre-Trained Artist
by
Andrew Brown
et al
|
|
|
|
05-03-2022
|
Predicting Loose-Fitting Garment Deformations Using
Bone-Driven Motion Networks
by
Xiaoyu Pan
et al
|
|
|
|
05-04-2022
|
Compound virtual screening by learning-to-rank with
gradient boosting decision tree and enrichment-based
cumulative gain
by
Kairi Furui
et al
|
|
|
|
05-03-2022
|
GeoRefine: Self-Supervised Online Depth Refinement for
Accurate Dense Mapping
by
Pan Ji
et al
|
|
|
|
05-04-2022
|
Video Extrapolation in Space and Time
by
Yunzhi Zhang
et al
|
|
|
|
05-03-2022
|
Cross-modal Representation Learning for Zero-shot
Action Recognition
by
Chung-Ching Lin
et al
|
|
|
|
05-04-2022
|
Pik-Fix: Restoring and Colorizing Old Photos
by
Runsheng Xu
et al
|
|
|
|
05-03-2022
|
Multimodal Detection of Unknown Objects on Roads for
Autonomous Driving
by
Daniel Bogdoll
et al
|
|
|
|
05-03-2022
|
BioTouchPass: Handwritten Passwords for Touchscreen
Biometrics
by
Ruben Tolosana
et al
|
|
|
|
05-06-2022
|
CLIP-CLOP: CLIP-Guided Collage and Photomontage
by
Piotr Mirowski
et al
|
|
|
|
05-03-2022
|
HL-Net: Heterophily Learning Network for Scene Graph
Generation
by
Xin Lin
et al
|
|
|
|
05-05-2022
|
DropTrack -- automatic droplet tracking using deep
learning for microfluidic applications
by
Mihir Durve
et al
|
|
|
|
05-04-2022
|
MM-Claims: A Dataset for Multimodal Claim Detection in
Social Media
by
Gullal S. Cheema
et al
|
|
|
|
05-03-2022
|
A Comprehensive Survey of Image Augmentation Techniques
for Deep Learning
by
Mingle Xu
et al
|
|
|
|
05-03-2022
|
Distilling Governing Laws and Source Input for
Dynamical Systems from Videos
by
Lele Luan
et al
|
|
|
|
05-04-2022
|
Spot-adaptive Knowledge Distillation
by
Jie Song
et al
|
|
|
|
05-04-2022
|
SVTS: Scalable Video-to-Speech Synthesis
by
Rodrigo Mira
et al
|
|
|
|
05-03-2022
|
Multi-view Geometry: Correspondences Refinement Based
on Algebraic Properties
by
Trung-Kien Le
et al
|
|
|
|
05-03-2022
|
An Empirical Analysis of the Use of Real-Time
Reachability for the Safety Assurance of Autonomous
Vehicles
by
Patrick Musau
et al
|
|
|
|
05-03-2022
|
A hybrid multi-object segmentation framework with
model-based B-splines for microbial single cell
analysis
by
Karina Ruzaeva
et al
|
|
|
|
05-03-2022
|
Simpler is Better: off-the-shelf Continual Learning
Through Pretrained Backbones
by
Francesco Pelosin
|
|
|
|
05-04-2022
|
Generalized Knowledge Distillation via Relationship
Matching
by
Han-Jia Ye
et al
|
|
|
|
05-03-2022
|
Diverse Image Captioning with Grounded Style
by
Franz Klein
et al
|
|
|
|
05-03-2022
|
Outdoor Monocular Depth Estimation: A Research Review
by
Pulkit Vyas
et al
|
|
|
|
05-03-2022
|
BiOcularGAN: Bimodal Synthesis and Annotation of Ocular
Images
by
Darian Tomašević
et al
|
|
|
|
05-05-2022
|
View-labels Are Indispensable: A Multifacet
Complementarity Study of Multi-view Clustering
by
Chuanxing Geng
et al
|
|
|
|
05-05-2022
|
One Picture is Worth a Thousand Words: A New Wallet
Recovery Process
by
Hervé Chabanne
et al
|
|
|
|
05-03-2022
|
Episodic Memory Question Answering
by
Samyak Datta
et al
|
|
|
|
05-05-2022
|
Do Different Deep Metric Learning Losses Lead to
Similar Learned Features?
by
Konstantin Kobs
et al
|
|
|
|
05-03-2022
|
Compact Neural Networks via Stacking Designed Basic
Units
by
Weichao Lan
et al
|
|
|
|
05-04-2022
|
Dual Cross-Attention Learning for Fine-Grained Visual
Categorization and Object Re-Identification
by
Haowei Zhu
et al
|
|
|
|
05-05-2022
|
Neural 3D Scene Reconstruction with the Manhattan-world
Assumption
by
Haoyu Guo
et al
|
|
|
|
05-03-2022
|
Copy Motion From One to Another: Fake Motion Video
Generation
by
Zhenguang Liu
et al
|
|
|
|
05-05-2022
|
OCR Synthetic Benchmark Dataset for Indic Languages
by
Naresh Saini
et al
|
|
|
|
05-03-2022
|
RAFT-MSF: Self-Supervised Monocular Scene Flow using
Recurrent Optimizer
by
Bayram Bayramli
et al
|
|
|
|
05-05-2022
|
Biologically inspired deep residual networks for
computer vision applications
by
Prathibha Varghese
et al
|
|
|
|
05-05-2022
|
Intra and Cross-spectrum Iris Presentation Attack
Detection in the NIR and Visible Domains Using
Attention-based and Pixel-wise Supervised Learning
by
Meiling Fang
et al
|
|
|
|
05-03-2022
|
Automatic Segmentation of Aircraft Dents in Point
Clouds
by
Pasquale Lafiosca
et al
|
|
|
|
05-03-2022
|
Cross-View Cross-Scene Multi-View Crowd Counting
by
Qi Zhang
et al
|
|
|
|
05-04-2022
|
RecipeSnap -- a lightweight image-to-recipe model
by
Jianfa Chen
et al
|
|
|
|
05-05-2022
|
Are GAN-based Morphs Threatening Face Recognition?
by
Eklavya Sarkar
et al
|
|
|
|
05-04-2022
|
EllSeg-Gen, towards Domain Generalization for
head-mounted eyetracking
by
Rakshit S. Kothari
et al
|
|
|
|
05-04-2022
|
Self-supervised learning unveils morphological clusters
behind lung cancer types and prognosis
by
Adalberto Claudio Quiros
et al
|
|
|
|
05-03-2022
|
3D Semantic Scene Perception using Distributed Smart
Edge Sensors
by
Simon Bultmann
et al
|
|
|
|
05-05-2022
|
YOLOPose: Transformer-based Multi-Object 6D Pose
Estimation using Keypoint Regression
by
Arash Amini
et al
|
|
|
|
05-05-2022
|
Declaration-based Prompt Tuning for Visual Question
Answering
by
Yuhang Liu
et al
|
|
|
|
05-03-2022
|
Point Cloud Semantic Segmentation using Multi Scale
Sparse Convolution Neural Network
by
Yunzheng Su
|
|
|
|
05-04-2022
|
Self-Taught Metric Learning without Labels
by
Sungyeon Kim
et al
|
|
|
|
05-03-2022
|
Frequency-Selective Geometry Upsampling of Point Clouds
by
Viktoria Heimann
et al
|
|
|
|
05-04-2022
|
Towards Real-time Traffic Sign and Traffic Light
Detection on Embedded Systems
by
Oshada Jayasinghe
et al
|
|
|
|
05-03-2022
|
Sampling-free obstacle gradients and reactive planning
in Neural Radiance Fields (NeRF)
by
Michael Pantic
et al
|
|
|
|
05-04-2022
|
Compressive Ptychography using Deep Image and
Generative Priors
by
Semih Barutcu
et al
|
|
|
|
05-04-2022
|
Hypercomplex Image-to-Image Translation
by
Eleonora Grassucci
et al
|
|
|
|
05-05-2022
|
What is Right for Me is Not Yet Right for You: A
Dataset for Grounding Relative Directions via
Multi-Task Learning
by
Jae Hee Lee
et al
|
|
|
|
05-05-2022
|
MMINR: Multi-frame-to-Multi-frame Inference with Noise
Resistance for Precipitation Nowcasting with Radar
by
Feng Sun
et al
|
|
|
|
05-03-2022
|
Cross-Domain Object Detection with Mean-Teacher
Transformer
by
Jinze Yu
et al
|
|
|
|
05-04-2022
|
Neuroevolutionary Multi-objective approaches to
Trajectory Prediction in Autonomous Vehicles
by
Fergal Stapleton
et al
|
|
|
|
05-04-2022
|
Evaluating Transferability for Covid 3D Localization
Using CT SARS-CoV-2 segmentation models
by
Constantine Maganaris
et al
|
|
|
|
05-05-2022
|
Exploiting Correspondences with All-pairs Correlations
for Multi-view Depth Estimation
by
Kai Cheng
et al
|
|
|
|
05-04-2022
|
Prediction of fish location by combining fisheries data
and sea bottom temperature forecasting
by
Matthieu Ospici
et al
|
|
|
|
05-04-2022
|
Homography-Based Loss Function for Camera Pose
Regression
by
Clémentin Boittiaux
et al
|
|
|
|
05-03-2022
|
Effect of Random Histogram Equalization on Breast
Calcification Analysis Using Deep Learning
by
Adarsh Bhandary Panambur
et al
|
|
|
|
05-03-2022
|
A Bidirectional Conversion Network for Cross-Spectral
Face Recognition
by
Zhicheng Cao
et al
|
|
|
|
05-05-2022
|
Hardware System Implementation for Human Detection
using HOG and SVM Algorithm
by
Van-Cam Nguyen
et al
|
|
|
|
05-05-2022
|
Parametric Reshaping of Portraits in Videos
by
Xiangjun Tang
et al
|
|
|
|
05-05-2022
|
Text to artistic image generation
by
Qinghe Tian
et al
|
|
|
|
05-05-2022
|
ImPosIng: Implicit Pose Encoding for Efficient Camera
Pose Estimation
by
Arthur Moreau
et al
|
|
|
|
05-04-2022
|
Surface Reconstruction from Point Clouds: A Survey and
a Benchmark
by
Zhangjin Huang
et al
|
|
|
|
05-06-2022
|
QLEVR: A Diagnostic Dataset for Quantificational
Language and Elementary Visual Reasoning
by
Zechen Li
et al
|
|
|
|
05-03-2022
|
MS Lesion Segmentation: Revisiting Weighting Mechanisms
for Federated Learning
by
Dongnan Liu
et al
|
|
|
|
05-04-2022
|
Zero-Episode Few-Shot Contrastive Predictive Coding:
Solving intelligence tests without prior training
by
T. Barak
et al
|
|
|
|
05-05-2022
|
Text Detection on Technical Drawings for the
Digitization of Brown-field Processes
by
Tobias Schlagenhauf
et al
|
|
|
|
05-03-2022
|
Masked Generative Distillation
by
Zhendong Yang
et al
|
|
|
|
05-03-2022
|
RU-Net: Regularized Unrolling Network for Scene Graph
Generation
by
Xin Lin
et al
|
|
|
|
05-03-2022
|
Multitask Network for Joint Object Detection, Semantic
Segmentation and Human Pose Estimation in Vehicle
Occupancy Monitoring
by
Nikolas Ebert
et al
|
|
|
|
05-05-2022
|
The Batch Artifact Scanning Protocol: A new method
using computed tomography (CT) to rapidly create
three-dimensional models of objects from large
collections en masse
by
Katrina Yezzi-Woodley
et al
|
|
|
|
05-05-2022
|
A Deep Reinforcement Learning Framework for Rapid
Diagnosis of Whole Slide Pathological Images
by
Tingting Zheng
et al
|
|
|
|
05-04-2022
|
Dual Branch Neural Network for Sea Fog Detection in
Geostationary Ocean Color Imager
by
Yuan Zhou
et al
|
|
|
|
05-05-2022
|
Gait Recognition in the Wild: A Benchmark
by
Zheng Zhu
et al
|
|
|
|
05-05-2022
|
Visually plausible human-object interaction capture
from wearable sensors
by
Vladimir Guzov
et al
|
|
|
|
05-04-2022
|
Unsupervised Domain Adaptation Learning for
Hierarchical Infant Pose Recognition with Synthetic
Data
by
Cheng-Yen Yang
et al
|
|
|
|
05-05-2022
|
Cross-view Transformers for real-time Map-view Semantic
Segmentation
by
Brady Zhou
et al
|
|
|
|
05-05-2022
|
Hybrid CNN Based Attention with Category Prior for User
Image Behavior Modeling
by
Xin Chen
et al
|
|
|
|
05-05-2022
|
Activity Detection in Long Surgical Videos using
Spatio-Temporal Models
by
Aidean Sharghi
et al
|
|
|
|
05-05-2022
|
Large Scale Transfer Learning for Differentially
Private Image Classification
by
Harsh Mehta
et al
|
|
|
|
05-04-2022
|
A Bayesian Detect to Track System for Robust Visual
Object Tracking and Semi-Supervised Model Learning
by
Yan Shen
et al
|
|
|
|
05-03-2022
|
FedMix: Mixed Supervised Federated Learning for Medical
Image Segmentation
by
Jeffry Wicaksana
et al
|
|
|
|
05-03-2022
|
End2End Multi-View Feature Matching using
Differentiable Pose Optimization
by
Barbara Roessle
et al
|
|
|
|
05-04-2022
|
Scene Clustering Based Pseudo-labeling Strategy for
Multi-modal Aerial View Object Classification
by
Jun Yu
et al
|
|
|
|
05-04-2022
|
Dynamic Sparse R-CNN
by
Qinghang Hong
et al
|
|
|
|
05-04-2022
|
Domino Saliency Metrics: Improving Existing Channel
Saliency Metrics with Structural Information
by
Kaveena Persand
et al
|
|
|
|
05-03-2022
|
Joint Image Compression and Denoising via Latent-Space
Scalability
by
Saeed Ranjbar Alvar
et al
|
|
|
|
05-05-2022
|
Koopman pose predictions for temporally consistent
human walking estimations
by
Marc Mitjans
et al
|
|
|
|
05-03-2022
|
Assessing Dataset Bias in Computer Vision
by
Athiya Deviyani
|
|
|
|
05-03-2022
|
UCL-Dehaze: Towards Real-world Image Dehazing via
Unsupervised Contrastive Learning
by
Yongzhen Wang
et al
|
|
|
|
05-05-2022
|
BasicTAD: an Astounding RGB-Only Baseline for Temporal
Action Detection
by
Min Yang
et al
|
|
|
|