2022.5.9 Vision papers

05-04-2022	CoCa: Contrastive Captioners are Image-Text Foundation Models by Jiahui Yu et al
05-04-2022	Sequencer: Deep LSTM for Image Classification by Yuki Tatsunami et al
05-05-2022	GANimator: Neural Motion Synthesis from a Single Sequence by Peizhuo Li et al
05-03-2022	Better plain ViT baselines for ImageNet-1k by Lucas Beyer et al
05-03-2022	Subspace Diffusion Generative Models by Bowen Jing et al
05-03-2022	Deep Learning in Multimodal Remote Sensing Data Fusion: A Comprehensive Review by Jiaxin Li et al
05-05-2022	Language Models Can See: Plugging Visual Controls in Text Generation by Yixuan Su et al
05-03-2022	Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) by Alex Fang et al
05-05-2022	Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction by Yining Hong et al
05-04-2022	COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles by Jiaxun Cui et al
05-03-2022	DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks by Shih-Yang Su et al
05-03-2022	i-Code: An Integrative and Composable Multimodal Learning Framework by Ziyi Yang et al
05-05-2022	BlobGAN: Spatially Disentangled Scene Representations by Dave Epstein et al
05-05-2022	Dual Octree Graph Networks for Learning Adaptive Volumetric Shape Representations by Peng-Shuai Wang et al
05-05-2022	Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects by Bangbang Yang et al
05-04-2022	P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision by He Zhao et al
05-05-2022	Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics by Sizhe Li et al
05-03-2022	Toward Modeling Creative Processes for Algorithmic Painting by Aaron Hertzmann
05-04-2022	All You May Need for VQA are Image Captions by Soravit Changpinyo et al
05-03-2022	Visual Commonsense in Pretrained Unimodal and Multimodal Models by Chenyu Zhang et al
05-05-2022	Holistic Approach to Measure Sample-level Adversarial Vulnerability and its Utility in Building Trustworthy Systems by Gaurav Kumar Nayak et al
05-05-2022	Real-time Controllable Motion Transition for Characters by Xiangjun Tang et al
05-05-2022	Neural Jacobian Fields: Learning Intrinsic Mappings of Arbitrary Meshes by Noam Aigerman et al
05-03-2022	End-to-End Visual Editing with a Generatively Pre-Trained Artist by Andrew Brown et al
05-03-2022	Predicting Loose-Fitting Garment Deformations Using Bone-Driven Motion Networks by Xiaoyu Pan et al
05-04-2022	Compound virtual screening by learning-to-rank with gradient boosting decision tree and enrichment-based cumulative gain by Kairi Furui et al
05-03-2022	GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping by Pan Ji et al
05-04-2022	Video Extrapolation in Space and Time by Yunzhi Zhang et al
05-03-2022	Cross-modal Representation Learning for Zero-shot Action Recognition by Chung-Ching Lin et al
05-04-2022	Pik-Fix: Restoring and Colorizing Old Photo by Runsheng Xu et al
05-03-2022	Multimodal Detection of Unknown Objects on Roads for Autonomous Driving by Daniel Bogdoll et al
05-03-2022	BioTouchPass: Handwritten Passwords for Touchscreen Biometrics by Ruben Tolosana et al
05-06-2022	CLIP-CLOP: CLIP-Guided Collage and Photomontage by Piotr Mirowski et al
05-03-2022	HL-Net: Heterophily Learning Network for Scene Graph Generation by Xin Lin et al
05-05-2022	DropTrack -- automatic droplet tracking using deep learning for microfluidic applications by Mihir Durve et al
05-04-2022	MM-Claims: A Dataset for Multimodal Claim Detection in Social Media by Gullal S. Cheema et al
05-03-2022	A Comprehensive Survey of Image Augmentation Techniques for Deep Learning by Mingle Xu et al
05-03-2022	Distilling Governing Laws and Source Input for Dynamical Systems from Videos by Lele Luan et al
05-04-2022	Spot-adaptive Knowledge Distillation by Jie Song et al
05-04-2022	SVTS: Scalable Video-to-Speech Synthesis by Rodrigo Mira et al
05-03-2022	Multi-view Geometry: Correspondences Refinement Based on Algebraic Properties by Trung-Kien Le et al
05-03-2022	An Empirical Analysis of the Use of Real-Time Reachability for the Safety Assurance of Autonomous Vehicles by Patrick Musau et al
05-03-2022	A hybrid multi-object segmentation framework with model-based B-splines for microbial single cell analysis by Karina Ruzaeva et al
05-03-2022	Simpler is Better: off-the-shelf Continual Learning Through Pretrained Backbones by Francesco Pelosin
05-04-2022	Generalized Knowledge Distillation via Relationship Matching by Han-Jia Ye et al
05-03-2022	Diverse Image Captioning with Grounded Style by Franz Klein et al
05-03-2022	Outdoor Monocular Depth Estimation: A Research Review by Pulkit Vyas et al
05-03-2022	BiOcularGAN: Bimodal Synthesis and Annotation of Ocular Images by Darian Tomašević et al
05-05-2022	View-labels Are Indispensable: A Multifacet Complementarity Study of Multi-view Clustering by Chuanxing Geng et al
05-05-2022	One Picture is Worth a Thousand Words: A New Wallet Recovery Process by Hervé Chabanne et al
05-03-2022	Episodic Memory Question Answering by Samyak Datta et al
05-05-2022	Do Different Deep Metric Learning Losses Lead to Similar Learned Features? by Konstantin Kobs et al
05-03-2022	Compact Neural Networks via Stacking Designed Basic Units by Weichao Lan et al
05-04-2022	Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification by Haowei Zhu et al
05-05-2022	Neural 3D Scene Reconstruction with the Manhattan-world Assumption by Haoyu Guo et al
05-03-2022	Copy Motion From One to Another: Fake Motion Video Generation by Zhenguang Liu et al
05-05-2022	OCR Synthetic Benchmark Dataset for Indic Languages by Naresh Saini et al
05-03-2022	RAFT-MSF: Self-Supervised Monocular Scene Flow using Recurrent Optimizer by Bayram Bayramli et al
05-05-2022	Biologically inspired deep residual networks for computer vision applications by Prathibha Varghese et al
05-05-2022	Intra and Cross-spectrum Iris Presentation Attack Detection in the NIR and Visible Domains Using Attention-based and Pixel-wise Supervised Learning by Meiling Fang et al
05-03-2022	Automatic Segmentation of Aircraft Dents in Point Clouds by Pasquale Lafiosca et al
05-03-2022	Cross-View Cross-Scene Multi-View Crowd Counting by Qi Zhang et al
05-04-2022	RecipeSnap -- a lightweight image-to-recipe model by Jianfa Chen et al
05-05-2022	Are GAN-based Morphs Threatening Face Recognition? by Eklavya Sarkar et al
05-04-2022	EllSeg-Gen, towards Domain Generalization for head-mounted eyetracking by Rakshit S. Kothari et al
05-04-2022	Self-supervised learning unveils morphological clusters behind lung cancer types and prognosis by Adalberto Claudio Quiros et al
05-03-2022	3D Semantic Scene Perception using Distributed Smart Edge Sensors by Simon Bultmann et al
05-05-2022	YOLOPose: Transformer-based Multi-Object 6D Pose Estimation using Keypoint Regression by Arash Amini et al
05-05-2022	Declaration-based Prompt Tuning for Visual Question Answering by Yuhang Liu et al
05-03-2022	Point Cloud Semantic Segmentation using Multi Scale Sparse Convolution Neural Network by Yunzheng Su
05-04-2022	Self-Taught Metric Learning without Labels by Sungyeon Kim et al
05-03-2022	Frequency-Selective Geometry Upsampling of Point Clouds by Viktoria Heimann et al
05-04-2022	Towards Real-time Traffic Sign and Traffic Light Detection on Embedded Systems by Oshada Jayasinghe et al
05-03-2022	Sampling-free obstacle gradients and reactive planning in Neural Radiance Fields (NeRF) by Michael Pantic et al
05-04-2022	Compressive Ptychography using Deep Image and Generative Priors by Semih Barutcu et al
05-04-2022	Hypercomplex Image-to-Image Translation by Eleonora Grassucci et al
05-05-2022	What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning by Jae Hee Lee et al
05-05-2022	MMINR: Multi-frame-to-Multi-frame Inference with Noise Resistance for Precipitation Nowcasting with Radar by Feng Sun et al
05-03-2022	Cross-Domain Object Detection with Mean-Teacher Transformer by Jinze Yu et al
05-04-2022	Neuroevolutionary Multi-objective approaches to Trajectory Prediction in Autonomous Vehicles by Fergal Stapleton et al
05-04-2022	Evaluating Transferability for Covid 3D Localization Using CT SARS-CoV-2 segmentation models by Constantine Maganaris et al
05-05-2022	Exploiting Correspondences with All-pairs Correlations for Multi-view Depth Estimation by Kai Cheng et al
05-04-2022	Prediction of fish location by combining fisheries data and sea bottom temperature forecasting by Matthieu Ospici et al
05-04-2022	Homography-Based Loss Function for Camera Pose Regression by Clémentin Boittiaux et al
05-03-2022	Effect of Random Histogram Equalization on Breast Calcification Analysis Using Deep Learning by Adarsh Bhandary Panambur et al
05-03-2022	A Bidirectional Conversion Network for Cross-Spectral Face Recognition by Zhicheng Cao et al
05-05-2022	Hardware System Implementation for Human Detection using HOG and SVM Algorithm by Van-Cam Nguyen et al
05-05-2022	Parametric Reshaping of Portraits in Videos by Xiangjun Tang et al
05-05-2022	Text to artistic image generation by Qinghe Tian et al
05-05-2022	ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation by Arthur Moreau et al
05-04-2022	Surface Reconstruction from Point Clouds: A Survey and a Benchmark by Zhangjin Huang et al
05-06-2022	QLEVR: A Diagnostic Dataset for Quantificational Language and Elementary Visual Reasoning by Zechen Li et al
05-03-2022	MS Lesion Segmentation: Revisiting Weighting Mechanisms for Federated Learning by Dongnan Liu et al
05-04-2022	Zero-Episode Few-Shot Contrastive Predictive Coding: Solving intelligence tests without prior training by T. Barak et al
05-05-2022	Text Detection on Technical Drawings for the Digitization of Brown-field Processes by Tobias Schlagenhauf et al
05-03-2022	Masked Generative Distillation by Zhendong Yang et al
05-03-2022	RU-Net: Regularized Unrolling Network for Scene Graph Generation by Xin Lin et al
05-03-2022	Multitask Network for Joint Object Detection, Semantic Segmentation and Human Pose Estimation in Vehicle Occupancy Monitoring by Nikolas Ebert et al
05-05-2022	The Batch Artifact Scanning Protocol: A new method using computed tomography (CT) to rapidly create three-dimensional models of objects from large collections en masse by Katrina Yezzi-Woodley et al
05-05-2022	A Deep Reinforcement Learning Framework for Rapid Diagnosis of Whole Slide Pathological Images by Tingting Zheng et al
05-04-2022	Dual Branch Neural Network for Sea Fog Detection in Geostationary Ocean Color Imager by Yuan Zhou et al
05-05-2022	Gait Recognition in the Wild: A Benchmark by Zheng Zhu et al
05-05-2022	Visually plausible human-object interaction capture from wearable sensors by Vladimir Guzov et al
05-04-2022	Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose Recognition with Synthetic Data by Cheng-Yen Yang et al
05-05-2022	Cross-view Transformers for real-time Map-view Semantic Segmentation by Brady Zhou et al
05-05-2022	Hybrid CNN Based Attention with Category Prior for User Image Behavior Modeling by Xin Chen et al
05-05-2022	Activity Detection in Long Surgical Videos using Spatio-Temporal Models by Aidean Sharghi et al
05-05-2022	Large Scale Transfer Learning for Differentially Private Image Classification by Harsh Mehta et al
05-04-2022	A Bayesian Detect to Track System for Robust Visual Object Tracking and Semi-Supervised Model Learning by Yan Shen et al
05-03-2022	FedMix: Mixed Supervised Federated Learning for Medical Image Segmentation by Jeffry Wicaksana et al
05-03-2022	End2End Multi-View Feature Matching using Differentiable Pose Optimization by Barbara Roessle et al
05-04-2022	Scene Clustering Based Pseudo-labeling Strategy for Multi-modal Aerial View Object Classification by Jun Yu et al
05-04-2022	Dynamic Sparse R-CNN by Qinghang Hong et al
05-04-2022	Domino Saliency Metrics: Improving Existing Channel Saliency Metrics with Structural Information by Kaveena Persand et al
05-03-2022	Joint Image Compression and Denoising via Latent-Space Scalability by Saeed Ranjbar Alvar et al
05-05-2022	Koopman pose predictions for temporally consistent human walking estimations by Marc Mitjans et al
05-03-2022	Assessing Dataset Bias in Computer Vision by Athiya Deviyani
05-03-2022	UCL-Dehaze: Towards Real-world Image Dehazing via Unsupervised Contrastive Learning by Yongzhen Wang et al
05-05-2022	BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection by Min Yang et al
05-04-2022	ANUBIS: Review and Benchmark Skeleton-Based Action Recognition Methods with a New Dataset by Zhenyue Qin et al
05-03-2022	Os Dados dos Brasileiros sob Risco na Era da Inteligência Artificial? by Raoni F. da S. Teixeira et al
05-04-2022	Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition by Alejandro López-Cifuentes et al
05-03-2022	Detection of Propaganda Techniques in Visuo-Lingual Metaphor in Memes by Sunil Gundapu et al
05-03-2022	Data-Consistent Non-Cartesian Deep Subspace Learning for Efficient Dynamic MR Image Reconstruction by Zihao Chen et al
05-03-2022	Deep Multi-Scale U-Net Architecture and Noise-Robust Training Strategies for Histopathological Image Segmentation by Nikhil Cherian Kurian et al
05-04-2022	ShoeRinsics: Shoeprint Prediction for Forensics with Intrinsic Decomposition by Samia Shafique et al
05-04-2022	Mobile-URSONet: an Embeddable Neural Network for Onboard Spacecraft Pose Estimation by Julien Posso et al
05-05-2022	Segmentation with Super Images: A New 2D Perspective on 3D Medical Image Analysis by Ikboljon Sobirov et al
05-03-2022	Object Class Aware Video Anomaly Detection through Image Translation by Mohammad Baradaran et al
05-04-2022	Self-Supervised Learning for Invariant Representations from Multi-Spectral and SAR Images by Pallavi Jain et al
05-06-2022	Forget Less, Count Better: A Domain-Incremental Self-Distillation Learning Benchmark for Lifelong Crowd Counting by Jiaqi Gao et al
05-04-2022	BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking by Dorian Henning et al
05-03-2022	Splicing Detection and Localization In Satellite Imagery Using Conditional GANs by Emily R. Bartusiak et al
05-04-2022	An Analysis of Generative Methods for Multiple Image Inpainting by Coloma Ballester et al
05-06-2022	Incremental Data-Uploading for Full-Quantum Classification by Maniraman Periyasamy et al
05-04-2022	TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition by Haodong Duan et al
05-03-2022	Synthesized Speech Detection Using Convolutional Transformer-Based Spectrogram Analysis by Emily R. Bartusiak et al
05-05-2022	Generate and Edit Your Own Character in a Canonical View by Jeong-gi Kwak et al
05-03-2022	SpineNetV2: Automated Detection, Labelling and Radiological Grading Of Clinical MR Scans by Rhydian Windsor et al
05-03-2022	Smart City Intersections: Intelligence Nodes for Future Metropolises by Zoran Kostić et al
05-05-2022	Building Brains: Subvolume Recombination for Data Augmentation in Large Vessel Occlusion Detection by Florian Thamm et al
05-03-2022	Comparison of CoModGANs, LaMa and GLIDE for Art Inpainting- Completing M.C Eschers Print Gallery by Lucia Cipolina-Kun et al
05-05-2022	Approximate Convex Decomposition for 3D Meshes with Collision-Aware Concavity and Tree Search by Xinyue Wei et al
05-03-2022	Frequency Domain-Based Detection of Generated Audio by Emily R. Bartusiak et al
05-03-2022	Application of belief functions to medical image segmentation: A review by Ling Huang et al
05-03-2022	In Defense of Image Pre-Training for Spatiotemporal Recognition by Xianhang Li et al
05-06-2022	Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism by Binbin Yang et al
05-04-2022	Self-Supervised Super-Resolution for Multi-Exposure Push-Frame Satellites by Ngoc Long Nguyen et al
05-04-2022	UnrealNAS: Can We Search Neural Architectures with Unreal Data? by Zhen Dong et al
05-03-2022	License Plate Privacy in Collaborative Visual Analysis of Traffic Scenes by Saeed Ranjbar Alvar et al
05-04-2022	DeepPortraitDrawing: Generating Human Body Images from Freehand Sketches by Xian Wu et al
05-04-2022	SDF-based RGB-D Camera Tracking in Neural Scene Representations by Leonard Bruns et al
05-04-2022	Generative Adversarial Network Based Synthetic Learning and a Novel Domain Relevant Loss Term for Spine Radiographs by Ethan Schonfeld et al
05-04-2022	Understanding Transfer Learning for Chest Radiograph Clinical Report Generation with Modified Transformer Architectures by Edward Vendrow et al
05-05-2022	Learn-to-Race Challenge 2022: Benchmarking Safe Learning and Cross-domain Generalisation in Autonomous Racing by Jonathan Francis et al
05-05-2022	Invariant Content Synergistic Learning for Domain Generalization of Medical Image Segmentation by Yuxin Kang et al
05-05-2022	AdaTriplet: Adaptive Gradient Triplet Loss with Automatic Margin Learning for Forensic Medical Image Matching by Khanh Nguyen et al
05-05-2022	Multi-view Point Cloud Registration based on Evolutionary Multitasking with Bi-Channel Knowledge Sharing Mechanism by Yue Wu et al
05-05-2022	Evaluating Context for Deep Object Detectors by Osman Semih Kayhan et al
05-05-2022	Atlas-powered deep learning (ADL) -- application to diffusion weighted MRI by Davood Karimi et al
05-06-2022	Quantification of Robotic Surgeries with Vision-Based Deep Learning by Dani Kiyasseh et al
05-06-2022	Controlled Dropout for Uncertainty Estimation by Mehedi Hasan et al
05-06-2022	Dual-Level Decoupled Transformer for Video Captioning by Yiqi Gao et al
05-06-2022	Crop Type Identification for Smallholding Farms: Analyzing Spatial, Temporal and Spectral Resolutions in Satellite Imagery by Depanshu Sani et al
05-05-2022	Revisiting Pretraining for Semi-Supervised Learning in the Low-Label Regime by Xun Xu et al
05-06-2022	Weakly Supervised 3D Point Cloud Segmentation via Multi-Prototype Learning by Yongyi Su et al
05-06-2022	BDIS: Bayesian Dense Inverse Searching Method for Real-Time Stereo Surgical Image Matching by Jingwei Song et al
05-05-2022	CNN-Augmented Visual-Inertial SLAM with Planar Constraints by Pan Ji et al
05-06-2022	A Fingerprint Detection Method by Fingerprint Ridge Orientation Check by Kim JuSong et al
05-06-2022	Investigating and Explaining the Frequency Bias in Image Classification by ZhiYu Lin et al
05-06-2022	Predicting Future Occupancy Grids in Dynamic Environment with Spatio-Temporal Learning by Khushdeep Singh Mann et al
05-05-2022	Multi-mode Tensor Train Factorization with Spatial-spectral Regularization for Remote Sensing Images Recovery by Gaohang Yu et al
05-06-2022	Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection by Ziteng Cui et al
05-06-2022	Prompt Distribution Learning by Yuning Lu et al
05-06-2022	Semantics-Guided Moving Object Segmentation with 3D LiDAR by Shuo Gu et al
05-04-2022	InvNorm: Domain Generalization for Object Detection in Gastrointestinal Endoscopy by Weichen Fan et al
05-05-2022	Generating Representative Samples for Few-Shot Classification by Jingyi Xu et al
05-06-2022	All Grains, One Scheme (AGOS): Learning Multi-grain Instance Representation for Aerial Scene Classification by Qi Bi et al
05-05-2022	Scene Graph Expansion for Semantics-Guided Image Outpainting by Chiao-An Yang et al
05-03-2022	Improved Orientation Estimation and Detection with Hybrid Object Detection Networks for Automotive Radar by Michael Ulrich et al
05-04-2022	GAN Inversion for Data Augmentation to Improve Colonoscopy Lesion Classification by Mayank Golhar et al
05-04-2022	Immiscible Color Flows in Optimal Transport Networks for Image Classification by Alessandro Lonardi et al
05-03-2022	Biometric Signature Verification Using Recurrent Neural Networks by Ruben Tolosana et al
05-03-2022	Understanding Urban Water Consumption using Remotely Sensed Data by Shaswat Mohanty et al
05-05-2022	FisheyeDistill: Self-Supervised Monocular Depth Estimation with Ordinal Distillation for Fisheye Cameras by Qingan Yan et al
05-06-2022	MINI: Mining Implicit Novel Instances for Few-Shot Object Detection by Yuhang Cao et al
05-06-2022	From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data by Zhenghang Yuan et al
05-06-2022	A High-Accuracy Unsupervised Person Re-identification Method Using Auxiliary Information Mined from Datasets by Hehan Teng et al

Craig Smith, May 9, 2022