Publications
* indicates equal contribution among authors
† indicates co-corresponding authors
2026
- [CVPR] ORCA: Exploring Conditions for Diffusion models in Robotic Control. Accepted to The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
- [CVPR] MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model. Accepted to The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
- [CVPRW] SPARK: Simple Post-training for Adapting pRetrained Knowledge to Robot Control. Accepted to CVPR 2026 Workshop on Scalable Robot Learning Systems, 2026
- [ICLR]
- [ICLR] Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation. Accepted to The Fourteenth International Conference on Learning Representations (ICLR), 2026
- [ICLRW] VisualScratchpad: Inference-time Visual Concepts Analysis in Vision Language Models. In ICLR 2026 Workshop on Principled Design for Trustworthy AI - Interpretability, Robustness, and Safety across Modalities, 2026
- [AAAIW] ActiveNeuS: Active 3D Reconstruction using Neural Implicit Surface Uncertainty. In AAAI 2026 Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD), 2026
2025
- [NeurIPS] Token Bottleneck: One Token to Remember Dynamics. In The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
- [CVPR] Masking meets Supervision: A Strong Learning Alliance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2025
- [ICLR] Morphing Tokens Draw Strong Masked Image Models. In The Thirteenth International Conference on Learning Representations (ICLR), 2025
2024
- [ECCV] Leveraging Temporal Contextualization for Video Action Recognition. In European Conference on Computer Vision (ECCV), 2024
- [ECCV] Hyperbolic Entailment Filtering for Underspecified Images and Texts. In European Conference on Computer Vision (ECCV), 2024
- [ECCV] Learning with Unmasked Tokens Drives Stronger Vision Learners. In European Conference on Computer Vision (ECCV), 2024
2023
- [ICML] Robust Camera Pose Refinement for Multi-Resolution Hash Encoding. In Proceedings of the 40th International Conference on Machine Learning, 23–29 Jul 2023
- [CVPRW] Neural Transformation Network To Generate Diverse Views for Contrastive Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2023
- [ICLR] What Do Self-Supervised Vision Transformers Learn? In The Eleventh International Conference on Learning Representations (ICLR), 2023
2021
- [ICCV] Just a Few Points are All You Need for Multi-View Stereo: A Novel Semi-Supervised Learning Method for Multi-View Stereo. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2021
- [CVPR] Meta Batch-Instance Normalization for Generalizable Person Re-Identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2021
2020
- [ECCV] Attract, Perturb, and Explore: Learning a Feature Alignment Network for Semi-supervised Domain Adaptation. In European Conference on Computer Vision (ECCV), 2020
- [CVPR] Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2020
- [WACV] RPM-Net: Robust Pixel-Level Matching Networks for Self-Supervised Video Object Segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2020
2019
- [BMVC] Pseudo-labeling curriculum for unsupervised domain adaptation. In 30th British Machine Vision Conference (BMVC), 2019
- [ICCV] Self-Training and Adversarial Background Regularization for Unsupervised Domain Adaptive One-Stage Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019
- [ICCV] Self-Ensembling With GAN-Based Data Augmentation for Domain Adaptation in Semantic Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019
- [CVPR] Diversify and Match: A Domain Adaptive Representation Learning Paradigm for Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
- [WACV] CNN-Based Semantic Segmentation Using Level Set Loss. In IEEE Winter Conference on Applications of Computer Vision (WACV), 2019