Renrui Zhang

All published works
Action Title Year Authors
+ PDF Chat IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models 2025 Jiayi Lei
Renrui Zhang
Xiangfei Hu
Weifeng Lin
Zhen Li
Wenjian Sun
Ruoyi Du
Le Zhuo
Zhongyu Li
Xinyue Li
+ PDF Chat Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step 2025 Z. J. Guo
Renrui Zhang
Chengzhuo Tong
Zhao Zhizheng
Peng Gao
Hongsheng Li
Pheng‐Ann Heng
+ PDF Chat Training-free Regional Prompting for Diffusion Transformers 2024 Anthony Chen
Jianjin Xu
Wenzhao Zheng
Gaole Dai
Yida Wang
Renrui Zhang
H Wang
Shanghang Zhang
+ PDF Chat PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions 2024 Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
Siyuan Huang
Junlin Xie
Yuhui Qiao
Peng Gao
Hongsheng Li
+ PDF Chat SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners 2024 Z. J. Guo
Renrui Zhang
Xiangyang Zhu
Chengzhuo Tong
Peng Gao
Chunyuan Li
Pheng‐Ann Heng
+ PDF Chat LLaVA-OneVision: Easy Visual Task Transfer 2024 Bo Li
Yuanhan Zhang
Dong Guo
Renrui Zhang
Feng Li
Hao Zhang
Kaichen Zhang
Yanwei Li
Ziwei Liu
Chunyuan Li
+ PDF Chat LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models 2024 Feng Li
Renrui Zhang
Hao Zhang
Yuanhan Zhang
Bo Li
Wei Li
Zejun Ma
Chunyuan Li
+ PDF Chat RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation 2024 Jiaming Liu
Mengzhen Liu
Zhenyu Wang
Lily Lee
Kaichen Zhou
Pengju An
Senqiao Yang
Renrui Zhang
Yandong Guo
Shanghang Zhang
+ PDF Chat Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation 2024 Jiaming Liu
Chenxuan Li
Guanqun Wang
Lily Lee
Kaichen Zhou
Sixiang Chen
Chuyan Xiong
Jiaxin Ge
Renrui Zhang
Shanghang Zhang
+ PDF Chat RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision 2024 Mingjie Pan
Jiaming Liu
Renrui Zhang
Peixiang Huang
Xiaoqi Li
Hongwei Xie
Bing Wang
Li Liu
Shanghang Zhang
+ PDF Chat MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI 2024 Kaining Ying
Fanqing Meng
Jin Wang
Zhiqian Li
Lin Han
Yue Yang
Hao Zhang
Wenbo Zhang
Yuqi Lin
Shuo Liu
+ PDF Chat Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation 2024 Shilin Yan
Renrui Zhang
Z. J. Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Hao Dong
Zhongjiang He
Peng Gao
+ PDF Chat OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning 2024 Lingyi Hong
Shilin Yan
Renrui Zhang
Wanyun Li
Xinyu Zhou
Pinxue Guo
Kaixun Jiang
Yi-Ting Chen
Jinglun Li
Zhaoyu Chen
+ Dynamic Embedding Size Search with Minimum Regret for Streaming Recommender System 2023 Bowei He
Xu He
Renrui Zhang
Yingxue Zhang
Ruiming Tang
Chen Ma
+ PDF Chat Revisiting Event-Based Video Frame Interpolation 2023 Jiaben Chen
Yichen Zhu
Dongze Lian
Jiaqi Yang
Yifu Wang
Renrui Zhang
Xinhang Liu
Shenhan Qian
Laurent Kneip
Shenghua Gao
+ PDF Chat Decorate the Newcomers: Visual Domain Prompt for Continual Test Time Adaptation 2023 Yulu Gan
Yan Bai
Yihang Lou
Xianzheng Ma
Renrui Zhang
Nian Shi
Lin Luo
+ PDF Chat iQuery: Instruments as Queries for Audio-Visual Sound Separation 2023 Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
+ PDF Chat Learning 3D Representations from 2D Pre-Trained Models via Image-to-Point Masked Autoencoders 2023 Renrui Zhang
Liuhui Wang
Yu Qiao
Peng Gao
Hongsheng Li
+ PDF Chat Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis 2023 Renrui Zhang
Liuhui Wang
Ziyu Guo
Jianbo Shi
+ Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking 2023 Peng Gao
Renrui Zhang
Rongyao Fang
Ziyi Lin
Hongyang Li
Hongsheng Li
Yu Qiao
+ Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis 2023 Renrui Zhang
Liuhui Wang
Yali Wang
Peng Gao
Hongsheng Li
Jianbo Shi
+ ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance 2023 Z. J. Guo
Yiwen Tang
Renrui Zhang
Dong Wang
Zhigang Wang
Bin Zhao
Xuelong Li
+ LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model 2023 Peng Gao
Jiaming Han
Renrui Zhang
Ziyi Lin
Shijie Geng
Aojun Zhou
Wei Zhang
Pan Lu
Conghui He
Xiangyu Yue
+ Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation 2023 Shilin Yan
Renrui Zhang
Z. J. Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Zhongjiang He
Peng Gao
+ Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following 2023 Z. J. Guo
Renrui Zhang
Xiangyang Zhu
Yiwen Tang
Xianzheng Ma
Jiaming Han
Kexin Chen
Peng Gao
Xianzhi Li
Hongsheng Li
+ RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision 2023 Mingjie Pan
Jiaming Liu
Renrui Zhang
Peixiang Huang
Xiaoqi Li
Li Liu
Shanghang Zhang
+ MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning 2023 Ke Wang
Houxing Ren
Aojun Zhou
Zimu Lu
Sichun Luo
Weikang Shi
Renrui Zhang
Linqi Song
Mingjie Zhan
Hongsheng Li
+ 3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V 2023 Dingning Liu
Xiaomeng Dong
Renrui Zhang
Xu Luo
Peng Gao
Xiaoshui Huang
Yongshun Gong
Zhihui Wang
+ PDF Chat Can Language Understand Depth? 2022 Renrui Zhang
Ziyao Zeng
Ziyu Guo
Yafeng Li
+ PDF Chat PointCLIP: Point Cloud Understanding by CLIP 2022 Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Bin Cui
Yu Qiao
Peng Gao
Hongsheng Li
+ Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training 2022 Renrui Zhang
Z. J. Guo
Peng Gao
Rongyao Fang
Bin Zhao
Dong Wang
Yu Qiao
Hongsheng Li
+ Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification 2022 Renrui Zhang
Wei Zhang
Rongyao Fang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
+ iQuery: Instruments as Queries for Audio-Visual Sound Separation 2022 Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
+ Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders 2022 Renrui Zhang
Liuhui Wang
Yu Qiao
Peng Gao
Hongsheng Li
+ TiG-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning 2022 Peixiang Huang
Li Liu
Renrui Zhang
Song Zhang
Xinli Xu
Baichao Wang
Guoyi Liu
+ PDF Chat PointCLIP: Point Cloud Understanding by CLIP 2021 Renrui Zhang
Z. J. Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Bin Cui
Yu Qiao
Peng Gao
Hongsheng Li
+ PDF Chat Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling 2021 Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
+ CLIP-Adapter: Better Vision-Language Models with Feature Adapters 2021 Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
+ VT-CLIP: Enhancing Vision-Language Models with Visual-guided Texts 2021 Renrui Zhang
Longtian Qiu
Wei Zhang
Ziyao Zeng
Commonly Cited References
Action Title Year Authors # of times referenced
+ PDF Chat Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
7
+ PDF Chat End-to-End Object Detection with Transformers 2020 Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
6
+ PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space 2017 Charles R. Qi
Yi Li
Hao Su
Leonidas Guibas
4
+ PDF Chat Robust fine-tuning of zero-shot models 2022 Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
Rebecca Roelofs
Raphael Gontijo Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
4
+ Learning Transferable Visual Models From Natural Language Supervision 2021 Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
4
+ PDF Chat PointCLIP: Point Cloud Understanding by CLIP 2022 Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Bin Cui
Yu Qiao
Peng Gao
Hongsheng Li
4
+ Dual-stream Network for Visual Recognition 2021 Mingyuan Mao
Renrui Zhang
Honghui Zheng
Peng Gao
Teli Ma
Yan Peng
Errui Ding
Shumin Han
4
+ PDF Chat Learning to Prompt for Vision-Language Models 2022 Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
4
+ PDF Chat Swin Transformer: Hierarchical Vision Transformer using Shifted Windows 2021 Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
Baining Guo
4
+ PDF Chat Describing Textures in the Wild 2014 Mircea Cimpoi
Subhransu Maji
Iasonas Kokkinos
Sammy Mohamed
Andrea Vedaldi
3
+ Attention Is All You Need 2017 Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
3
+ PDF Chat EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification 2019 Patrick Helber
Benjamin Bischke
Andreas Dengel
Damian Borth
3
+ PDF Chat Training data-efficient image transformers & distillation through attention 2021 Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
3
+ Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline 2021 Ankit Goyal
Hei Law
Bowei Liu
Alejandro Newell
Jia Deng
3
+ PDF Chat Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering 2018 Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
3
+ PDF Chat Momentum Contrast for Unsupervised Visual Representation Learning 2020 Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross Girshick
3
+ PDF Chat 3D ShapeNets: A deep representation for volumetric shapes 2015 Zhirong Wu
Shuran Song
Aditya Khosla
Fisher Yu
Linguang Zhang
Xiaoou Tang
Jianxiong Xiao
3
+ PDF Chat PCT: Point cloud transformer 2021 Meng-Hao Guo
Jun-Xiong Cai
Zheng-Ning Liu
Tai‐Jiang Mu
Ralph R. Martin
Shi‐Min Hu
3
+ LXMERT: Learning Cross-Modality Encoder Representations from Transformers 2019 Hao Tan
Mohit Bansal
3
+ PDF Chat Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data 2019 Mikaela Angelina Uy
Quang-Hieu Pham
Binh‐Son Hua
Thanh Thi Nguyen
Sai-Kit Yeung
3
+ Parameter-Efficient Transfer Learning for NLP 2019 Neil Houlsby
Andrei Giurgiu
Stanisław Jastrzębski
Bruna Morrone
Quentin de Laroussilhe
Andréa Gesmundo
Mona Attariyan
Sylvain Gelly
3
+ PDF Chat Dynamic Graph CNN for Learning on Point Clouds 2019 Yue Wang
Yongbin Sun
Ziwei Liu
Sanjay E. Sarma
Michael M. Bronstein
Justin Solomon
3
+ PDF Chat MAttNet: Modular Attention Network for Referring Expression Comprehension 2018 Licheng Yu
Zhe Lin
Xiaohui Shen
Shuicheng Yan
Xin Lu
Mohit Bansal
Tamara L. Berg
3
+ PDF Chat Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing 2022 Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
3
+ PDF Chat Focal Loss for Dense Object Detection 2017 Tsung-Yi Lin
Priya Goyal
Ross Girshick
Kaiming He
Piotr Dollár
3
+ PDF Chat Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers 2021 Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
Yabiao Wang
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip H. S. Torr
3
+ PDF Chat Fast Convergence of DETR with Spatially Modulated Co-Attention 2021 Peng Gao
Minghang Zheng
Xiaogang Wang
Jifeng Dai
Hongsheng Li
3
+ PDF Chat Walk in the Cloud: Learning Curves for Point Clouds Shape Analysis 2021 Tiange Xiang
Chaoyi Zhang
Yang Song
Jianhui Yu
Weidong Cai
3
+ PDF Chat PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation 2017 Charles R. Qi
Hao Su
Kaichun Mo
Leonidas Guibas
2
+ Making Pre-trained Language Models Better Few-shot Learners 2021 Tianyu Gao
Adam Fisch
Danqi Chen
2
+ PDF Chat Multi-view 3D Object Detection Network for Autonomous Driving 2017 Xiaozhi Chen
Huimin Ma
Ji Wan
Bo Li
Tian Xia
2
+ PDF Chat PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds 2021 Mutian Xu
Runyu Ding
Hengshuang Zhao
Xiaojuan Qi
2
+ PDF Chat VirTex: Learning Visual Representations from Textual Annotations 2021 Karan Desai
Justin Johnson
2
+ PDF Chat Localizing Visual Sounds the Hard Way 2021 Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
2
+ Fine-Grained Visual Classification of Aircraft 2013 Subhransu Maji
Esa Rahtu
Juho Kannala
Matthew B. Blaschko
Andrea Vedaldi
2
+ Image2Point: 3D Point-Cloud Understanding with Pretrained 2D ConvNets. 2021 Chenfeng Xu
Shijia Yang
Bohan Zhai
Bichen Wu
Xiangyu Yue
Wei Zhan
Péter Vajda
Kurt Keutzer
Masayoshi Tomizuka
2
+ PDF Chat Emerging Properties in Self-Supervised Vision Transformers 2021 Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
2
+ PDF Chat VQA: Visual Question Answering 2015 Stanislaw Antol
Aishwarya Agrawal
Jiasen Lu
Margaret Mitchell
Dhruv Batra
C. Lawrence Zitnick
Devi Parikh
2
+ PDF Chat Rethinking Few-Shot Image Classification: A Good Embedding is All You Need? 2020 Yonglong Tian
Yue Wang
Dilip Krishnan
Joshua B. Tenenbaum
Phillip Isola
2
+ Bootstrap your own latent: A new approach to self-supervised Learning 2020 Jean-Bastien Grill
Florian Strub
Florent Altché
Corentin Tallec
Pierre H. Richemond
Elena Buchatskaya
Carl Doersch
Bernardo Ávila Pires
Zhaohan Daniel Guo
Mohammad Gheshlaghi Azar
2
+ PDF Chat KPConv: Flexible and Deformable Convolution for Point Clouds 2019 Hugues Thomas
Charles R. Qi
Jean‐Emmanuel Deschaud
Beatriz Marcotegui
François Goulette
Leonidas Guibas
2
+ PDF Chat DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs 2017 Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Murphy
Alan Yuille
2
+ UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild 2012 Khurram Soomro
Amir Zamir
Mubarak Shah
2
+ PDF Chat Attention on Attention for Image Captioning 2019 Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
2
+ Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering 2019 Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
2
+ End-to-End Object Detection with Adaptive Clustering Transformer 2020 Minghang Zheng
Peng Gao
Xiaogang Wang
Hongsheng Li
Hao Dong
2
+ MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 2017 Andrew Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
Marco Andreetto
Hartwig Adam
2
+ PDF Chat Learning to Compare: Relation Network for Few-Shot Learning 2018 Flood Sung
Yongxin Yang
Li Zhang
Tao Xiang
Philip H. S. Torr
Timothy M. Hospedales
2
+ PDF Chat Image Captioning with Semantic Attention 2016 Quanzeng You
Hailin Jin
Zhaowen Wang
Fang Chen
Jiebo Luo
2
+ Attention is All you Need 2017 Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
2