Saining Xie

All published works
Action Title Year Authors
+ PDF Chat Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps 2025 Nanye Ma
Shangyuan Tong
Haolin Jia
Hexiang Hu
Yu-Chuan Su
Mingda Zhang
Xuan Yang
Yandong Li
Tommi Jaakkola
Xuhui Jia
+ PDF Chat MetaMorph: Multimodal Understanding and Generation via Instruction Tuning 2024 Shengbang Tong
David Fan
Jiachen Zhu
Yunyang Xiong
Xinlei Chen
Koustuv Sinha
Michael Rabbat
Yann LeCun
Saining Xie
Zhuang Liu
+ PDF Chat Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces 2024 Jihan Yang
Shusheng Yang
Anjali Gupta
Rilyn Han
Li Fei-Fei
Saining Xie
+ PDF Chat Altogether: Image Captioning via Re-aligning Alt-text 2024 Hu Xu
Po-Yao Huang
Xiaoqing Ellen Tan
Ching-Feng Yeh
Jacob Kahn
Christine Jou
Gargi Ghosh
Omer Levy
Luke Zettlemoyer
Wen-tau Yih
+ PDF Chat Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think 2024 Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
+ PDF Chat DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing 2024 June Suk Choi
Kyungmin Lee
Jongheon Jeong
Saining Xie
Jinwoo Shin
Kimin Lee
+ PDF Chat AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark 2024 Wenhao Chai
Enxin Song
Yilun Du
Chenlin Meng
Vashisht Madhavan
Omer Bar-Tal
Jenq‐Neng Hwang
Saining Xie
Christopher D. Manning
+ PDF Chat Fast Encoding and Decoding for Implicit Video Representation 2024 Hao Chen
Saining Xie
Ser-Nam Lim
Abhinav Shrivastava
+ PDF Chat On Scaling Up 3D Gaussian Splatting Training 2024 Hexu Zhao
Haoyang Weng
Daohan Lu
Ang Li
Jinyang Li
Aurojit Panda
Saining Xie
+ PDF Chat Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs 2024 Shengbang Tong
Ellis Brown
Penghao Wu
Sanghyun Woo
Manoj Middepogu
Sai Charitha Akula
Jihan Yang
Shusheng Yang
Adithya Iyer
Xichen Pan
+ PDF Chat Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning 2024 Yuexiang Zhai
Hao Bai
Zipeng Lin
Jiayi Pan
Shengbang Tong
Yifei Zhou
Alane Suhr
Saining Xie
Yann LeCun
Yi Ma
+ PDF Chat MoDE: CLIP Data Experts via Clustering 2024 Jiawei Ma
Po-Yao Huang
Saining Xie
Shang-Wen Li
Luke Zettlemoyer
Shih‐Fu Chang
Wen-tau Yih
Hu Xu
+ PDF Chat V-IRL: Grounding Virtual Intelligence in Real Life 2024 Jihan Yang
Runyu Ding
Ellis Brown
Xiaojuan Qi
Saining Xie
+ Image Sculpting: Precise Object Editing with 3D Geometry Control 2024 Jiraphon Yenphraphai
Xichen Pan
Sainan Liu
Daniele Panozzo
Saining Xie
+ Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs 2024 Shengbang Tong
Zhuang Liu
Yuexiang Zhai
Yi Ma
Yann LeCun
Saining Xie
+ SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers 2024 Nanye Ma
Mark Goldstein
Michael S. Albergo
Nicholas M. Boffi
Eric Vanden‐Eijnden
Saining Xie
+ Deconstructing Denoising Diffusion Models for Self-Supervised Learning 2024 Xinlei Chen
Zhuang Liu
Saining Xie
Kaiming He
+ PDF Chat Going Denser with Open-Vocabulary Part Segmentation 2023 Peize Sun
Shoufa Chen
Chenchen Zhu
Fanyi Xiao
Ping Luo
Saining Xie
Zhicheng Yan
+ PDF Chat Scalable Diffusion Models with Transformers 2023 William Peebles
Saining Xie
+ PDF Chat CiT: Curation in Training for Effective Vision-Language Data 2023 Hu Xu
Saining Xie
Po-Yao Huang
Licheng Yu
Russell Howes
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
+ PDF Chat ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders 2023 Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
+ Demystifying CLIP Data 2023 Hu Xu
Saining Xie
Xiaoqing Ellen Tan
Po-Yao Huang
Russell Howes
Vasu Sharma
Shang-Wen Li
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
+ V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs 2023 Penghao Wu
Saining Xie
+ PDF Chat Masked Feature Prediction for Self-Supervised Visual Pre-Training 2022 Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
+ PDF Chat Masked Autoencoders Are Scalable Vision Learners 2022 Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross Girshick
+ A ConvNet for the 2020s 2022 Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
+ Exploring Long-Sequence Masked Autoencoders 2022 Ronghang Hu
Shoubhik Debnath
Saining Xie
Xinlei Chen
+ Scalable Diffusion Models with Transformers 2022 William Peebles
Saining Xie
+ PDF Chat SLIP: Self-supervision Meets Language-Image Pre-training 2022 Norman Mu
Alexander Kirillov
David Wagner
Saining Xie
+ PDF Chat Masked Autoencoders Are Scalable Vision Learners 2021 Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross Girshick
+ PDF Chat An Empirical Study of Training Self-Supervised Vision Transformers 2021 Xinlei Chen
Saining Xie
Kaiming He
+ PDF Chat Pri3D: Can 3D Priors Help 2D Representation Learning? 2021 Ji Hou
Saining Xie
Benjamin Graham
Angela Dai
Matthias Nießner
+ PDF Chat Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts 2021 Ji Hou
Benjamin Graham
Matthias Nießner
Saining Xie
+ PDF Chat On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness 2021 Eric Mintun
Alexander Kirillov
Saining Xie
+ Benchmarking Detection Transfer Learning with Vision Transformers 2021 Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollár
Kaiming He
Ross Girshick
+ A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision 2021 Ajinkya Tejankar
Maziar Sanjabi
Bichen Wu
Saining Xie
Madian Khabsa
Hamed Pirsiavash
Hamed Firooz
+ Masked Feature Prediction for Self-Supervised Visual Pre-Training 2021 Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
+ SLIP: Self-supervision meets Language-Image Pre-training 2021 Norman Mu
Alexander M. Kirillov
David Wagner
Saining Xie
+ PDF Chat Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts 2020 Ji Hou
Benjamin A.T. Graham
Matthias Nießner
Saining Xie
+ PDF Chat FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions 2020 Alvin Wan
Xiaoliang Dai
Peizhao Zhang
Zijian He
Yuandong Tian
Saining Xie
Bichen Wu
Matthew Yu
Tao Xu
Kan Chen
+ PDF Chat Momentum Contrast for Unsupervised Visual Representation Learning 2020 Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross Girshick
+ Are Labels Necessary for Neural Architecture Search? 2020 Chenxi Liu
Piotr Dollár
Kaiming He
Ross Girshick
Alan Yuille
Saining Xie
+ Graph Structure of Neural Networks 2020 Jiaxuan You
Jure Leskovec
Kaiming He
Saining Xie
+ PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding 2020 Saining Xie
Jiatao Gu
Demi Guo
Charles R. Qi
Leonidas Guibas
Or Litany
+ Decoupling Representation and Classifier for Long-Tailed Recognition 2019 Bingyi Kang
Saining Xie
Marcus Rohrbach
Zhicheng Yan
Albert Gordo
Jiashi Feng
Yannis Kalantidis
+ PDF Chat On Network Design Spaces for Visual Recognition 2019 Ilija Radosavovic
Justin Johnson
Saining Xie
Wan‐Yen Lo
Piotr Dollár
+ PDF Chat Exploring Randomly Wired Neural Networks for Image Recognition 2019 Saining Xie
Alexander Kirillov
Ross Girshick
Kaiming He
+ Sample-Efficient Neural Architecture Search by Learning Action Space 2019 Linnan Wang
Saining Xie
Teng Li
Rodrigo Fonseca
Yuandong Tian
+ Momentum Contrast for Unsupervised Visual Representation Learning 2019 Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross Girshick
+ PDF Chat Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification 2018 Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Murphy
+ Rethinking Spatiotemporal Feature Learning For Video Understanding 2017 Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Murphy
+ PDF Chat Aggregated Residual Transformations for Deep Neural Networks 2017 Saining Xie
Ross Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
+ PDF Chat Holistically-Nested Edge Detection 2017 Saining Xie
Zhuowen Tu
+ Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification 2017 Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Murphy
+ Aggregated Residual Transformations for Deep Neural Networks 2016 Saining Xie
Ross Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
+ PDF Chat Top-Down Learning for Structured Labeling with Convolutional Pseudoprior 2016 Saining Xie
Xun Huang
Zhuowen Tu
+ PDF Chat Holistically-Nested Edge Detection 2015 Saining Xie
Zhuowen Tu
+ Convolutional Pseudo-Prior for Structured Labeling 2015 Saining Xie
Xun Huang
Zhuowen Tu
+ Top-Down Learning for Structured Labeling with Convolutional Pseudoprior 2015 Saining Xie
Xun Huang
Zhuowen Tu
+ Deeply-Supervised Nets 2014 Chen‐Yu Lee
Saining Xie
Patrick W. Gallagher
Zhengyou Zhang
Zhuowen Tu
Commonly Cited References
Action Title Year Authors # of times referenced
+ PDF Chat Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
26
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
15
+ PDF Chat Momentum Contrast for Unsupervised Visual Representation Learning 2020 Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross Girshick
14
+ PDF Chat Aggregated Residual Transformations for Deep Neural Networks 2017 Saining Xie
Ross Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
13
+ PDF Chat Fully convolutional networks for semantic segmentation 2015 Jonathan Long
Evan Shelhamer
Trevor Darrell
12
+ A Simple Framework for Contrastive Learning of Visual Representations 2020 Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
11
+ PDF Chat Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation 2014 Ross Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
11
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
11
+ Representation Learning with Contrastive Predictive Coding 2018 Aäron van den Oord
Yazhe Li
Oriol Vinyals
10
+ PDF Chat ImageNet Large Scale Visual Recognition Challenge 2015 Olga Russakovsky
Jia Deng
Hao Su
Jonathan Krause
Sanjeev Satheesh
Sean Ma
Zhiheng Huang
Andrej Karpathy
Aditya Khosla
Michael S. Bernstein
10
+ PDF Chat Unsupervised Visual Representation Learning by Context Prediction 2015 Carl Doersch
Abhinav Gupta
Alexei A. Efros
10
+ Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour 2017 Priya Goyal
Piotr Dollár
Ross Girshick
Pieter Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
9
+ PDF Chat Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles 2016 Mehdi Noroozi
Paolo Favaro
9
+ PDF Chat Feature Pyramid Networks for Object Detection 2017 Tsung-Yi Lin
Piotr Dollár
Ross Girshick
Kaiming He
Bharath Hariharan
Serge Belongie
8
+ PDF Chat Learning Transferable Architectures for Scalable Image Recognition 2018 Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
8
+ Learning deep representations by mutual information estimation and maximization 2018 R Devon Hjelm
Alex Fedorov
Samuel Lavoie-Marchildon
Karan Grewal
Phil Bachman
Adam Trischler
Yoshua Bengio
8
+ Neural Architecture Search with Reinforcement Learning 2016 Barret Zoph
Quoc V. Le
8
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
8
+ PDF Chat 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks 2019 Christopher Choy
JunYoung Gwak
Silvio Savarese
7
+ PDF Chat Going deeper with convolutions 2015 Christian Szegedy
Wei Liu
Yangqing Jia
Pierre Sermanet
Scott Reed
Dragomir Anguelov
Dumitru Erhan
Vincent Vanhoucke
Andrew Rabinovich
7
+ PDF Chat LVIS: A Dataset for Large Vocabulary Instance Segmentation 2019 Agrim Gupta
Piotr Dollár
Ross Girshick
7
+ PDF Chat Progressive Neural Architecture Search 2018 Chenxi Liu
Barret Zoph
Maxim Neumann
Jonathon Shlens
Wei Hua
Li-Jia Li
Li Fei-Fei
Alan Yuille
Jonathan Huang
Kevin Murphy
7
+ PDF Chat Emerging Properties in Self-Supervised Vision Transformers 2021 Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
7
+ MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 2017 Andrew Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
Marco Andreetto
Hartwig Adam
7
+ PDF Chat 3D Semantic Segmentation with Submanifold Sparse Convolutional Networks 2018 Benjamin Graham
Martin Engelcke
Laurens van der Maaten
7
+ PDF Chat Densely Connected Convolutional Networks 2017 Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
7
+ PDF Chat GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud 2019 Li Yi
Wang Zhao
He Wang
Minhyuk Sung
Leonidas Guibas
6
+ PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space 2017 Charles R. Qi
Li Yi
Hao Su
Leonidas Guibas
6
+ Improved Baselines with Momentum Contrastive Learning 2020 Xinlei Chen
Haoqi Fan
Ross Girshick
Kaiming He
6
+ PDF Chat ImVoteNet: Boosting 3D Object Detection in Point Clouds With Image Votes 2020 Charles R. Qi
Xinlei Chen
Or Litany
Leonidas Guibas
6
+ PDF Chat Identity Mappings in Deep Residual Networks 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
6
+ PDF Chat Frustum PointNets for 3D Object Detection from RGB-D Data 2018 Charles R. Qi
Wei Liu
Chenxia Wu
Hao Su
Leonidas Guibas
6
+ Regularized Evolution for Image Classifier Architecture Search 2019 Esteban Real
Alok Aggarwal
Yanping Huang
Quoc V. Le
6
+ PDF Chat ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes 2017 Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
6
+ PDF Chat Deep Hough Voting for 3D Object Detection in Point Clouds 2019 Charles R. Qi
Or Litany
Kaiming He
Leonidas Guibas
6
+ PDF Chat Rethinking the Inception Architecture for Computer Vision 2016 Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jon Shlens
Zbigniew Wojna
6
+ PDF Chat Unsupervised Learning of Visual Representations Using Videos 2015 Xiaolong Wang
Abhinav Gupta
6
+ BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2018 Jacob Devlin
Ming‐Wei Chang
Kenton Lee
Kristina Toutanova
6
+ SGDR: Stochastic Gradient Descent with Warm Restarts 2016 Ilya Loshchilov
Frank Hutter
6
+ Unsupervised Representation Learning by Predicting Image Rotations 2018 Spyros Gidaris
Praveer Singh
Nikos Komodakis
5
+ DARTS: Differentiable Architecture Search 2018 Hanxiao Liu
Karen Simonyan
Yiming Yang
5
+ PDF Chat Deep Clustering for Unsupervised Learning of Visual Features 2018 Mathilde Caron
Piotr Bojanowski
Armand Joulin
Matthijs Douze
5
+ PDF Chat Rethinking ImageNet Pre-Training 2019 Kaiming He
Ross Girshick
Piotr Dollár
5
+ Bootstrap your own latent: A new approach to self-supervised Learning 2020 Jean-Bastien Grill
Florian Strub
Florent Altché
Corentin Tallec
Pierre H. Richemond
Elena Buchatskaya
Carl Doersch
Bernardo Ávila Pires
Zhaohan Daniel Guo
Mohammad Gheshlaghi Azar
5
+ PDF Chat Learning Features by Watching Objects Move 2017 Deepak Pathak
Ross Girshick
Piotr Dollár
Trevor Darrell
Bharath Hariharan
5
+ PDF Chat PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation 2017 Charles R. Qi
Hao Su
Kaichun Mo
Leonidas Guibas
5
+ PDF Chat Exploring the Limits of Weakly Supervised Pretraining 2018 Dhruv Mahajan
Ross Girshick
Vignesh Ramanathan
Kaiming He
Manohar Paluri
Yixuan Li
Ashwin Bharambe
Laurens van der Maaten
5
+ PDF Chat Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset 2017 João Carreira
Andrew Zisserman
5
+ PDF Chat Xception: Deep Learning with Depthwise Separable Convolutions 2017 François Chollet
5
+ PDF Chat 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans 2019 Ji Hou
Angela Dai
Matthias Nießner
5