Wanrong Zhu

All published works
Action Title Year Authors
+ Conformal prediction after efficiency-oriented model selection 2024 Ruiting Liang
Wanrong Zhu
Rina Foygel Barber
+ MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension 2024 Zekun Li
Xianjun Yang
Kyuri Choi
Wanrong Zhu
Ryan Hsieh
HyeonJung Kim
Jin Hyuk Lim
Sungyoung Ji
Byungju Lee
Xifeng Yan
+ List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs 2024 An Yan
Zhengyuan Yang
Junda Wu
Wanrong Zhu
Jianwei Yang
Linjie Li
Kevin Lin
Jianfeng Wang
Julian McAuley
Jianfeng Gao
+ Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models 2024 Wanrong Zhu
Jennifer Healey
Ruiyi Zhang
William Yang Wang
Tong Sun
+ VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View 2024 Raphael Schumann
Wanrong Zhu
Weixi Feng
Tsu-Jui Fu
Stefan Riezler
William Yang Wang
+ High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization 2024 Wanrong Zhu
Zhipeng Lou
Ziyang Wei
Wei Biao Wu
+ Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning 2023 Xinyi Wang
Wanrong Zhu
William Yang Wang
+ Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text 2023 Wanrong Zhu
Jack Hessel
Anas Awadalla
Samir Yitzhak Gadre
Jesse Dodge
Alex Chengyu Fang
Youngjae Yu
Ludwig Schmidt
William Yang Wang
Yejin Choi
+ Multimodal Procedural Planning via Dual Text-Image Prompting 2023 Yujie Lu
Pan Lu
Zhiyu Chen
Wanrong Zhu
Xin Eric Wang
William Yang Wang
+ Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation 2023 Wanrong Zhu
Xinyi Wang
Yujie Lu
Tsu-Jui Fu
Xin Eric Wang
Miguel P. Eckstein
William Yang Wang
+ LayoutGPT: Compositional Visual Planning and Generation with Large Language Models 2023 Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Akula
Xuehai He
Sugato Basu
Xin Eric Wang
William Yang Wang
+ VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View 2023 Raphael Schumann
Wanrong Zhu
Weixi Feng
Tsu-Jui Fu
Stefan Riezler
William Yang Wang
+ Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality 2023 Ziyang Wei
Wanrong Zhu
Wei Biao Wu
+ VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use 2023 Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
Ludwig Schmidt
+ OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models 2023 Anas Awadalla
Irena Gao
Josh Gardner
Jack Hessel
Yusuf Hanafy
Wanrong Zhu
Kalyani Marathe
Yonatan Bitton
Samir Yitzhak Gadre
Shiori Sagawa
+ Visualize Before You Write: Imagination-Guided Open-Ended Text Generation 2023 Wanrong Zhu
An Yan
Yujie Lu
Wenda Xu
Xin Wang
Miguel P. Eckstein
William Yang Wang
+ ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation 2023 Wanrong Zhu
Xin Wang
An Yan
Miguel P. Eckstein
William Yang Wang
+ Approximate co-sufficient sampling with regularization 2023 Wanrong Zhu
Rina Foygel Barber
+ GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation 2023 An Yan
Zhengyuan Yang
Wanrong Zhu
Kevin Lin
Linjie Li
Jianfeng Wang
Jianwei Yang
Yiwu Zhong
Julian McAuley
Jianfeng Gao
+ Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation 2023 Wanrong Zhu
Xinyi Wang
Yujie Lu
Tsu-Jui Fu
Xin Wang
Miguel P. Eckstein
William Yang Wang
+ Diagnosing Vision-and-Language Navigation: What Really Matters 2022 Wanrong Zhu
Yuankai Qi
Pradyumna Narayana
Kazoo Sone
Sugato Basu
Xin Wang
Qi Wu
Miguel P. Eckstein
William Yang Wang
+ End-to-end Dense Video Captioning as Sequence Generation 2022 Wanrong Zhu
Bo Pang
Ashish Thapliyal
William Yang Wang
Radu Soricut
+ Imagination-Augmented Natural Language Understanding 2022 Yujie Lu
Wanrong Zhu
Xin Wang
Miguel P. Eckstein
William Yang Wang
+ Neuro-Symbolic Procedural Planning with Commonsense Prompting 2022 Yujie Lu
Weixi Feng
Wanrong Zhu
Wenda Xu
Xin Eric Wang
Miguel P. Eckstein
William Yang Wang
+ Visualize Before You Write: Imagination-Guided Open-Ended Text Generation 2022 Wanrong Zhu
An Yan
Yujie Lu
Wenda Xu
Xin Eric Wang
Miguel P. Eckstein
William Yang Wang
+ CLIP also Understands Text: Prompting CLIP for Phrase Understanding 2022 An Yan
Jiacheng Li
Wanrong Zhu
Yujie Lu
William Yang Wang
Julian McAuley
+ Imagination-Augmented Natural Language Understanding 2022 Yujie Lu
Wanrong Zhu
Xin Eric Wang
Miguel P. Eckstein
William Yang Wang
+ Online Covariance Matrix Estimation in Stochastic Gradient Descent 2021 Wanrong Zhu
Xi Chen
Wei Biao Wu
+ Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation 2021 Wanrong Zhu
Xin Wang
Tsu-Jui Fu
An Yan
Pradyumna Narayana
Kazoo Sone
Sugato Basu
William Yang Wang
+ ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation 2021 Wanrong Zhu
Xin Wang
An Yan
Miguel P. Eckstein
William Yang Wang
+ Diagnosing Vision-and-Language Navigation: What Really Matters 2021 Wanrong Zhu
Yuankai Qi
Pradyumna Narayana
Kazoo Sone
Sugato Basu
Xin Wang
Qi Wu
Miguel P. Eckstein
William Yang Wang
+ A Fully Online Approach for Covariance Matrices Estimation of Stochastic Gradient Descent Solutions 2020 Wanrong Zhu
Xi Chen
Wei Biao Wu
+ Online Covariance Matrix Estimation in Stochastic Gradient Descent 2020 Wanrong Zhu
Xi Chen
Wei Biao Wu
+ Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation 2020 Wanrong Zhu
Xin Wang
Tsu-Jui Fu
An Yan
Pradyumna Narayana
Kazoo Sone
Sugato Basu
William Yang Wang
+ Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations 2020 Wanrong Zhu
Xin Wang
Pradyumna Narayana
Kazoo Sone
Sugato Basu
William Yang Wang
+ Text Infilling 2019 Wanrong Zhu
Zhiting Hu
Eric P. Xing
+ Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation 2019 Zhiting Hu
Haoran Shi
Bowen Tan
Wentao Wang
Zichao Yang
Tiancheng Zhao
Junxian He
Lianhui Qin
Di Wang
Xuezhe Ma
Commonly Cited References
Action Title Year Authors # of times referenced
+ CIDEr: Consensus-based image description evaluation 2015 Ramakrishna Vedantam
C. Lawrence Zitnick
Devi Parikh
7
+ Learning Transferable Visual Models From Natural Language Supervision 2021 Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
6
+ UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation. 2020 Huaishao Luo
Lei Ji
Botian Shi
Haoyang Huang
Nan Duan
Tianrui Li
Xilin Chen
Ming Zhou
5
+ VisualBERT: A Simple and Performant Baseline for Vision and Language 2019 Liunian Harold Li
Mark Yatskar
Da Yin
Cho‐Jui Hsieh
Kai-Wei Chang
5
+ Adam: A Method for Stochastic Optimization 2014 Diederik P. Kingma
Jimmy Ba
5
+ UNITER: UNiversal Image-TExt Representation Learning 2020 Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
5
+ SPICE: Semantic Propositional Image Caption Evaluation 2016 Peter Anderson
Basura Fernando
Mark Johnson
Stephen Jay Gould
5
+ VideoBERT: A Joint Model for Video and Language Representation Learning 2019 Chen Sun
Austin Myers
Carl Vondrick
Kevin Murphy
Cordelia Schmid
5
+ Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation 2020 Jiannan Xiang
Xin Wang
William Yang Wang
4
+ Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout 2019 Hao Tan
Licheng Yu
Mohit Bansal
4
+ Attention Is All You Need 2017 Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
4
+ Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation 2019 Xin Wang
Qiuyuan Huang
Aslı Çelikyılmaz
Jianfeng Gao
Dinghan Shen
Yuan-Fang Wang
William Yang Wang
Lei Zhang
4
+ Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training 2020 Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
4
+ Toward Controlled Generation of Text 2017 Zhiting Hu
Zichao Yang
Xiaodan Liang
Ruslan Salakhutdinov
Eric P. Xing
4
+ Approximation Methods which Converge with Probability one 1954 J. R. Blum
3
+ The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation 2019 Chih‐Yao Ma
Zuxuan Wu
Ghassan AlRegib
Caiming Xiong
Zsolt Kira
3
+ Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
3
+ LAION-5B: An open large-scale dataset for training next generation image-text models 2022 C Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
Mehdi Cherti
Theo Coombes
Aarush Katta
C. H. Mullis
Mitchell Wortsman
3
+ Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation 2018 Xin Wang
Wenhan Xiong
Hongmin Wang
William Yang Wang
3
+ Unsupervised Text Style Transfer using Language Models as Discriminators 2018 Zichao Yang
Zhiting Hu
Chris Dyer
Eric P. Xing
Taylor Berg-Kirkpatrick
3
+ Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers 2020 Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
3
+ ON STOCHASTIC APPROXIMATION 1956 A Dvoretzky
3
+ High-Resolution Image Synthesis with Latent Diffusion Models 2022 Robin Rombach
Andreas Blattmann
Dominik Lorenz
Patrick Esser
Björn Ommer
3
+ Statistical inference for model parameters in stochastic gradient descent 2020 Xi Chen
Jason D. Lee
Xin Tong
Yichen Zhang
3
+ Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments 2018 Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Jay Gould
Anton van den Hengel
3
+ Zero-Shot Text-to-Image Generation 2021 Aditya Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
3
+ Speaker-Follower Models for Vision-and-Language Navigation 2018 Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis‐Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
3
+ Batch means and spectral variance estimators in Markov chain Monte Carlo 2010 James M. Flegal
Galin L. Jones
3
+ Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation 2019 Vihan Jain
Gabriel Magalhaes
Alexander Ku
Ashish Vaswani
Eugene Ie
Jason Baldridge
3
+ Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View 2020 Harsh Mehta
Yoav Artzi
Jason Baldridge
Eugene Ie
Piotr Mirowski
3
+ TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments 2019 Howard Chen
Alane Suhr
Dipendra Misra
Noah Snavely
Yoav Artzi
3
+ Scalable statistical inference for averaged implicit stochastic gradient descent 2019 Yixin Fang
3
+ On Asymptotic Normality in Stochastic Approximation 1968 Václav Fabian
3
+ ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks 2019 Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
3
+ A Convergence Theorem for Non Negative Almost Supermartingales and Some Applications 1985 Herbert Robbins
David Siegmund
3
+ LXMERT: Learning Cross-Modality Encoder Representations from Transformers 2019 Hao Tan
Mohit Bansal
3
+ Statistical Inference for the Population Landscape via Moment-Adjusted Stochastic Gradients 2019 Tengyuan Liang
Weijie Su
3
+ BERTScore: Evaluating Text Generation with BERT 2020 Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
3
+ BERTScore: Evaluating Text Generation with BERT 2019 Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
3
+ Language Models are Few-Shot Learners 2020 T. B. Brown
Benjamin F. Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
3
+ Asymptotic and finite-sample properties of estimators based on stochastic gradients 2017 Panos Toulis
Edoardo M. Airoldi
2
+ Multi30K: Multilingual English-German Image Descriptions 2016 Desmond Elliott
Stella Frank
Khalil Sima’an
Lucia Specia
2
+ A Progressive Block Empirical Likelihood Method for Time Series 2013 Young Min Kim
Soumendra N. Lahiri
Daniel J. Nordman
2
+ Learning to Navigate in Cities Without a Map 2018 Piotr Mirowski
Matthew Koichi Grimes
Mateusz Malinowski
Karl Moritz Hermann
Keith Anderson
Denis Teplyashin
Karen Simonyan
Koray Kavukcuoglu
Andrew Zisserman
Raia Hadsell
2
+ Style Transfer in Text: Exploration and Evaluation 2018 Zhenxin Fu
Xiaoye Tan
Nanyun Peng
Dongyan Zhao
Rui Yan
2
+ Self-Monitoring Navigation Agent via Auxiliary Progress Estimation 2019 Chih‐Yao Ma
Jiasen Lu
Zuxuan Wu
Ghassan AlRegib
Zsolt Kira
Richard Socher
Caiming Xiong
2
+ No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling 2018 Xin Wang
Wenhu Chen
Yuan-Fang Wang
William Yang Wang
2
+ Story Ending Generation with Incremental Encoding and Commonsense Knowledge 2019 Jian Guan
Yansen Wang
Minlie Huang
2
+ Stacked Cross Attention for Image-Text Matching 2018 Kuang-Huei Lee
Xi Chen
Gang Hua
Houdong Hu
Xiaodong He
2