Conghui He

All published works
Title Year Authors
WanJuanSiLu: A High-Quality Open-Source Webtext Dataset for Low-Resource Languages 2025 Jia Yu
Fei Yuan
Rui Min
Jing Yu
Pei Chu
Jiayang Li
Wei Li
Zengqiang Zhang
Zhenxiang Li
Zhikun Ren
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? 2025 Yifei Li
Junbo Niu
Zhengfei Miao
Chunjiang Ge
Yuanhang Zhou
Quan-Jie He
Xiaoyi Dong
Haodong Duan
Shuangrui Ding
Rui Qian
Accelerating Diffusion Transformers with Dual Feature Caching 2024 Chang Zou
Enming Zhang
Rui Guo
Haohang Xu
Conghui He
Xuming Hu
Linfeng Zhang
Where am I? Cross-View Geo-localization with Natural Language Descriptions 2024 Junyan Ye
Honglin Lin
Leyan Ou
Dayuan Chen
Zihao Wang
Conghui He
Weijia Li
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training 2024 Renqiu Xia
Mingsheng Li
Hancheng Ye
Wenjie Wu
Hongbin Zhou
Jiakang Yuan
Tianshuo Peng
Xinyu Cai
Xiangchao Yan
Bin Wang
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions 2024 Pan Zhang
Xiaoyi Dong
Yapeng Cao
Yuhang Zang
Rui Qian
Xilin Wei
Lin Chen
Yifei Li
Junbo Niu
Shuangrui Ding
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations 2024 Linke Ouyang
Yuan Qu
Hongbin Zhou
Jiawei Zhu
Rui Zhang
Qunshu Lin
Bin Wang
Zhiyuan Zhao
Man Jiang
Xiaomeng Zhao
Chimera: Improving Generalist Model with Domain-Specific Experts 2024 Tzu‐Rong Peng
Mingsheng Li
Hongbin Zhou
Renqiu Xia
Renrui Zhang
Lei Bai
Mao Song
Bin Wang
Conghui He
Aojun Zhou
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling 2024 Zhe Chen
Weiyun Wang
Yue Cao
Yangzhou Liu
Zhangwei Gao
Erfei Cui
Jinguo Zhu
Sheng‐Long Ye
Hao Tian
Zhaoyang Liu
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation 2024 Junyuan Zhang
Qintong Zhang
Bin Wang
Linke Ouyang
Zichen Wen
Ying Li
Ka-Ho Chow
Conghui He
Wentao Zhang
Can LLMs be Good Graph Judger for Knowledge Graph Construction? 2024 H. K. Huang
Chong Chen
Conghui He
Yang Li
Jiawei Jiang
Wentao Zhang
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction 2024 Qintong Zhang
Victor Shea-Jay Huang
Bin Wang
Junyuan Zhang
Zhengren Wang
Hao Liang
Shawn Wang
Matthieu Lin
Conghui He
Wentao Zhang
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models 2024 Ziyu Liu
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Haodong Duan
Conghui He
Yuanjun Xiong
Dahua Lin
Jiaqi Wang
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction 2024 Long Xing
Qidong Huang
Xiaoyi Dong
Jiajie Lu
Pan Zhang
Yuhang Zang
Yuhang Cao
Conghui He
Jiaqi Wang
Feng Wu
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception 2024 Zhiyuan Zhao
Hengrui Kang
Bin Wang
Conghui He
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models 2024 Junyan Ye
Baichuan Zhou
Zilong Huang
Zhang Jun-an
Tianyi Bai
Hengrui Kang
Libin Chen
Hong‐Lin Lin
Zihao Wang
Tonghai Wu
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining 2024 Tianyi Bai
Ling Yang
Zhen Hao Wong
Jiahui Peng
Xinlin Zhuang
Chi Zhang
Lijun Wu
Qiu Jiantao
Wentao Zhang
Binhang Yuan
Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models 2024 Bozhou Li
Hao Liang
Yang Li
Fangcheng Fu
Hongzhi Yin
Conghui He
Wentao Zhang
MinerU: An Open-Source Solution for Precise Document Content Extraction 2024 Bin Wang
Chao Xu
Xiaomeng Zhao
Linke Ouyang
Fan Wu
Zhiyuan Zhao
Rui Xu
Kaiwen Liu
Yuan Qu
Fukai Shang
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search 2024 Linzhuang Sun
Hao Liang
Jingxuan Wei
Bihui Yu
Conghui He
Zenan Zhou
Wentao Zhang
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models 2024 Chi Zhang
Huaping Zhong
Kuan Zhang
Chengliang Chai
Rui Wang
Xinlin Zhuang
Tianyi Bai
Jiantao Qiu
Lei Cao
Ju Fan
CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation 2024 Bin Wang
Fan Wu
Linke Ouyang
Zhuangcheng Gu
Rui Zhang
Renqiu Xia
Bo Zhang
Conghui He
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios 2024 Baichuan Zhou
Haote Yang
Dairong Chen
Junyan Ye
Tianyi Bai
Jinhua Yu
Songyang Zhang
Dahua Lin
Conghui He
Weijia Li
CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis 2024 Weijia Li
J. He
Junyan Ye
Huaping Zhong
Z. Y. Zheng
Zilong Huang
Dahua Lin
Conghui He
Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning 2024 Weijia Li
Jinhua Yu
Dairong Chen
Yi Lin
Runmin Dong
Xiang Zhang
Conghui He
Haohuan Fu
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network 2024 Junyan Ye
Zhutao Lv
Weijia Li
Jinhua Yu
Haote Yang
Huaping Zhong
Conghui He
SkyDiffusion: Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm 2024 Junyan Ye
Jun He
Weijia Li
Zhutao Lv
Jinhua Yu
Haote Yang
Conghui He
Synth-Empathy: Towards High-Quality Synthetic Empathy Data 2024 Hao Liang
Linzhuang Sun
Jingxuan Wei
Xijie Huang
Linkun Sun
Bihui Yu
Conghui He
Wentao Zhang
SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models 2024 Zheng Liu
Hao Liang
Wentao Xiong
Qinhan Yu
Conghui He
Bin Cui
Wentao Zhang
Navigating the Data Trading Crossroads: An Interdisciplinary Survey 2024 Yi Yu
Jingru Yu
Xuhong Wang
Juanjuan Li
Yilun Lin
Conghui He
Yanqing Yang
Yu Qiao
Li Li
Fei‐Yue Wang
KeyVideoLLM: Towards Large-scale Video Keyframe Selection 2024 Hao Liang
Jiapeng Li
Tianyi Bai
Chong Chen
Conghui He
Bin Cui
Wentao Zhang
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output 2024 Pan Zhang
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Rui Qian
Lin Chen
Qipeng Guo
Haodong Duan
Bin Wang
Linke Ouyang
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training 2024 Tong Zhu
Xiaoye Qu
Daize Dong
Jiacheng Ruan
Jingqi Tong
Conghui He
Yu Cheng
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models 2024 Renqiu Xia
Song Mao
Xiangchao Yan
Hongbin Zhou
Bo Zhang
Hao-Yang Peng
Jiahao Pi
Daocheng Fu
Wenjie Wu
Hancheng Ye
OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text 2024 Qingyun Li
Zhe Chen
Weiyun Wang
Wenhai Wang
Sheng‐Long Ye
Zhenjiang Jin
Guanzhou Chen
Yinan He
Zhangwei Gao
Erfei Cui
OpenDataLab: Empowering General Artificial Intelligence with Open Datasets 2024 Conghui He
Wei Li
Zhenjiang Jin
Chao Xu
Bin Wang
Dahua Lin
DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data 2024 Bin Wang
Linke Ouyang
Fan Wu
Wenchang Ning
Xiao Han
Zhiyuan Zhao
Jiahui Peng
Yiying Jiang
Dahua Lin
Conghui He
A Survey of Multimodal Large Language Model from A Data-centric Perspective 2024 Tianyi Bai
Hao Liang
Binwang Wan
L. Yang
Bozhou Li
Yifan Wang
Bin Cui
Conghui He
Binhang Yuan
Wentao Zhang
FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models 2024 Wei Li
Ren Ma
Jiang Wu
Chenya Gu
Jiahui Peng
Jinyang Len
Songyang Zhang
Hang Yan
Dahua Lin
Conghui He
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites 2024 Zhe Chen
Weiyun Wang
Hao Tian
Sheng‐Long Ye
Zhangwei Gao
Erfei Cui
Wenwen Tong
Kongzhi Hu
Jiapeng Luo
Zheng Ma
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition 2024 Bin Wang
Zhuangcheng Gu
Chao Xu
Bo Zhang
Botian Shi
Conghui He
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD 2024 Xiaoyi Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
Linke Ouyang
Songyang Zhang
Haodong Duan
Wenwei Zhang
Yining Li
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions 2024 Weijia Li
Haote Yang
Zhenghao Hu
Juepeng Zheng
Gui-Song Xia
Conghui He
SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation 2024 Junyan Ye
Qiyan Luo
Jinhua Yu
Huaping Zhong
Z. Y. Zheng
Conghui He
Weijia Li
H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model 2024 Chao Pang
Jiang Wu
Jiayu Li
Yi Liu
Jiaxing Sun
Weijia Li
Xingxing Weng
Shuai Wang
Litong Feng
Gui-Song Xia
InternLM2 Technical Report 2024 Zheng Cai
Maosong Cao
Haojiong Chen
Chaoyu Chen
Keyu Chen
Xin Chen
Xun Chen
Zehui Chen
Zhi Chen
Pei Chu
VIGC: Visual Instruction Generation and Correction 2024 Bin Wang
Fan Wu
Xiao Han
Jiahui Peng
Huaping Zhong
Pan Zhang
Xiaoyi Dong
Weijia Li
Wei Li
Jiaqi Wang
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations 2024 Jiaxing Sun
Weiquan Huang
Jiang Wu
Chenya Gu
Wei Li
Songyang Zhang
Hang Yan
Conghui He
LOCR: Location-Guided Transformer for Optical Character Recognition 2024 Yu Sun
Dongzhan Zhou
Lin Chen
Conghui He
Wanli Ouyang
Han-Sen Zhong
WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset 2024 Jiantao Qiu
Haijun Lv
Zhenjiang Jin
Rui Wang
Wenchang Ning
Jia Yu
ChaoBin Zhang
Pei Chu
Yuan Qu
Shi Jin
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation 2024 Shuangrui Ding
Zihan Liu
Xiaoyi Dong
Pan Zhang
Rui Qian
Conghui He
Dahua Lin
Jiaqi Wang
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training 2024 Le Zhuo
Zewen Chi
Minghao Xu
Heyan Huang
He‐Qi Zheng
Conghui He
Xian-Ling Mao
Wentao Zhang
LongWanjuan: Towards Systematic Measurement for Long Text Quality 2024 Kai Lv
Xiaoran Liu
Qipeng Guo
Hang Yan
Conghui He
Xipeng Qiu
Dahua Lin
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models 2024 Peng Gao
Renrui Zhang
Chris Liu
Longtian Qiu
Siyuan Huang
Weifeng Lin
Shitian Zhao
Shijie Geng
Ziyi Lin
Jin Peng
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model 2024 Xiaoyi Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
Linke Ouyang
Xilin Wei
Songyang Zhang
Haodong Duan
Maosong Cao
Exploring the Interactive Guidance for Unified and Effective Image Matting 2024 Dinghao Yang
Bin Wang
Weijia Li
Yiqi Lin
Conghui He
V3Det: Vast Vocabulary Visual Detection Dataset 2023 Jiaqi Wang
Pan Zhang
Tao Chu
Yuhang Cao
Yujie Zhou
Tong Wu
Bin Wang
Conghui He
Dahua Lin
SEPT: Towards Scalable and Efficient Visual Pre-training 2023 Yiqi Lin
Huaibin Zheng
Zhong Hua-ping
Jinjing Zhu
Weijia Li
Conghui He
Lin Wang
Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving 2023 Xiaosong Jia
Penghao Wu
Li Chen
Jiangwei Xie
Conghui He
Junchi Yan
Hongyang Li
OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images 2023 Weijia Li
Yawen Lai
Linning Xu
Yuanbo Xiangli
Jinhua Yu
Conghui He
Gui-Song Xia
Dahua Lin
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model 2023 Peng Gao
Jiaming Han
Renrui Zhang
Ziyi Lin
Shijie Geng
Aojun Zhou
Wei Zhang
Pan Lu
Conghui He
Xiangyu Yue
MMBench: Is Your Multi-modal Model an All-around Player? 2023 Yuan Liu
Haodong Duan
Yuanhan Zhang
Bo Li
Songyang Zhang
Wangbo Zhao
Yike Yuan
Jiaqi Wang
Conghui He
Ziwei Liu
WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models 2023 Conghui He
Zhenjiang Jin
Chao Xu
Jiantao Qiu
Bin Wang
Wei Li
Hang Yan
Jiaqi Wang
Dahua Lin
VIGC: Visual Instruction Generation and Correction 2023 Bin Wang
Fan Wu
Xiao Han
Jiahui Peng
Huaping Zhong
Pan Zhang
Xiaoyi Dong
Weijia Li
Wei Li
Jiaqi Wang
MLLM-DataEngine: An Iterative Refinement Approach for MLLM 2023 Zhiyuan Zhao
Linke Ouyang
Bin Wang
Siyuan Huang
Pan Zhang
Xiaoyi Dong
Jiaqi Wang
Conghui He
MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models 2023 Yidong Liu
FuKai Shang
Fang Wang
Rui Xu
Jun Wang
Wei Li
Li Yao
Conghui He
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition 2023 Pan Zhang
Xiaoyi Dong
Bin Wang
Yuhang Cao
Chao Xu
Linke Ouyang
Zhiyuan Zhao
Shuangrui Ding
Songyang Zhang
Haodong Duan
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions 2023 Chen Lin
Jisong Li
Xiaoyi Dong
Pan Zhang
Conghui He
Jiaqi Wang
Feng Zhao
Dahua Lin
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization 2023 Zhiyuan Zhao
Bin Wang
Linke Ouyang
Xiaoyi Dong
Jiaqi Wang
Conghui He
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation 2023 Qidong Huang
Xiaoyi Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Nenghai Yu
Parrot Captions Teach CLIP to Spot Text 2023 Yiqi Lin
Conghui He
Alex Jinpeng Wang
Bin Wang
Weijia Li
Mike Zheng Shou
PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark 2022 Li Chen
Chonghao Sima
Yang Li
Zehan Zheng
Jiajie Xu
Xiangwei Geng
Hongyang Li
Conghui He
Jianping Shi
Yu Qiao
Exploring the Interactive Guidance for Unified and Effective Image Matting 2022 Stephen. D. H Yang
Bin Wang
Weijia Li
YiQi Lin
Conghui He
OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images 2022 Weijia Li
Yawen Lai
Linning Xu
Yuanbo Xiangli
Jinhua Yu
Conghui He
Gui-Song Xia
Dahua Lin
SEPT: Towards Scalable and Efficient Visual Pre-Training 2022 Yiqi Lin
Huaibin Zheng
Zhong Hua-ping
Jinjing Zhu
Weijia Li
Conghui He
Lin Wang
Influence Selection for Active Learning 2021 Zhuoming Liu
Hao Ding
Huaping Zhong
Weijia Li
Jifeng Dai
Conghui He
INTERN: A New Learning Paradigm Towards General Vision 2021 Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhenfei Yin
Yinan He
Jianing Teng
Qinghong Sun
Mengya Gao
Jihao Liu
FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-Based Point Clouds 2020 Tai Wang
Conghui He
Zhe Wang
Jianping Shi
Dahua Lin
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight 2019 Jiarui Fang
Liandeng Li
Haohuan Fu
Jinlei Jiang
Wenlai Zhao
Conghui He
Xin You
Guangwen Yang
swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight 2018 Liandeng Li
Jiarui Fang
Haohuan Fu
Jinlei Jiang
Wenlai Zhao
Conghui He
Xin You
Guangwen Yang
18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight 2017 Haohuan Fu
Conghui He
Bingwei Chen
Zekun Yin
Zhenguo Zhang
Wenqiang Zhang
Tingjian Zhang
Wei Xue
Weiguo Liu
Wanwang Yin
Commonly Cited References
Title Year Authors # of times referenced
Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
8
ImageNet Large Scale Visual Recognition Challenge 2015 Olga Russakovsky
Jia Deng
Hao Su
Jonathan Krause
Sanjeev Satheesh
Sean Ma
Zhiheng Huang
Andrej Karpathy
Aditya Khosla
Michael S. Bernstein
3
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows 2021 Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
Baining Guo
3
Scalability in Perception for Autonomous Driving: Waymo Open Dataset 2020 Pei Sun
Henrik Kretzschmar
Xerxes Dotiwalla
Aurélien Chouard
Vijaysai Patnaik
Paul Tsui
James C. Y. Guo
Yin Zhou
Yuning Chai
Benjamin Caine
3
MMDetection: Open MMLab Detection Toolbox and Benchmark 2019 Kai Chen
Jiaqi Wang
Jiangmiao Pang
Yuhang Cao
Yu Xiong
Xiaoxiao Li
Shuyang Sun
Wansen Feng
Ziwei Liu
Jiarui Xu
3
BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning 2020 Fisher Yu
Haofeng Chen
Xin Wang
Wenqi Xian
Yingying Chen
Fangchen Liu
Vashisht Madhavan
Trevor Darrell
3
End-to-End Object Detection with Transformers 2020 Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
3
Big Transfer (BiT): General Visual Representation Learning 2020 Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
Joan Puigcerver
Jessica Yung
Sylvain Gelly
Neil Houlsby
2
BEVFormer: Learning Bird’s-Eye-View Representation from Multi-camera Images via Spatiotemporal Transformers 2022 Zhiqi Li
Wenhai Wang
Hongyang Li
Enze Xie
Chonghao Sima
Tong Lü
Yu Qiao
Jifeng Dai
2
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale 2023 Yuxin Fang
Wen Wang
Binhui Xie
Quan Sun
Ledell Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
2
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks 2016 Shaoqing Ren
Kaiming He
Ross Girshick
Jian Sun
2
cuDNN: Efficient Primitives for Deep Learning 2014 Sharan Chetlur
Cliff Woolley
Philippe Vandermersch
Jonathan Cohen
John Tran
Bryan Catanzaro
Evan Shelhamer
2
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems 2016 Martín Abadi
Ashish Agarwal
Paul Barham
Eugene Brevdo
Zhifeng Chen
Craig Citro
Gregory S. Corrado
Andy Davis
Jay B. Dean
Matthieu Devin
2
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems 2015 Tianqi Chen
Mu Li
Yutian Li
Min Lin
Naiyan Wang
Minjie Wang
Tianjun Xiao
Bing Xu
Chiyuan Zhang
Zheng Zhang
2
Multi-Modal Fusion Transformer for End-to-End Autonomous Driving 2021 Aditya Prakash
Kashyap Chitta
Andreas Geiger
2
Improved Baselines with Momentum Contrastive Learning 2020 Xinlei Chen
Haoqi Fan
Ross Girshick
Kaiming He
2
NEAT: Neural Attention Fields for End-to-End Autonomous Driving 2021 Kashyap Chitta
Aditya Prakash
Andreas Geiger
2
FCOS: Fully Convolutional One-Stage Object Detection 2019 Zhi Tian
Chunhua Shen
Hao Chen
Tong He
2
Cascade R-CNN: High Quality Object Detection and Instance Segmentation 2019 Zhaowei Cai
Nuno Vasconcelos
2
nuScenes: A Multimodal Dataset for Autonomous Driving 2020 Holger Caesar
Varun Bankiti
Alex Lang
Sourabh Vora
Venice Erin Liong
Qiang Xu
Anush Krishnan
Yu Pan
Giancarlo Baldan
Oscar Beijbom
2
SSD: Single Shot MultiBox Detector 2016 Wei Liu
Dragomir Anguelov
Dumitru Erhan
Christian Szegedy
Scott Reed
Cheng-Yang Fu
Alexander C. Berg
2
BEiT: BERT Pre-Training of Image Transformers 2021 Hangbo Bao
Dong Li
Furu Wei
2
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection 2018 Yin Zhou
Oncel Tuzel
2
CARAFE: Content-Aware ReAssembly of FEatures 2019 Jiaqi Wang
Kai Chen
Rui Xu
Ziwei Liu
Chen Change Loy
Dahua Lin
2
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era 2017 Chen Sun
Abhinav Shrivastava
Saurabh Singh
Abhinav Gupta
2
PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud 2019 Shaoshuai Shi
Xiaogang Wang
Hongsheng Li
2
FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters 2016 Forrest Iandola
Matthew W. Moskewicz
Khalid Ashraf
Kurt Keutzer
2
Self-supervised Pretraining of Visual Features in the Wild 2021 Priya Goyal
Mathilde Caron
Benjamin Lefaudeux
Min Xu
Pengchao Wang
Vivek S. Pai
Mannat Singh
Vitaliy Liptchinsky
Ishan Misra
Armand Joulin
2
3D-LaneNet: End-to-End 3D Multiple Lane Detection 2019 Noa Garnett
Rafi Cohen
Tomer Pe'er
Roee Lahav
Dan Levi
2
Structured Bird’s-Eye-View Traffic Scene Understanding from Onboard Images 2021 Yiğit Baran Can
Alexander Liniger
Danda Pani Paudel
Luc Van Gool
2
Going deeper with convolutions 2015 Christian Szegedy
Wei Liu
Yangqing Jia
Pierre Sermanet
Scott Reed
Dragomir Anguelov
Dumitru Erhan
Vincent Vanhoucke
Andrew Rabinovich
2
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2018 Jacob Devlin
Ming‐Wei Chang
Kenton Lee
Kristina Toutanova
2
Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
2
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour 2017 Priya Goyal
Piotr Dollár
Ross Girshick
Pieter Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
2
Scaling SGD Batch Size to 32K for ImageNet Training. 2017 Yang You
Igor Gitman
Boris Ginsburg
2
The Cityscapes Dataset for Semantic Urban Scene Understanding 2016 Marius Cordts
Mohamed Omran
Sebastian Ramos
Timo Rehfeld
Markus Enzweiler
Rodrigo Benenson
Uwe Franke
Stefan Roth
Bernt Schiele
2
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation 2014 Ross Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
2
Hybrid Task Cascade for Instance Segmentation 2019 Kai Chen
Wanli Ouyang
Chen Change Loy
Dahua Lin
Jiangmiao Pang
Jiaqi Wang
Yu Xiong
Xiaoxiao Li
Shuyang Sun
Wansen Feng
2
CityPersons: A Diverse Dataset for Pedestrian Detection 2017 Shanshan Zhang
Rodrigo Benenson
Bernt Schiele
1
Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-Tuning 2017 Weifeng Ge
Yizhou Yu
1
Second Order Stochastic Optimization in Linear Time. 2016 Naman Agarwal
Brian Bullins
Elad Hazan
1
Predicting Ground-Level Scene Layout from Aerial Imagery 2017 Menghua Zhai
Zachary Bessinger
Scott Workman
Nathan Jacobs
1
YOLO9000: Better, Faster, Stronger 2017 Joseph Redmon
Ali Farhadi
1
Implicit nonlinear wave simulation with 1.08T DOF and 0.270T unstructured finite elements to enhance comprehensive earthquake simulation 2015 Tsuyoshi Ichimura
Kohei Fujita
Pher Errol B. Quinay
Lalith Wijerathne
Muneo Hori
Seizo Tanaka
Yoshihisa Shizawa
Hiroshi Kobayashi
Kazuo Minami
1
Feature Pyramid Networks for Object Detection 2017 Tsung-Yi Lin
Piotr Dollár
Ross Girshick
Kaiming He
Bharath Hariharan
Serge Belongie
1
COCO-Stuff: Thing and Stuff Classes in Context 2018 Holger Caesar
Jasper Uijlings
Vittorio Ferrari
1
Leveraging Pre-Trained 3D Object Detection Models for Fast Ground Truth Generation 2018 Jungwook Lee
Seán Walsh
Ali Harakeh
Steven L. Waslander
1
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks 2017 Eddy Ilg
N. Michael Mayer
Tonmoy Saikia
Margret Keuper
Alexey Dosovitskiy
Thomas Brox
1
Deep Clustering for Unsupervised Learning of Visual Features 2018 Mathilde Caron
Piotr Bojanowski
Armand Joulin
Matthijs Douze
1
A fully convolutional two-stream fusion network for interactive image segmentation 2018 Yang Hu
Andrea Soltoggio
Russell Lock
Steve Carter
1