Wen‐mei Hwu

All published works
Action Title Year Authors
+ PDF Chat TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading 2024 Kai Wu
Jeongmin Park
Xiaofan Zhang
Mert Hidayetoğlu
Vikram Sharma Mailthody
Sitao Huang
Steven S. Lumetta
Wen‐mei Hwu
+ PDF Chat HiCCL: A Hierarchical Collective Communication Library 2024 Mert Hidayetoğlu
Simon Garcia de Gonzalo
Elliott Slaughter
Pinku Surana
Wen‐mei Hwu
William Gropp
Alex Aiken
+ PDF Chat LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme 2024 Jeongmin Park
Kai Wu
Vikram Sharma Mailthody
Zaid Qureshi
Scott Mahlke
Wen‐mei Hwu
+ PDF Chat Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures 2024 Kun Wu
Mert Hidayetoğlu
Xiang Song
Sitao Huang
Da Zheng
Israt Nisa
Wen‐mei Hwu
+ PDF Chat Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses 2024 Jeongmin Park
Vikram Sharma Mailthody
Zaid Qureshi
Wen‐mei Hwu
+ PDF Chat Parallelizing Maximal Clique Enumeration on GPUs 2023 Mohammad Almasri
Yen-Hsiang Chang
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design 2023 Benjamin Reidys
Yuqi Xue
Daixuan Li
Bharat Sukhwani
Wen‐mei Hwu
Deming Chen
Sameh Asaad
Jian Huang
+ IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research 2023 Arpandeep Khatua
Vikram Sharma Mailthody
Bhagyashree Taleka
Tengfei Ma
Xiang Song
Wen‐mei Hwu
+ GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture 2023 Zaid Qureshi
Vikram Sharma Mailthody
Isaac Gelado
Seungwon Min
Amna Masood
Jeongmin Park
Jinjun Xiong
Chris J. Newburn
Dmitri Vainbrand
I‐Hsin Chung
+ PIGEON: Optimizing CUDA Code Generator for End-to-End Training and Inference of Relational Graph Neural Networks 2023 Kun Wu
Mert Hidayetoğlu
Xiang Song
Sitao Huang
Da Zheng
Israt Nisa
Wen‐mei Hwu
+ Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses 2023 Jeongmin Park
Vikram Sharma Mailthody
Zaid Qureshi
Wen‐mei Hwu
+ CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs 2023 Jeongmin Park
Zaid Qureshi
Vikram Sharma Mailthody
Andrew Gacek
S. Shao
Mohammad Almasri
Isaac Gelado
Jinjun Xiong
Chris J. Newburn
I‐Hsin Chung
+ PDF Chat Can Language Models Be Specific? How? 2023 Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat Parallel K-clique counting on GPUs 2022 Mohammad Almasri
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat DKG: A Descriptive Knowledge Graph for Explaining Relationships between Entities 2022 Jie Huang
Kerui Zhu
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat A Compiler Framework for Optimizing Dynamic Parallelism on GPUs 2022 Mhd Ghaith Olabi
Juan Gómez-Luna
Onur Mutlu
Wen‐mei Hwu
Izzat El Hajj
+ PDF Chat Open Relation Modeling: Learning to Define Relations between Entities 2022 Jie Huang
Kevin C. Chang
Jinjun Xiong
Wen‐mei Hwu
+ A Compiler Framework for Optimizing Dynamic Parallelism on GPUs 2022 Mhd Ghaith Olabi
Juan Gómez-Luna
Onur Mutlu
Wen‐mei Hwu
Izzat El Hajj
+ GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture 2022 Zaid Qureshi
Vikram Sharma Mailthody
Isaac Gelado
Seung Won Min
Amna Masood
Jeongmin Park
Jinjun Xiong
CJ Newburn
Dmitri Vainbrand
I‐Hsin Chung
+ Can Language Models Be Specific? How? 2022 Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ DEER: Descriptive Knowledge Graph for Explaining Entity Relationships 2022 Jie Huang
Kerui Zhu
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ Submission-Aware Reviewer Profiling for Reviewer Recommender System 2022 Omer Anjum
Alok Kamatar
Toby Liang
Jinjun Xiong
Wen‐mei Hwu
+ Parallelizing Maximal Clique Enumeration on GPUs 2022 Mohammad Almasri
Yen-Hsiang Chang
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat Understanding Jargon: Combining Extraction and Generation for Definition Modeling 2022 Jie Huang
Hanyin Shao
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat DEER: Descriptive Knowledge Graph for Explaining Entity Relationships 2022 Jie Huang
Kerui Zhu
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ Graph Neural Network Training with Data Tiering 2021 Seungwon Min
Kun Wu
Mert Hidayetoğlu
Jinjun Xiong
Xiang Song
Wen‐mei Hwu
+ MLHarness: A scalable benchmarking system for MLCommons 2021 Yen-Hsiang Chang
Jianhao Pu
Wen‐mei Hwu
Jinjun Xiong
+ PDF Chat Interpretable Visual Reasoning via Induced Symbolic Space 2021 Zhonghao Wang
Kai Wang
Mo Yu
Jinjun Xiong
Wen‐mei Hwu
Mark Hasegawa–Johnson
Humphrey Shi
+ PDF Chat Large graph convolutional network training with GPU-oriented data communication architecture 2021 Seung Won Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
+ PDF Chat Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection 2021 Jiachen Li
Bowen Cheng
Rogério Feris
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
+ K-Clique Counting on GPUs. 2021 Mohammad Almasri
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
+ FFT blitz 2021 Sultan Durrani
Muhammad Saad Chughtai
Abdul Dakkak
Wen‐mei Hwu
Lawrence Rauchwerger
+ PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses. 2021 Seungwon Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
+ Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19 2021 Vikram Sharma Mailthody
James Wei
Nicholas Chen
Mohammad Behnia
Ruihao Yao
Qihao Wang
Vedant Agrawal
Churan He
Lijian Wang
Leihao Chen
+ Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection 2021 Jiachen Li
Bowen Cheng
Rogério Feris
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
+ Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach 2021 Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19 2021 Vikram Sharma Mailthody
James Cheng‐Chung Wei
Nicholas Chen
Mohammad Behnia
Ruihao Yao
Qihao Wang
Vedant Agarwal
Churan He
Lijian Wang
Leihao Chen
+ MLHarness: A Scalable Benchmarking System for MLCommons 2021 Yen-Hsiang Chang
Jianhao Pu
Wen‐mei Hwu
Jinjun Xiong
+ Graph Neural Network Training with Data Tiering 2021 Seung Won Min
Kun Wu
Mert Hidayetoğlu
Jinjun Xiong
Xiang Song
Wen‐mei Hwu
+ Parallel K-Clique Counting on GPUs 2021 Mohammad Almasri
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
+ Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture 2021 Seung Won Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
+ Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19 2021 Vikram Sharma Mailthody
James Cheng‐Chung Wei
Nicholas Chen
Mohammad Behnia
Ruihao Yao
Qihao Wang
Vedant Agrawal
Churan He
Lijian Wang
Leihao Chen
+ PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses 2021 Seung Won Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
+ Open Relation Modeling: Learning to Define Relations between Entities 2021 Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ Understanding Jargon: Combining Extraction and Generation for Definition Modeling 2021 Jie Huang
Hanyin Shao
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
+ Fast CUDA-Aware MPI Datatypes without Platform Support 2020 Carl Pearson
Kun Wu
I‐Hsin Chung
Jinjun Xiong
Wen‐mei Hwu
+ Interpretable Visual Reasoning via Induced Symbolic Space 2020 Zhonghao Wang
Kai Wang
Mo Yu
Jinjun Xiong
Wen‐mei Hwu
Mark Hasegawa–Johnson
Humphrey Shi
+ PDF Chat Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes 2020 Mert Hidayetoğlu
Tekin Biçer
Simon Garcia de Gonzalo
Bin Ren
Vincent De Andrade
Doğa Gürsoy
Raj Kettimuthu
Ian Foster
Wen‐mei Hwu
+ PDF Chat The Design and Implementation of a Scalable Deep Learning Benchmarking Platform 2020 Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat At-Scale Sparse Deep Neural Network Inference With Efficient GPU Implementation 2020 Mert Hidayetoğlu
Carl Pearson
Vikram Sharma Mailthody
Eiman Ebrahimi
Jinjun Xiong
Rakesh Nagi
Wen‐mei Hwu
+ Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes 2020 Mert Hidayetoğlu
Tekin Biçer
Simon Garcia de Gonzalo
Bin Ren
Vincent De Andrade
Doğa Gürsoy
Raj Kettimuthu
Ian Foster
Wen‐mei Hwu
+ PDF Chat Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices 2020 Cong Hao
Yao Chen
Xiaofan Zhang
Yuhong Li
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
+ At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation 2020 Mert Hidayetoğlu
Carl Pearson
Vikram Sharma Mailthody
Eiman Ebrahimi
Jinjun Xiong
Rakesh Nagi
Wen‐mei Hwu
+ PDF Chat EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions 2020 Yuhong Li
Cong Hao
Xiaofan Zhang
Xinheng Liu
Yao Chen
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
+ PDF Chat Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation 2020 Zhonghao Wang
Mo Yu
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
+ PDF Chat Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation 2020 Zhonghao Wang
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
+ PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM 2020 Aayush Ankit
Izzat El Hajj
Sai Rahul Chalamalasetti
Sapan Agarwal
Matthew Marinella
Martin Foltín
John Paul Strachan
Dejan Milojičić
Wen‐mei Hwu
Kaushik Roy
+ PDF Chat Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs 2020 Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs 2020 Cheng Li
Abdul Dakkak
Jinjun Xiong
Wei Wei
Lingjie Xu
Wen‐mei Hwu
+ PDF Chat DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs 2020 Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
+ SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems 2020 Xiaofan Zhang
Haoming Lu
Cong Hao
Jiachen Li
Bowen Cheng
Yuhong Li
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
+ DLSpec: A Deep Learning Task Exchange Specification 2020 Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
+ Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation 2020 Zhonghao Wang
Mo Yu
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
+ Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation 2020 Zhonghao Wang
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
+ EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions 2020 Yuhong Li
Cong Hao
Xiaofan Zhang
Xinheng Liu
Yao Chen
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
+ EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs 2020 Seung Won Min
Vikram Sharma Mailthody
Zaid Qureshi
Jinjun Xiong
Eiman Ebrahimi
Wen‐mei Hwu
+ Tearing Down the Memory Wall 2020 Zaid Qureshi
Vikram Sharma Mailthody
Seungwon Min
I‐Hsin Chung
Jinjun Xiong
Wen‐mei Hwu
+ DNNExplorer: A Framework for Modeling and Exploring a Novel Paradigm of FPGA-based DNN Accelerator 2020 Xiaofan Zhang
Hanchen Ye
Junsong Wang
Yonghua Lin
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
+ Exploring Semantic Capacity of Terms 2020 Jie Huang
Zilong Wang
Kevin Chen–Chuan Chang
Wen‐mei Hwu
Jinjun Xiong
+ Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices 2020 Cong Hao
Yao Chen
Xiaofan Zhang
Yuhong Li
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
+ TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes 2020 Carl Pearson
Kun Wu
I‐Hsin Chung
Jinjun Xiong
Wen‐mei Hwu
+ At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation 2020 Mert Hidayetoğlu
Carl M. Pearson
Vikram Sharma Mailthody
Eiman Ebrahimi
Jinjun Xiong
Rakesh Nagi
Wen‐mei Hwu
+ MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale 2020 Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
+ PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM 2019 Aayush Ankit
Izzat El Hajj
Sai Rahul Chalamalasetti
Sapan Agarwal
Matthew Marinella
Martin Foltín
John Paul Strachan
Dejan Milojičić
Wen‐mei Hwu
Kaushik Roy
+ PDF Chat NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving 2019 Cong Hao
Yao Chen
Xinheng Liu
Atif Sarwari
Daryl Sew
Ashutosh Dhar
Bryan Wu
Dongdong Fu
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat SPGNet: Semantic Prediction Guidance for Scene Parsing 2019 Bowen Cheng
Liang-Chieh Chen
Yunchao Wei
Yukun Zhu
Zilong Huang
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
+ PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space 2019 Omer Anjum
Hongyu Gong
Suma Bhat
Wen‐mei Hwu
Jinjun Xiong
+ MLModelScope: A Distributed Platform for ML Model Evaluation and Benchmarking at Scale 2019 Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device 2019 Seung Won Min
Sitao Huang
Mohamed El-Hadedy
Jinjun Xiong
Deming Chen
Wen‐mei Hwu
+ Across-Stack Profiling and Characterization of Machine Learning Models on GPUs. 2019 Cheng Li
Abdul Dakkak
Jinjun Xiong
Wei Wei
Lingjie Xu
Wen‐mei Hwu
+ Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device 2019 Seung Won Min
Sitao Huang
Mohamed El-Hadedy
Jinjun Xiong
Deming Chen
Wen‐mei Hwu
+ PDF Chat TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function-as-a-Service 2019 Abdul Dakkak
Cheng Li
Simon Garcia de Gonzalo
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat Accelerating reduction and scan using tensor core units 2019 Abdul Dakkak
Cheng Li
Jinjun Xiong
Isaac Gelado
Wen‐mei Hwu
+ Challenges and Pitfalls of Reproducing Machine Learning Artifacts. 2019 Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
+ PDF Chat Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus 2019 Hongyu Gong
Suma Bhat
Lingfei Wu
Jinjun Xiong
Wen‐mei Hwu
+ FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge 2019 Cong Hao
Xiaofan Zhang
Yuhong Li
Sitao Huang
Jinjun Xiong
Kyle Rupnow
Wen‐mei Hwu
Deming Chen
+ A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices 2019 Xiaofan Zhang
Cong Hao
Yuhong Li
Yao Chen
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
+ Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus 2019 Hongyu Gong
Suma Bhat
Lingfei Wu
Jinjun Xiong
Wen‐mei Hwu
+ A Retrospective Recount of Computer Architecture Research with a Data-Driven Study of Over Four Decades of ISCA Publications 2019 Omer Anjum
Wen‐mei Hwu
Jinjun Xiong
+ SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection 2019 Xiaofan Zhang
Cong Hao
Haoming Lu
Jiachen Li
Yuhong Li
Yuchen Fan
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
+ SPGNet: Semantic Prediction Guidance for Scene Parsing 2019 Bowen Cheng
Liang-Chieh Chen
Yunchao Wei
Yukun Zhu
Zilong Huang
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
+ SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems 2019 Xiaofan Zhang
Haoming Lu
Cong Hao
Jiachen Li
Bowen Cheng
Yuhong Li
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
+ NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving 2019 Cong Hao
Yao Chen
Xinheng Liu
Atif Sarwari
Daryl Sew
Ashutosh Sutra Dhar
Bryan Wu
Dongdong Fu
Jinjun Xiong
Wen‐mei Hwu
+ PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference 2019 Aayush Ankit
Izzat El Hajj
Sai Rahul Chalamalasetti
Geoffrey Ndu
Martin Foltín
R. Stanley Williams
Paolo Faraboschi
Wen‐mei Hwu
John Paul Strachan
Kaushik Roy
+ MLModelScope: Evaluate and Measure ML Models within AI Pipelines. 2018 Abdul Dakkak
Cheng Li
Abhishek Srivastava
Jinjun Xiong
Wen‐mei Hwu
+ Frustrated with replicating claims of a shared model? a solution 2018 Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
+ SCOPE: C3SR Systems Characterization and Benchmarking Framework. 2018 Carl M. Pearson
Abdul Dakkak
Cheng Li
Sarah Hashash
Jinjun Xiong
Wen‐mei Hwu
+ A Fast and Massively-Parallel Inverse Solver for Multiple-Scattering Tomographic Image Reconstruction 2018 Mert Hidayetoğlu
Carl Pearson
Izzat El Hajj
Levent Gürel
Weng Cho Chew
Wen‐mei Hwu
+ Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts 2018 Raymond A. Yeh
Jinjun Xiong
Wen‐mei Hwu
Minh N. Do
Alexander G. Schwing
+ Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection 2018 Bowen Cheng
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
+ Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts 2018 Raymond A. Yeh
Jinjun Xiong
Wen‐mei Hwu
Minh N. Do
Alexander G. Schwing
+ A Simple Non-i.i.d. Sampling Approach for Efficient Training and Better Generalization 2018 Bowen Cheng
Yunchao Wei
Jiahui Yu
Shiyu Chang
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
+ Frustrated with Replicating Claims of a Shared Model? A Solution 2018 Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
+ SCOPE: C3SR Systems Characterization and Benchmarking Framework 2018 Carl M. Pearson
Abdul Dakkak
Cheng Li
Sarah Hashash
Jinjun Xiong
Wen‐mei Hwu
+ Scalable parallel DBIM solutions of inverse-scattering problems 2017 Mert Hidayetoğlu
Carl Pearson
Levent Gürel
Wen‐mei Hwu
Weng Cho Chew
+ Parallel patterns: sparse matrix computation 2017 David B. Kirk
Wen‐mei Hwu
Commonly Cited References
Action Title Year Authors # of times referenced
+ PDF Chat Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
23
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
13
+ PDF Chat Going deeper with convolutions 2015 Christian Szegedy
Wei Liu
Yangqing Jia
Pierre Sermanet
Scott Reed
Dragomir Anguelov
Dumitru Erhan
Vincent Vanhoucke
Andrew Rabinovich
10
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
9
+ In-Datacenter Performance Analysis of a Tensor Processing Unit 2017 Norman P. Jouppi
Cliff Young
Nishant Patil
David A. Patterson
Gaurav Agrawal
Raminder Bajwa
S. C. Bates
Suresh Bhatia
Nan Boden
Al Borchers
8
+ PDF Chat Feature Pyramid Networks for Object Detection 2017 Tsung-Yi Lin
Piotr Dollár
Ross Girshick
Kaiming He
Bharath Hariharan
Serge Belongie
7
+ PDF Chat Rethinking the Inception Architecture for Computer Vision 2016 Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jon Shlens
Zbigniew Wojna
7
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
7
+ PDF Chat MobileNetV2: Inverted Residuals and Linear Bottlenecks 2018 Mark Sandler
Andrew Howard
Menglong Zhu
Andrey Zhmoginov
Liang-Chieh Chen
6
+ PDF Chat In-Datacenter Performance Analysis of a Tensor Processing Unit 2017 Norman P. Jouppi
Cliff Young
Nishant Patil
David A. Patterson
Gaurav Agrawal
Raminder Bajwa
S. C. Bates
Suresh Bhatia
Nan Boden
Al Borchers
6
+ MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 2017 Andrew Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
Marco Andreetto
Hartwig Adam
6
+ PDF Chat You Only Look Once: Unified, Real-Time Object Detection 2016 Joseph Redmon
Santosh Divvala
Ross Girshick
Ali Farhadi
6
+ PDF Chat Pyramid Scene Parsing Network 2017 Hengshuang Zhao
Jianping Shi
Xiaojuan Qi
Xiaogang Wang
Jiaya Jia
6
+ Regularized Evolution for Image Classifier Architecture Search 2019 Esteban Real
Alok Aggarwal
Yanping Huang
Quoc V. Le
6
+ SSD: Single Shot MultiBox Detector 2016 Wei Liu
Dragomir Anguelov
Dumitru Erhan
Christian Szegedy
Scott Reed
Cheng-Yang Fu
Alexander C. Berg
5
+ PDF Chat Aggregated Residual Transformations for Deep Neural Networks 2017 Saining Xie
Ross Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
5
+ PDF Chat Design Flow of Accelerating Hybrid Extremely Low Bit-Width Neural Network in Embedded FPGA 2018 Junsong Wang
Qiuwen Lou
Xiaofan Zhang
Chao Zhu
Yonghua Lin
Deming Chen
5
+ ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware 2018 Han Cai
Ligeng Zhu
Song Han
5
+ Rethinking Atrous Convolution for Semantic Image Segmentation 2017 Liang-Chieh Chen
George Papandreou
Florian Schroff
Hartwig Adam
5
+ SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size 2016 Forrest Iandola
Song Han
Matthew W. Moskewicz
Khalid Ashraf
William J. Dally
Kurt Keutzer
5
+ PDF Chat Fully convolutional networks for semantic segmentation 2015 Jonathan Long
Evan Shelhamer
Trevor Darrell
5
+ PDF Chat Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation 2014 Ross Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
5
+ PDF Chat The Cityscapes Dataset for Semantic Urban Scene Understanding 2016 Marius Cordts
Mohamed Omran
Sebastian Ramos
Timo Rehfeld
Markus Enzweiler
Rodrigo Benenson
Uwe Franke
Stefan Roth
Bernt Schiele
5
+ PDF Chat Densely Connected Convolutional Networks 2017 Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
5
+ PDF Chat DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs 2017 Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Murphy
Alan Yuille
5
+ PDF Chat DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs 2020 Da Zheng
Chao Ma
Minjie Wang
Jinjing Zhou
Qidong Su
Xiang Song
Quan Gan
Zheng Zhang
George Karypis
4
+ BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2018 Jacob Devlin
Ming‐Wei Chang
Kenton Lee
Kristina Toutanova
4
+ SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems 2020 Xiaofan Zhang
Haoming Lu
Cong Hao
Jiachen Li
Bowen Cheng
Yuhong Li
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
4
+ Language Models are Few-Shot Learners 2020 T. B. Brown
Benjamin F. Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
4
+ Adam: A Method for Stochastic Optimization 2014 Diederik P. Kingma
Jimmy Ba
4
+ Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs 2014 Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Murphy
Alan Yuille
4
+ Semi-Supervised Classification with Graph Convolutional Networks 2016 Thomas Kipf
Max Welling
4
+ R-FCN: Object Detection via Region-based Fully Convolutional Networks 2016 Jifeng Dai
Yi Li
Kaiming He
Jian Sun
4
+ PDF Chat ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design 2018 Ningning Ma
Xiangyu Zhang
Hai-Tao Zheng
Jian Sun
4
+ PDF Chat Learning Transferable Architectures for Scalable Image Recognition 2018 Barret Zoph
Vijay Vasudevan
Jonathon Shlens
Quoc V. Le
4
+ Distributed Representations of Words and Phrases and their Compositionality 2013 Tomáš Mikolov
Ilya Sutskever
Kai Chen
Greg S. Corrado
Jeffrey Dean
4
+ Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour 2017 Priya Goyal
Piotr Dollár
Ross Girshick
Pieter Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
4
+ PDF Chat Identity Mappings in Deep Residual Networks 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
4
+ PDF Chat Focal Loss for Dense Object Detection 2017 Tsung-Yi Lin
Priya Goyal
Ross Girshick
Kaiming He
Piotr Dollár
4
+ MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems 2015 Tianqi Chen
Mu Li
Yutian Li
Min Lin
Naiyan Wang
Minjie Wang
Tianjun Xiao
Bing Xu
Chiyuan Zhang
Zheng Zhang
4
+ OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks 2014 Pierre Sermanet
David Eigen
Xiang Zhang
Michaël Mathieu
Rob Fergus
Yann LeCun
4
+ Automated Phrase Mining from Massive Text Corpora 2018 Jingbo Shang
Jialu Liu
Meng Jiang
Xiang Ren
Clare R. Voss
Jiawei Han
4
+ Frustrated with replicating claims of a shared model? a solution 2018 Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
4
+ PDF Chat Revisiting RCNN: On Awakening the Classification Power of Faster RCNN 2018 Bowen Cheng
Yunchao Wei
Humphrey Shi
Rogério Feris
Jinjun Xiong
Thomas S. Huang
4
+ PDF Chat YOLO9000: Better, Faster, Stronger 2017 Joseph Redmon
Ali Farhadi
4
+ PDF Chat Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields 2017 Zhe Cao
Tomas Simon
Shih-En Wei
Yaser Sheikh
4
+ Efficient Estimation of Word Representations in Vector Space 2013 Tomáš Mikolov
Kai Chen
Greg S. Corrado
Jeffrey Dean
4
+ PDF Chat MnasNet: Platform-Aware Neural Architecture Search for Mobile 2019 Mingxing Tan
Bo Chen
Ruoming Pang
Vijay Vasudevan
Mark Sandler
Andrew Howard
Quoc V. Le
4
+ Mixed Precision Training 2017 Paulius Micikevicius
Sharan Narang
Jonah Alben
Gregory Diamos
Erich Elsen
David García
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
4
+ PDF Chat GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild 2019 Lianghua Huang
Xin Zhao
Kaiqi Huang
3