+
|
TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading
|
2024
|
Kai Wu
Jeongmin Park
Xiaofan Zhang
Mert Hidayetoğlu
Vikram Sharma Mailthody
Sitao Huang
Steven S. Lumetta
Wen‐mei Hwu
|
+
|
HiCCL: A Hierarchical Collective Communication Library
|
2024
|
Mert Hidayetoğlu
Simon Garcia de Gonzalo
Elliott Slaughter
Pinku Surana
Wen‐mei Hwu
William Gropp
Alex Aiken
|
+
|
LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme
|
2024
|
Jeongmin Park
Kai Wu
Vikram Sharma Mailthody
Zaid Qureshi
Scott Mahlke
Wen‐mei Hwu
|
+
|
Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures
|
2024
|
Kun Wu
Mert Hidayetoğlu
Xiang Song
Sitao Huang
Da Zheng
Israt Nisa
Wen‐mei Hwu
|
+
|
Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses
|
2024
|
Jeongmin Park
Vikram Sharma Mailthody
Zaid Qureshi
Wen‐mei Hwu
|
+
|
Parallelizing Maximal Clique Enumeration on GPUs
|
2023
|
Mohammad Almasri
Yen-Hsiang Chang
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
|
+
|
RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design
|
2023
|
Benjamin Reidys
Yuqi Xue
Daixuan Li
Bharat Sukhwani
Wen‐mei Hwu
Deming Chen
Sameh Asaad
Jian Huang
|
+
|
IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research
|
2023
|
Arpandeep Khatua
Vikram Sharma Mailthody
Bhagyashree Taleka
Tengfei Ma
Xiang Song
Wen‐mei Hwu
|
+
|
GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture
|
2023
|
Zaid Qureshi
Vikram Sharma Mailthody
Isaac Gelado
Seungwon Min
Amna Masood
Jeongmin Park
Jinjun Xiong
Chris J. Newburn
Dmitri Vainbrand
I‐Hsin Chung
|
+
|
PIGEON: Optimizing CUDA Code Generator for End-to-End Training and Inference of Relational Graph Neural Networks
|
2023
|
Kun Wu
Mert Hidayetoğlu
Xiang Song
Sitao Huang
Da Zheng
Israt Nisa
Wen‐mei Hwu
|
+
|
Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses
|
2023
|
Jeongmin Park
Vikram Sharma Mailthody
Zaid Qureshi
Wen‐mei Hwu
|
+
|
CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs
|
2023
|
Jeongmin Park
Zaid Qureshi
Vikram Sharma Mailthody
Andrew Gacek
S. Shao
Mohammad Almasri
Isaac Gelado
Jinjun Xiong
Chris J. Newburn
I‐Hsin Chung
|
+
|
Can Language Models Be Specific? How?
|
2023
|
Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Parallel K-clique counting on GPUs
|
2022
|
Mohammad Almasri
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
|
+
|
DKG: A Descriptive Knowledge Graph for Explaining Relationships between Entities
|
2022
|
Jie Huang
Kerui Zhu
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
A Compiler Framework for Optimizing Dynamic Parallelism on GPUs
|
2022
|
Mhd Ghaith Olabi
Juan Gómez-Luna
Onur Mutlu
Wen‐mei Hwu
Izzat El Hajj
|
+
|
Open Relation Modeling: Learning to Define Relations between Entities
|
2022
|
Jie Huang
Kevin C. Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
A Compiler Framework for Optimizing Dynamic Parallelism on GPUs
|
2022
|
Mhd Ghaith Olabi
Juan Gómez-Luna
Onur Mutlu
Wen‐mei Hwu
Izzat El Hajj
|
+
|
GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture
|
2022
|
Zaid Qureshi
Vikram Sharma Mailthody
Isaac Gelado
Seung Won Min
Amna Masood
Jeongmin Park
Jinjun Xiong
CJ Newburn
Dmitri Vainbrand
I‐Hsin Chung
|
+
|
Can Language Models Be Specific? How?
|
2022
|
Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
DEER: Descriptive Knowledge Graph for Explaining Entity Relationships
|
2022
|
Jie Huang
Kerui Zhu
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Submission-Aware Reviewer Profiling for Reviewer Recommender System
|
2022
|
Omer Anjum
Alok Kamatar
Toby Liang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Parallelizing Maximal Clique Enumeration on GPUs
|
2022
|
Mohammad Almasri
Yen-Hsiang Chang
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Understanding Jargon: Combining Extraction and Generation for Definition Modeling
|
2022
|
Jie Huang
Hanyin Shao
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
DEER: Descriptive Knowledge Graph for Explaining Entity Relationships
|
2022
|
Jie Huang
Kerui Zhu
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Graph Neural Network Training with Data Tiering
|
2021
|
Seungwon Min
Kun Wu
Mert Hidayetoğlu
Jinjun Xiong
Xiang Song
Wen‐mei Hwu
|
+
|
MLHarness: A scalable benchmarking system for MLCommons
|
2021
|
Yen-Hsiang Chang
Jianhao Pu
Wen‐mei Hwu
Jinjun Xiong
|
+
|
Interpretable Visual Reasoning via Induced Symbolic Space
|
2021
|
Zhonghao Wang
Kai Wang
Mo Yu
Jinjun Xiong
Wen‐mei Hwu
Mark Hasegawa–Johnson
Humphrey Shi
|
+
|
Large graph convolutional network training with GPU-oriented data communication architecture
|
2021
|
Seung Won Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
|
+
|
Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection
|
2021
|
Jiachen Li
Bowen Cheng
Rogério Feris
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
|
+
|
K-Clique Counting on GPUs
|
2021
|
Mohammad Almasri
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
|
+
|
FFT blitz
|
2021
|
Sultan Durrani
Muhammad Saad Chughtai
Abdul Dakkak
Wen‐mei Hwu
Lawrence Rauchwerger
|
+
|
PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses
|
2021
|
Seungwon Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
|
+
|
Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19
|
2021
|
Vikram Sharma Mailthody
James Wei
Nicholas Chen
Mohammad Behnia
Ruihao Yao
Qihao Wang
Vedant Agrawal
Churan He
Lijian Wang
Leihao Chen
|
+
|
Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection
|
2021
|
Jiachen Li
Bowen Cheng
Rogério Feris
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
|
+
|
Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach
|
2021
|
Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19
|
2021
|
Vikram Sharma Mailthody
James Cheng‐Chung Wei
Nicholas Chen
Mohammad Behnia
Ruihao Yao
Qihao Wang
Vedant Agrawal
Churan He
Lijian Wang
Leihao Chen
|
+
|
MLHarness: A Scalable Benchmarking System for MLCommons
|
2021
|
Yen-Hsiang Chang
Jianhao Pu
Wen‐mei Hwu
Jinjun Xiong
|
+
|
Graph Neural Network Training with Data Tiering
|
2021
|
Seung Won Min
Kun Wu
Mert Hidayetoğlu
Jinjun Xiong
Xiang Song
Wen‐mei Hwu
|
+
|
Parallel K-Clique Counting on GPUs
|
2021
|
Mohammad Almasri
Izzat El Hajj
Rakesh Nagi
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
|
2021
|
Seung Won Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
|
+
|
Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19
|
2021
|
Vikram Sharma Mailthody
James Cheng‐Chung Wei
Nicholas Chen
Mohammad Behnia
Ruihao Yao
Qihao Wang
Vedant Agrawal
Churan He
Lijian Wang
Leihao Chen
|
+
|
PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses
|
2021
|
Seung Won Min
Kun Wu
Sitao Huang
Mert Hidayetoğlu
Jinjun Xiong
Eiman Ebrahimi
Deming Chen
Wen‐mei Hwu
|
+
|
Open Relation Modeling: Learning to Define Relations between Entities
|
2021
|
Jie Huang
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Understanding Jargon: Combining Extraction and Generation for Definition Modeling
|
2021
|
Jie Huang
Hanyin Shao
Kevin Chen–Chuan Chang
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Fast CUDA-Aware MPI Datatypes without Platform Support
|
2020
|
Carl Pearson
Kun Wu
I‐Hsin Chung
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Interpretable Visual Reasoning via Induced Symbolic Space
|
2020
|
Zhonghao Wang
Kai Wang
Mo Yu
Jinjun Xiong
Wen‐mei Hwu
Mark Hasegawa–Johnson
Humphrey Shi
|
+
|
Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes
|
2020
|
Mert Hidayetoğlu
Tekin Biçer
Simon Garcia de Gonzalo
Bin Ren
Vincent De Andrade
Doğa Gürsoy
Raj Kettimuthu
Ian Foster
Wen‐mei Hwu
|
+
|
The Design and Implementation of a Scalable Deep Learning Benchmarking Platform
|
2020
|
Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
|
+
|
At-Scale Sparse Deep Neural Network Inference With Efficient GPU Implementation
|
2020
|
Mert Hidayetoğlu
Carl Pearson
Vikram Sharma Mailthody
Eiman Ebrahimi
Jinjun Xiong
Rakesh Nagi
Wen‐mei Hwu
|
+
|
Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes
|
2020
|
Mert Hidayetoğlu
Tekin Biçer
Simon Garcia de Gonzalo
Bin Ren
Vincent De Andrade
Doğa Gürsoy
Raj Kettimuthu
Ian Foster
Wen‐mei Hwu
|
+
|
Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices
|
2020
|
Cong Hao
Yao Chen
Xiaofan Zhang
Yuhong Li
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
|
+
|
At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation
|
2020
|
Mert Hidayetoğlu
Carl Pearson
Vikram Sharma Mailthody
Eiman Ebrahimi
Jinjun Xiong
Rakesh Nagi
Wen‐mei Hwu
|
+
|
EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions
|
2020
|
Yuhong Li
Cong Hao
Xiaofan Zhang
Xinheng Liu
Yao Chen
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
|
+
|
Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
|
2020
|
Zhonghao Wang
Mo Yu
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
|
+
|
Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation
|
2020
|
Zhonghao Wang
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
|
+
|
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM
|
2020
|
Aayush Ankit
Izzat El Hajj
Sai Rahul Chalamalasetti
Sapan Agarwal
Matthew Marinella
Martin Foltín
John Paul Strachan
Dejan Milojičić
Wen‐mei Hwu
Kaushik Roy
|
+
|
Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs
|
2020
|
Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
|
+
|
XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs
|
2020
|
Cheng Li
Abdul Dakkak
Jinjun Xiong
Wei Wei
Lingjie Xu
Wen‐mei Hwu
|
+
|
DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs
|
2020
|
Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
|
+
|
SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems
|
2020
|
Xiaofan Zhang
Haoming Lu
Cong Hao
Jiachen Li
Bowen Cheng
Yuhong Li
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
|
+
|
DLSpec: A Deep Learning Task Exchange Specification
|
2020
|
Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
|
2020
|
Zhonghao Wang
Mo Yu
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
|
+
|
Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation
|
2020
|
Zhonghao Wang
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
|
+
|
EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions
|
2020
|
Yuhong Li
Cong Hao
Xiaofan Zhang
Xinheng Liu
Yao Chen
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
|
+
|
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs
|
2020
|
Seung Won Min
Vikram Sharma Mailthody
Zaid Qureshi
Jinjun Xiong
Eiman Ebrahimi
Wen‐mei Hwu
|
+
|
Tearing Down the Memory Wall
|
2020
|
Zaid Qureshi
Vikram Sharma Mailthody
Seungwon Min
I‐Hsin Chung
Jinjun Xiong
Wen‐mei Hwu
|
+
|
DNNExplorer: A Framework for Modeling and Exploring a Novel Paradigm of FPGA-based DNN Accelerator
|
2020
|
Xiaofan Zhang
Hanchen Ye
Junsong Wang
Yonghua Lin
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
|
+
|
Exploring Semantic Capacity of Terms
|
2020
|
Jie Huang
Zilong Wang
Kevin Chen–Chuan Chang
Wen‐mei Hwu
Jinjun Xiong
|
+
|
Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices
|
2020
|
Cong Hao
Yao Chen
Xiaofan Zhang
Yuhong Li
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
|
+
|
TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes
|
2020
|
Carl Pearson
Kun Wu
I‐Hsin Chung
Jinjun Xiong
Wen‐mei Hwu
|
+
|
At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation
|
2020
|
Mert Hidayetoğlu
Carl M. Pearson
Vikram Sharma Mailthody
Eiman Ebrahimi
Jinjun Xiong
Rakesh Nagi
Wen‐mei Hwu
|
+
|
MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale
|
2020
|
Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
|
+
|
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM
|
2019
|
Aayush Ankit
Izzat El Hajj
Sai Rahul Chalamalasetti
Sapan Agarwal
Matthew Marinella
Martin Foltín
John Paul Strachan
Dejan Milojičić
Wen‐mei Hwu
Kaushik Roy
|
+
|
NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving
|
2019
|
Cong Hao
Yao Chen
Xinheng Liu
Atif Sarwari
Daryl Sew
Ashutosh Dhar
Bryan Wu
Dongdong Fu
Jinjun Xiong
Wen‐mei Hwu
|
+
|
SPGNet: Semantic Prediction Guidance for Scene Parsing
|
2019
|
Bowen Cheng
Liang-Chieh Chen
Yunchao Wei
Yukun Zhu
Zilong Huang
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
|
+
|
PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space
|
2019
|
Omer Anjum
Hongyu Gong
Suma Bhat
Wen‐mei Hwu
Jinjun Xiong
|
+
|
MLModelScope: A Distributed Platform for ML Model Evaluation and Benchmarking at Scale
|
2019
|
Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device
|
2019
|
Seung Won Min
Sitao Huang
Mohamed El-Hadedy
Jinjun Xiong
Deming Chen
Wen‐mei Hwu
|
+
|
Across-Stack Profiling and Characterization of Machine Learning Models on GPUs
|
2019
|
Cheng Li
Abdul Dakkak
Jinjun Xiong
Wei Wei
Lingjie Xu
Wen‐mei Hwu
|
+
|
Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device
|
2019
|
Seung Won Min
Sitao Huang
Mohamed El-Hadedy
Jinjun Xiong
Deming Chen
Wen‐mei Hwu
|
+
|
TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function-as-a-Service
|
2019
|
Abdul Dakkak
Cheng Li
Simon Garcia de Gonzalo
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Accelerating reduction and scan using tensor core units
|
2019
|
Abdul Dakkak
Cheng Li
Jinjun Xiong
Isaac Gelado
Wen‐mei Hwu
|
+
|
Challenges and Pitfalls of Reproducing Machine Learning Artifacts
|
2019
|
Cheng Li
Abdul Dakkak
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus
|
2019
|
Hongyu Gong
Suma Bhat
Lingfei Wu
Jinjun Xiong
Wen‐mei Hwu
|
+
|
FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge
|
2019
|
Cong Hao
Xiaofan Zhang
Yuhong Li
Sitao Huang
Jinjun Xiong
Kyle Rupnow
Wen‐mei Hwu
Deming Chen
|
+
|
A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices
|
2019
|
Xiaofan Zhang
Cong Hao
Yuhong Li
Yao Chen
Jinjun Xiong
Wen‐mei Hwu
Deming Chen
|
+
|
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus
|
2019
|
Hongyu Gong
Suma Bhat
Lingfei Wu
Jinjun Xiong
Wen‐mei Hwu
|
+
|
A Retrospective Recount of Computer Architecture Research with a Data-Driven Study of Over Four Decades of ISCA Publications
|
2019
|
Omer Anjum
Wen‐mei Hwu
Jinjun Xiong
|
+
|
SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection
|
2019
|
Xiaofan Zhang
Cong Hao
Haoming Lu
Jiachen Li
Yuhong Li
Yuchen Fan
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
|
+
|
SPGNet: Semantic Prediction Guidance for Scene Parsing
|
2019
|
Bowen Cheng
Liang-Chieh Chen
Yunchao Wei
Yukun Zhu
Zilong Huang
Jinjun Xiong
Thomas S. Huang
Wen‐mei Hwu
Humphrey Shi
|
+
|
SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems
|
2019
|
Xiaofan Zhang
Haoming Lu
Cong Hao
Jiachen Li
Bowen Cheng
Yuhong Li
Kyle Rupnow
Jinjun Xiong
Thomas S. Huang
Humphrey Shi
|
+
|
NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving
|
2019
|
Cong Hao
Yao Chen
Xinheng Liu
Atif Sarwari
Daryl Sew
Ashutosh Sutra Dhar
Bryan Wu
Dongdong Fu
Jinjun Xiong
Wen‐mei Hwu
|
+
|
PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference
|
2019
|
Aayush Ankit
Izzat El Hajj
Sai Rahul Chalamalasetti
Geoffrey Ndu
Martin Foltín
R. Stanley Williams
Paolo Faraboschi
Wen‐mei Hwu
John Paul Strachan
Kaushik Roy
|
+
|
MLModelScope: Evaluate and Measure ML Models within AI Pipelines
|
2018
|
Abdul Dakkak
Cheng Li
Abhishek Srivastava
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Frustrated with Replicating Claims of a Shared Model? A Solution
|
2018
|
Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
|
+
|
SCOPE: C3SR Systems Characterization and Benchmarking Framework
|
2018
|
Carl M. Pearson
Abdul Dakkak
Cheng Li
Sarah Hashash
Jinjun Xiong
Wen‐mei Hwu
|
+
|
A Fast and Massively-Parallel Inverse Solver for Multiple-Scattering Tomographic Image Reconstruction
|
2018
|
Mert Hidayetoğlu
Carl Pearson
Izzat El Hajj
Levent Gürel
Weng Cho Chew
Wen‐mei Hwu
|
+
|
Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts
|
2018
|
Raymond A. Yeh
Jinjun Xiong
Wen‐mei Hwu
N. Minh
Alexander G. Schwing
|
+
|
Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection
|
2018
|
Bowen Cheng
Yunchao Wei
Rogério Feris
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
|
+
|
Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts
|
2018
|
Raymond A. Yeh
Jinjun Xiong
Wen‐mei Hwu
Nguyen Q. Minh
Alexander G. Schwing
|
+
|
A Simple Non-i.i.d. Sampling Approach for Efficient Training and Better Generalization
|
2018
|
Bowen Cheng
Yunchao Wei
Jiahui Yu
Shiyu Chang
Jinjun Xiong
Wen‐mei Hwu
Thomas S. Huang
Humphrey Shi
|
+
|
Frustrated with Replicating Claims of a Shared Model? A Solution
|
2018
|
Abdul Dakkak
Cheng Li
Jinjun Xiong
Wen‐mei Hwu
|
+
|
SCOPE: C3SR Systems Characterization and Benchmarking Framework
|
2018
|
Carl M. Pearson
Abdul Dakkak
Cheng Li
Sarah Hashash
Jinjun Xiong
Wen‐mei Hwu
|
+
|
Scalable parallel DBIM solutions of inverse-scattering problems
|
2017
|
Mert Hidayetoğlu
Carl Pearson
Levent Gürel
Wen‐mei Hwu
Weng Cho Chew
|
+
|
Parallel patterns: sparse matrix computation
|
2017
|
David B. Kirk
Wen‐mei Hwu
|