+
|
PRESERVE: Prefetching Model Weights and KV-Cache in Distributed LLM Serving
|
2025
|
Ahmet Caner Yüzügüler
Jiawei Zhuang
Lukas Cavigelli
|
+
|
AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality
|
2024
|
Ilias Bournias
Lukas Cavigelli
Georgios Zacharopoulos
|
+
|
SSSD: Simply-Scalable Speculative Decoding
|
2024
|
Michele Marzollo
Jiawei Zhuang
Niklas Roemer
Lorenz K. Müller
Lukas Cavigelli
|
+
|
Ascend-CC: Confidential Computing on Heterogeneous NPU for Emerging Generative AI Workloads
|
2024
|
Aritra Dhar
Clément Thorens
Lara Magdalena Lazier
Lukas Cavigelli
|
+
|
On-Device Domain Learning for Keyword Spotting on Low-Power Extreme Edge Embedded Systems
|
2024
|
Cristian Cioflan
Lukas Cavigelli
Manuele Rusci
Miguel de Prado
Luca Benini
|
+
|
Ara2: Exploring Single- and Multi-Core Vector Processing With an Efficient RVV 1.0 Compliant Open-Source Processor
|
2024
|
Matteo Perotti
Matheus Cavalcante
Renzo Andri
Lukas Cavigelli
Luca Benini
|
+
|
Boosting keyword spotting through on-device learnable user speech characteristics
|
2024
|
Cristian Cioflan
Lukas Cavigelli
Luca Benini
|
+
|
Flex-SFU: Accelerating DNN Activation Functions by Non-Uniform Piecewise Approximation
|
2023
|
Enrico Reggiani
Renzo Andri
Lukas Cavigelli
|
+
|
ReDSEa: Automated Acceleration of Triangular Solver on Supercloud Heterogeneous Systems
|
2023
|
Georgios Zacharopoulos
Ilias Bournias
Verner Vlacic
Lukas Cavigelli
|
+
|
RL-based Stateful Neural Adaptive Sampling and Denoising for Real-Time Path Tracing
|
2023
|
Antoine Scardigli
Lukas Cavigelli
Lorenz K. Müller
|
+
|
Ara2: Exploring Single- and Multi-Core Vector Processing with an Efficient RVV1.0 Compliant Open-Source Processor
|
2023
|
Matteo Perotti
Matheus Cavalcante
Renzo Andri
Lukas Cavigelli
Luca Benini
|
+
|
Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication
|
2023
|
Jannis Schönleber
Lukas Cavigelli
Renzo Andri
Matteo Perotti
Luca Benini
|
+
|
A “New Ara” for Vector Computing: An Open Source Highly Efficient RISC-V V 1.0 Vector Processor Design
|
2022
|
Matteo Perotti
Matheus Cavalcante
Nils Wistoff
Renzo Andri
Lukas Cavigelli
Luca Benini
|
+
|
Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks
|
2022
|
Gianmarco Cerutti
Lukas Cavigelli
Renzo Andri
Michele Magno
Elisabetta Farella
Luca Benini
|
+
|
Going Further With Winograd Convolutions: Tap-Wise Quantization for Efficient Inference on 4x4 Tile
|
2022
|
Renzo Andri
Beatrice Bussolino
Antonio Cipolletta
Lukas Cavigelli
Zhe Wang
|
+
|
Vau Da Muntanialas: Energy-Efficient Multi-Die Scalable Acceleration of RNN Inference
|
2021
|
Gianna Paulin
Francesco Conti
Lukas Cavigelli
Luca Benini
|
+
|
ECG-TCN: Wearable Cardiac Arrhythmia Detection with a Temporal Convolutional Network
|
2021
|
Thorir Mar Ingolfsson
Xiaying Wang
Michael Hersche
Alessio Burrello
Lukas Cavigelli
Luca Benini
|
+
|
ChewBaccaNN: A Flexible 223 TOPS/W BNN Accelerator
|
2021
|
Renzo Andri
Geethan Karunaratne
Lukas Cavigelli
Luca Benini
|
+
|
Mixed-Precision Quantization and Parallel Implementation of Multispectral Riemannian Classification for Brain-Machine Interfaces
|
2021
|
Xiaying Wang
Tibor Schneider
Michael Hersche
Lukas Cavigelli
Luca Benini
|
+
|
CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration With Better-Than-Binary Energy Efficiency
|
2021
|
Moritz Scherer
Georg Rutishauser
Lukas Cavigelli
Luca Benini
|
+
|
Reinforcement Learning for Scalable Logic Optimization with Graph Neural Networks
|
2021
|
Xavier Timoneda
Lukas Cavigelli
|
+
|
Sub-100uW Multispectral Riemannian Classification for EEG-based Brain–Machine Interfaces
|
2021
|
Xiaying Wang
Lukas Cavigelli
Tibor Schneider
Luca Benini
|
+
|
CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency
|
2020
|
Moritz Scherer
Georg Rutishauser
Lukas Cavigelli
Luca Benini
|
+
|
EEG-TCNet: An Accurate Temporal Convolutional Network for Embedded Motor-Imagery Brain–Machine Interfaces
|
2020
|
Thorir Mar Ingolfsson
Michael Hersche
Xiaying Wang
Nobuaki Kobayashi
Lukas Cavigelli
Luca Benini
|
+
|
Q-EEGNet: an Energy-Efficient 8-bit Quantized Parallel EEGNet Implementation for Edge Motor-Imagery Brain-Machine Interfaces
|
2020
|
Tibor Schneider
Xiaying Wang
Michael Hersche
Lukas Cavigelli
Luca Benini
|
+
|
Sound event detection with binary neural networks on tightly power-constrained IoT devices
|
2020
|
Gianmarco Cerutti
Renzo Andri
Lukas Cavigelli
Elisabetta Farella
Michele Magno
Luca Benini
|
+
|
InfiniWolf: Energy Efficient Smart Bracelet for Edge Computing with Dual Source Energy Harvesting
|
2020
|
Michele Magno
Xiaying Wang
Manuel Eggimann
Lukas Cavigelli
Luca Benini
|
+
|
HR-SAR-Net: A Deep Neural Network for Urban Scene Segmentation from High-Resolution SAR Data
|
2020
|
Xiaying Wang
Lukas Cavigelli
Manuel Eggimann
Michele Magno
Luca Benini
|
+
|
FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things
|
2020
|
Xiaying Wang
Michele Magno
Lukas Cavigelli
Luca Benini
|
+
|
RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks
|
2020
|
Lukas Cavigelli
Luca Benini
|
+
|
ChewBaccaNN: A Flexible 223 TOPS/W BNN Accelerator
|
2020
|
Renzo Andri
Geethan Karunaratne
Lukas Cavigelli
Luca Benini
|
+
|
EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators
|
2019
|
Lukas Cavigelli
Georg Rutishauser
Luca Benini
|
+
|
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
|
2019
|
Renzo Andri
Lukas Cavigelli
Davide Rossi
Luca Benini
|
+
|
CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams
|
2019
|
Lukas Cavigelli
Luca Benini
|
+
|
Extended Bit-Plane Compression for Convolutional Neural Network Accelerators
|
2019
|
Lukas Cavigelli
Luca Benini
|
+
|
Additive Noise Annealing and Approximation Properties of Quantized Neural Networks
|
2019
|
Matteo Spallanzani
Lukas Cavigelli
Gian Paolo Leonardi
Marko Bertogna
Luca Benini
|
+
|
FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things
|
2019
|
Xiaying Wang
Michele Magno
Lukas Cavigelli
Luca Benini
|
+
|
HR-SAR-Net: A Deep Neural Network for Urban Scene Segmentation from High-Resolution SAR Data
|
2019
|
Xiaying Wang
Lukas Cavigelli
Manuel Eggimann
Michele Magno
Luca Benini
|
+
|
Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features
|
2018
|
Michael Hersche
Tino Rellstab
Pasquale Davide Schiavone
Lukas Cavigelli
Luca Benini
Abbas Rahimi
|
+
|
Hydra: An Accelerator for Real-Time Edge-Aware Permeability Filtering in 65nm CMOS
|
2018
|
Manuel Eggimann
Cary A. Gloor
Florian Scheidegger
Lukas Cavigelli
Michael Schaffner
Aljoša Smolić
Luca Benini
|
+
|
XNORBIN: A 95 TOp/s/W hardware accelerator for binary convolutional neural networks
|
2018
|
Andrawes Al Bahou
Geethan Karunaratne
Renzo Andri
Lukas Cavigelli
Luca Benini
|
+
|
Hyperdrive: A Systolically Scalable Binary-Weight CNN Inference Engine for mW IoT End-Nodes
|
2018
|
Renzo Andri
Lukas Cavigelli
Davide Rossi
Luca Benini
|
+
|
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
|
2018
|
Renzo Andri
Lukas Cavigelli
Davide Rossi
Luca Benini
|
+
|
Design Automation for Binarized Neural Networks: A Quantum Leap Opportunity?
|
2018
|
Manuele Rusci
Lukas Cavigelli
Luca Benini
|
+
|
Extended Bit-Plane Compression for Convolutional Neural Network Accelerators
|
2018
|
Lukas Cavigelli
Luca Benini
|
+
|
CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams
|
2018
|
Lukas Cavigelli
Luca Benini
|
+
|
Design Automation for Binarized Neural Networks: A Quantum Leap Opportunity?
|
2017
|
Manuele Rusci
Lukas Cavigelli
Luca Benini
|
+
|
Chipmunk: A Systolically Scalable 0.9 mm², 3.08 Gop/s/mW @ 1.2 mW Accelerator for Near-Sensor Recurrent Neural Network Inference
|
2017
|
Francesco Conti
Lukas Cavigelli
Gianna Paulin
Igor Susmelj
Luca Benini
|
+
|
Hydra: An Accelerator for Real-Time Edge-Aware Permeability Filtering in 65nm CMOS
|
2017
|
Manuel Eggimann
Christelle Gloor
Florian Scheidegger
Lukas Cavigelli
Michael Schaffner
Aljoša Smolić
Luca Benini
|
+
|
Efficient Convolutional Neural Network For Audio Event Detection
|
2017
|
Matthias Meyer
Lukas Cavigelli
Lothar Thiele
|
+
|
Deep structured features for semantic segmentation
|
2017
|
Michael Tschannen
Lukas Cavigelli
Fabian Mentzer
Thomas Wiatowski
Luca Benini
|
+
|
CAS-CNN: A deep convolutional neural network for image compression artifact suppression
|
2017
|
Lukas Cavigelli
Pascal Alexander Hager
Luca Benini
|
+
|
Soft-to-Hard Vector Quantization for End-to-End Learned Compression of Images and Neural Networks
|
2017
|
Eirikur Agustsson
Fabian Mentzer
Michael Tschannen
Lukas Cavigelli
Radu Timofte
Luca Benini
Luc Van Gool
|
+
|
Soft-to-hard vector quantization for end-to-end learning compressible representations
|
2017
|
Eirikur Agustsson
Fabian Mentzer
Michael Tschannen
Lukas Cavigelli
Radu Timofte
Luca Benini
Luc Van Gool
|
+
|
YodaNN: An Architecture for Ultralow Power Binary-Weight CNN Acceleration
|
2017
|
Renzo Andri
Lukas Cavigelli
Davide Rossi
Luca Benini
|
+
|
CBinfer: Change-Based Inference for Convolutional Neural Networks on Video Data
|
2017
|
Lukas Cavigelli
Philippe Degen
Luca Benini
|
+
|
Computationally efficient target classification in multispectral image data with Deep Neural Networks
|
2016
|
Lukas Cavigelli
Dominic Bernath
Michele Magno
Luca Benini
|
+
|
Deep Structured Features for Semantic Segmentation
|
2016
|
Michael Tschannen
Lukas Cavigelli
Fabian Mentzer
Thomas Wiatowski
Luca Benini
|
+
|
Origami: A 803-GOp/s/W Convolutional Network Accelerator
|
2016
|
Lukas Cavigelli
Luca Benini
|
+
|
YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration
|
2016
|
Renzo Andri
Lukas Cavigelli
Davide Rossi
Luca Benini
|
+
|
YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights
|
2016
|
Renzo Andri
Lukas Cavigelli
Davide Rossi
Luca Benini
|