Johan Ferret
All published works
Title
Year
Authors
Humanity's Last Exam
2025
Long Phan
Alice Gatti
Ziwen Han
Nathaniel Li
Josephina Hu
Hugh Zhang
Shuangshuang Shi
Michael Y. Choi
Arjun Agrawal
Asmita Chopra
Diversity-Rewarded CFG Distillation
2024
Geoffrey Cideron
Andrea Agostinelli
Johan Ferret
Sertan Girgin
Romuald Élie
Olivier Bachem
Sarah Perrin
Alexandre Ramé
Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL
2024
Eduardo Pignatelli
Johan Ferret
Tim Rocktäschel
Edward Grefenstette
Davide Paglieri
Samuel Coward
Laura Toni
Gemma 2: Improving Open Language Models at a Practical Size
2024
Gemma Team
Morgane Rivière
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning
2024
Kaiwen Wang
Rahul Kidambi
Ryan Sullivan
A. Agarwal
Christoph Dann
Andrea Michi
Marco Gelmi
Yunxuan Li
Raghav Gupta
Avinava Dubey
BOND: Aligning LLMs with Best-of-N Distillation
2024
Pier Giuseppe Sessa
Robert Dadashi
Léonard Hussenot
Johan Ferret
Nino Vieillard
Alexandre Ramé
Bobak Shahriari
Sarah Perrin
Abe Friesen
Geoffrey Cideron
WARP: On the Benefits of Weight Averaged Rewarded Policies
2024
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
2024
Aleksandar Botev
Soham De
Samuel Smith
Anushan Fernando
George-Cristian Muraru
Ruba Haroun
Leonard Berrada
Razvan Pascanu
Pier Giuseppe Sessa
Robert Dadashi
Gemma: Open Models Based on Gemini Research and Technology
2024
Gemma Team
Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
Shreya Pathak
Laurent Sifre
Morgane Rivière
Mihir Kale
Juliette Love
Direct Language Model Alignment from Online AI Feedback
2024
Shangmin Guo
Biao Zhang
Tianlin Liu
Tianqi Liu
Misha Khalman
Felipe Llinares
Alexandre Ramé
Thomas Mesnard
Yao Zhao
Bilal Piot
WARM: On the Benefits of Weight Averaged Reward Models
2024
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
2023
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
Robert Dadashi
Matthieu Geist
Sertan Girgin
Léonard Hussenot
Orgad Keller
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning
2023
Eduardo Pignatelli
Johan Ferret
Matthieu Geist
Thomas Mesnard
Hado van Hasselt
Laura Toni
Gemini: A Family of Highly Capable Multimodal Models
2023
Gemini Team
Rohan Anil
Sebastian Borgeaud
Jean-Baptiste Alayrac
Jiahui Yu
Radu Soricut
Johan Schalkwyk
Andrew M. Dai
Anja Hauth
Katie Millican
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
2022
Alexis Jacq
Johan Ferret
Olivier Pietquin
Matthieu Geist
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning
2021
Nathan Grinsztajn
Johan Ferret
Olivier Pietquin
Pierre‐Marie Preux
Matthieu Geist
More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences
2021
Toby Johnstone
Nathan Grinsztajn
Johan Ferret
Pierre‐Marie Preux
Adversarially Guided Actor-Critic
2021
Yannis Flet-Berliac
Johan Ferret
Olivier Pietquin
Pierre‐Marie Preux
Matthieu Geist
Self-Imitation Advantage Learning
2021
Johan Ferret
Olivier Pietquin
Matthieu Geist
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning
2020
Johan Ferret
Raphaël Marinier
Matthieu Geist
Olivier Pietquin
Self-Imitation Advantage Learning
2020
Johan Ferret
Olivier Pietquin
Matthieu Geist
Acme: A Research Framework for Distributed Reinforcement Learning
2020
Matthew W. Hoffman
Bobak Shahriari
John Aslanides
Gabriel Barth-Maron
Nikola Momchev
Danila Sinopalnikov
Piotr Stańczyk
Sabela Ramos
Anton Raichuk
Damien Vincent
Credit Assignment as a Proxy for Transfer in Reinforcement Learning
2019
Johan Ferret
Raphaël Marinier
Matthieu Geist
Olivier Pietquin
Common Coauthors
Coauthor
Papers Together
Matthieu Geist
14
Olivier Pietquin
13
Léonard Hussenot
11
Robert Dadashi
10
Olivier Bachem
10
Sertan Girgin
9
Nino Vieillard
8
Pierre‐Marie Preux
7
Geoffrey Cideron
7
Alexandre Ramé
7
Piotr Stańczyk
6
Nikola Momchev
6
Sabela Ramos
5
Thomas Mesnard
5
Pier Giuseppe Sessa
5
Koray Kavukcuoglu
4
Nathan Grinsztajn
4
Cassidy Hardin
4
Danila Sinopalnikov
4
Clément Farabet
4
Evan Senter
4
Surya Bhupatiraju
4
Laurent Sifre
4
Minh Giang
4
Pouya D. Tafti
4
Demis Hassabis
4
Sebastian Borgeaud
4
Morgane Rivière
4
Abe Friesen
3
Orgad Keller
3
Tris Warkentin
3
Eric Noland
3
Charline Le Lan
3
Noah Fiedel
3
Sarah Perrin
3
Aliaksei Severyn
3
Jeff Stanway
3
Ludovic Peran
3
Tom Hennigan
3
Antonia Paterson
3
Mateo Wirth
3
Amélie Héliou
3
Christopher A. Choquette-Choo
3
Ramona Comanescu
3
Machel Reid
3
Mihir Kale
3
Bobak Shahriari
3
Elena Buchatskaya
3
Yannis Flet-Berliac
3
Kathleen Kenealy
3
Commonly Cited References
Title
Year
Authors
# of times referenced
Proximal Policy Optimization Algorithms
2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
5
Munchausen Reinforcement Learning
2020
Nino Vieillard
Olivier Pietquin
Matthieu Geist
4
Exploration by Random Network Distillation
2018
Yuri Burda
Harrison Edwards
Amos Storkey
Oleg Klimov
4
Adam: A Method for Stochastic Optimization
2014
Diederik P. Kingma
Jimmy Ba
4
Go-Explore: a New Approach for Hard-Exploration Problems
2019
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
4
Noisy Networks For Exploration
2018
Meire Fortunato
Mohammad Gheshlaghi Azar
Bilal Piot
Jacob Menick
Ian Osband
Alexander Graves
Vlad Mnih
Rémi Munos
Demis Hassabis
Olivier Pietquin
3
Episodic Curiosity through Reachability
2018
Nikolay Savinov
Anton Raichuk
Raphaël Marinier
Damien Vincent
Marc Pollefeys
Timothy Lillicrap
Sylvain Gelly
3
Generative Adversarial Imitation Learning
2016
Jonathan Ho
Stefano Ermon
3
A Theory of Regularized Markov Decision Processes
2019
Matthieu Geist
Bruno Scherrer
Olivier Pietquin
3
Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation
2018
Niels Justesen
Rubén Rodríguez Torrado
Philip Bontrager
Ahmed Khalifa
Julian Togelius
Sebastian Risi
2
Self-Imitation Advantage Learning
2021
Johan Ferret
Olivier Pietquin
Matthieu Geist
2
Connecting Generative Adversarial Networks and Actor-Critic Methods
2016
David Pfau
Oriol Vinyals
2
Only Relevant Information Matters: Filtering Out Noisy Samples To Boost RL
2020
Yannis Flet-Berliac
Pierre‐Marie Preux
2
Towards Principled Methods for Training Generative Adversarial Networks
2017
Martín Arjovsky
Léon Bottou
2
Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration
2020
Seungyul Han
Youngchul Sung
2
The Arcade Learning Environment: An Evaluation Platform for General Agents
2013
Marc G. Bellemare
Yavar Naddaf
Joel Veness
Michael Bowling
2
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning
2020
Johan Ferret
Raphaël Marinier
Matthieu Geist
Olivier Pietquin
2
Asynchronous Methods for Deep Reinforcement Learning
2016
Volodymyr Mnih
Adrià Puigdomènech Badia
Mehdi Mirza
Alex Graves
Tim Harley
Timothy Lillicrap
David Silver
Koray Kavukcuoglu
2
Concrete Problems in AI Safety
2016
Dario Amodei
Chris Olah
Jacob Steinhardt
Paul F. Christiano
John Schulman
Dan Mané
2
Distributional Smoothing with Virtual Adversarial Training
2016
Takeru Miyato
Shin‐ichi Maeda
Masanori Koyama
Ken Nakae
Shin Ishii
2
Generalization and Regularization in DQN
2018
Jesse Farebrother
Marlos C. Machado
Michael Bowling
2
Attention Augmented Convolutional Networks
2019
Irwan Bello
Barret Zoph
Quoc V. Le
Ashish Vaswani
Jonathon Shlens
2
Observational Overfitting in Reinforcement Learning
2019
Xingyou Song
Yiding Jiang
Yilun Du
Behnam Neyshabur
2
Diversity is All You Need: Learning Skills without a Reward Function
2018
Benjamin Eysenbach
Abhishek Gupta
Julian Ibarz
Sergey Levine
2
Learning Value Functions in Deep Policy Gradients using Residual Variance
2021
Yannis Flet-Berliac
Reda Ouhamma
Odalric-Ambrym Maillard
Pierre‐Marie Preux
2
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
2017
Irina Higgins
Arka Pal
Andrei A. Rusu
Loïc Matthey
Christopher Burgess
Alexander Pritzel
Matthew Botvinick
Charles Blundell
Alexander Lerchner
2
Leverage the Average: an Analysis of Regularization in RL
2020
Nino Vieillard
Tadashi Kozuno
Bruno Scherrer
Olivier Pietquin
Rémi Munos
Matthieu Geist
2
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
2015
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
2
Safe and Efficient Off-Policy Reinforcement Learning
2016
Rémi Munos
Tom Stepleton
Anna Harutyunyan
Marc G. Bellemare
2
Understanding the impact of entropy on policy optimization
2018
Zafarali Ahmed
Nicolas Le Roux
Mohammad Norouzi
Dale Schuurmans
2
Deep Recurrent Q-Learning for Partially Observable MDPs
2015
Matthew Hausknecht
Peter Stone
2
High-Dimensional Continuous Control Using Generalized Advantage Estimation
2015
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
2
Sample Efficient Actor-Critic with Experience Replay
2016
Ziyu Wang
Victor Bapst
Nicolas Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
2
A Study on Overfitting in Deep Reinforcement Learning
2018
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
2
Continuous control with deep reinforcement learning
2016
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
Nicolas Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
2
A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning
2018
Amy Zhang
Nicolas Ballas
Joëlle Pineau
2
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
2017
Chelsea Finn
Pieter Abbeel
Sergey Levine
2
Adversarial Machine Learning at Scale
2016
Alexey Kurakin
Ian Goodfellow
Samy Bengio
2
Agent57: Outperforming the Atari Human Benchmark
2020
Adrià Puigdomènech Badia
Bilal Piot
Steven Kapturowski
Pablo Sprechmann
Alex Vitvitskyi
Daniel Guo
Charles Blundell
2
ViZDoom: A Doom-based AI research platform for visual reinforcement learning
2016
Michał Kempka
Marek Wydmuch
Grzegorz Runc
Jakub Toczek
Wojciech Jaśkowski
2
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
2018
Lasse Espeholt
Hubert Soyer
Rémi Munos
Karen Simonyan
Volodymyr Mnih
Tom Ward
Yotam Doron
Vlad Firoiu
Tim Harley
Iain Dunning
1
Maximum a Posteriori Policy Optimisation
2018
Abbas Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
Nicolas Heess
Martin Riedmiller
1
Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening
2016
Frank He
Yang Liu
Alexander G. Schwing
Jian Peng
1
AI Safety Gridworlds
2017
Jan Leike
Miljan Martic
Victoria Krakovna
Pedro A. Ortega
Tom Everitt
Andrew Lefrancq
Laurent Orseau
Shane Legg
1
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
2017
Yuhuai Wu
Elman Mansimov
S. Matthew Liao
Roger Grosse
Jimmy Ba
1
Self-Supervised Video Representation Learning with Odd-One-Out Networks
2017
Basura Fernando
Hakan Bilen
Efstratios Gavves
Stephen Jay Gould
1
Deep reinforcement learning with double Q-Learning
2016
Hado van Hasselt
Arthur Guez
David Silver
1
Imagination-Augmented Agents for Deep Reinforcement Learning
2017
Théophane Weber
Sébastien Racanière
David Reichert
Lars Buesing
Arthur Guez
Danilo Jimenez Rezende
Adrià Puigdomènech Badia
Oriol Vinyals
Nicolas Heess
Yujia Li
1
Learning to reinforcement learn
2016
Jane X. Wang
Zeb Kurth‐Nelson
Dhruva Tirumala
Hubert Soyer
Joel Z. Leibo
Rémi Munos
Charles Blundell
Dharshan Kumaran
Matt Botvinick
1
Provably Efficient Maximum Entropy Exploration
2018
Elad Hazan
Sham M. Kakade
Karan Singh
Abby Van Soest
1