Nathaniel Korda
All published works
Title
Year
Authors
Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits
2015
Nathaniel Korda
L. A. Prashanth
Rémi Munos
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence
2014
Nathaniel Korda
L. A. Prashanth
Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control
2014
L. A. Prashanth
Nathaniel Korda
Rémi Munos
Stochastic approximation for efficient LSTD and least squares regression
2014
L. A. Prashanth
Nathaniel Korda
Rémi Munos
Fast gradient descent for drifting least squares regression, with application to bandits
2013
Nathaniel Korda
L. A. Prashanth
Rémi Munos
Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits.
2013
Nathaniel Korda
L. A. Prashanth
Rémi Munos
Analysis of stochastic approximation for efficient least squares regression and LSTD.
2013
L. A. Prashanth
Nathaniel Korda
Rémi Munos
Stochastic approximation for speeding up LSTD (and LSPI)
2013
L. A. Prashanth
Nathaniel Korda
Rémi Munos
Fast LSTD using stochastic approximation: Finite time analysis and application to traffic control
2013
L. A. Prashanth
Nathaniel Korda
Rémi Munos
Thompson Sampling for 1-Dimensional Exponential Family Bandits
2013
Nathaniel Korda
Emilie Kaufmann
Rémi Munos
Concentration bounds for temporal difference learning with linear function approximation: The case of batch data and uniform sampling
2013
L. A. Prashanth
Nathaniel Korda
Rémi Munos
Fast gradient descent for drifting least squares regression, with application to bandits
2013
Nathaniel Korda
L. A. Prashanth
Rémi Munos
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
2012
Emilie Kaufmann
Nathaniel Korda
Rémi Munos
Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis
2012
Emilie Kaufmann
Nathaniel Korda
Rémi Munos
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
2012
Emilie Kaufmann
Nathaniel Korda
Munos Remi
Common Coauthors
Coauthor
Papers Together
L. A. Prashanth
11
Rémi Munos
8
Rémi Munos
5
Emilie Kaufmann
4
Munos Remi
1
Commonly Cited References
Title
Year
Authors
# of times referenced
Online convex programming and generalized infinitesimal gradient ascent
2003
Martin Zinkevich
7
A contextual-bandit approach to personalized news article recommendation
2010
Lihong Li
Wei Chu
John Langford
Robert E. Schapire
7
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
2011
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
5
Concentration bounds for stochastic approximations
2012
Noufel Frikha
Stéphane Menozzi
4
Transport-Entropy inequalities and deviation estimates for stochastic approximation schemes
2013
Max Fathi
Noufel Frikha
3
A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets
2012
Nicolas Le Roux
Mark Schmidt
Francis Bach
3
Linearly Parameterized Bandits
2010
Paat Rusmevichientong
John N. Tsitsiklis
2
Analysis of Thompson Sampling for the multi-armed bandit problem
2011
Shipra Agrawal
Navin Goyal
2
Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits
2015
Nathaniel Korda
L. A. Prashanth
Rémi Munos
2
On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples
1933
W. R. Thompson
2
Online Learning as Stochastic Approximation of Regularization Paths
2011
Pierre Tarrès
Yuan Yao
2
Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control
2014
L. A. Prashanth
Nathaniel Korda
Rémi Munos
2
Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits.
2013
Nathaniel Korda
L. A. Prashanth
Rémi Munos
1
On Actor-Critic Algorithms
2003
Vijay R. Konda
John N. Tsitsiklis
1
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
2007
András Antos
Csaba Szepesvári
Rémi Munos
1
Statistical Linear Estimation with Penalized Estimators: an Application to Reinforcement Learning
2012
Bernardo Ávila Pires
Csaba Szepesvári
1
Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization
2012
Shai Shalev-Shwartz
Tong Zhang
1
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
2011
Aurélien Garivier
Olivier Cappé
1