Nathaniel Korda

All published works
Action Title Year Authors
+ Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits 2015 Nathaniel Korda
L. A. Prashanth
Rémi Munos
+ On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence 2014 Nathaniel Korda
L. A. Prashanth
+ Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control 2014 L. A. Prashanth
Nathaniel Korda
Rémi Munos
+ Stochastic approximation for efficient LSTD and least squares regression 2014 L. A. Prashanth
Nathaniel Korda
Rémi Munos
+ Fast gradient descent for drifting least squares regression, with application to bandits 2013 Nathaniel Korda
L. A. Prashanth
Rémi Munos
+ Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits. 2013 Nathaniel Korda
L. A. Prashanth
Rémi Munos
+ Analysis of stochastic approximation for efficient least squares regression and LSTD. 2013 L. A. Prashanth
Nathaniel Korda
Rémi Munos
+ Stochastic approximation for speeding up LSTD (and LSPI) 2013 L. A. Prashanth
Nathaniel Korda
Rémi Munos
+ Fast LSTD using stochastic approximation: Finite time analysis and application to traffic control 2013 L. A. Prashanth
Nathaniel Korda
Rémi Munos
+ Thompson Sampling for 1-Dimensional Exponential Family Bandits 2013 Nathaniel Korda
Emilie Kaufmann
Rémi Munos
+ Concentration bounds for temporal difference learning with linear function approximation: The case of batch data and uniform sampling 2013 L. A. Prashanth
Nathaniel Korda
Rémi Munos
+ Fast gradient descent for drifting least squares regression, with application to bandits 2013 Nathaniel Korda
L. A. Prashanth
Rémi Munos
+ Thompson Sampling: An Asymptotically Optimal Finite Time Analysis 2012 Emilie Kaufmann
Nathaniel Korda
Rémi Munos
+ Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis 2012 Emilie Kaufmann
Nathaniel Korda
Rémi Munos
+ Thompson Sampling: An Asymptotically Optimal Finite Time Analysis 2012 Emilie Kaufmann
Nathaniel Korda
Rémi Munos
Common Coauthors
Commonly Cited References
Action Title Year Authors # of times referenced
+ Online convex programming and generalized infinitesimal gradient ascent 2003 Martin Zinkevich
7
+ A contextual-bandit approach to personalized news article recommendation 2010 Lihong Li
Wei Chu
John Langford
Robert E. Schapire
7
+ Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms 2011 Lihong Li
Wei Chu
John Langford
Xuanhui Wang
5
+ Concentration bounds for stochastic approximations 2012 Noufel Frikha
Stéphane Menozzi
4
+ Transport-Entropy inequalities and deviation estimates for stochastic approximation schemes 2013 Max Fathi
Noufel Frikha
3
+ A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets 2012 Nicolas Le Roux
Mark Schmidt
Francis Bach
3
+ Linearly Parameterized Bandits 2010 Paat Rusmevichientong
John N. Tsitsiklis
2
+ Analysis of Thompson Sampling for the multi-armed bandit problem 2011 Shipra Agrawal
Navin Goyal
2
+ Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits 2015 Nathaniel Korda
L. A. Prashanth
Rémi Munos
2
+ On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples 1933 W. R. Thompson
2
+ Online Learning as Stochastic Approximation of Regularization Paths 2011 Pierre Tarrès
Yuan Yao
2
+ Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control 2014 L. A. Prashanth
Nathaniel Korda
Rémi Munos
2
+ Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits. 2013 Nathaniel Korda
L. A. Prashanth
Rémi Munos
1
+ On Actor-Critic Algorithms 2003 Vijay R. Konda
John N. Tsitsiklis
1
+ Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path 2007 András Antos
Csaba Szepesvári
Rémi Munos
1
+ Statistical Linear Estimation with Penalized Estimators: an Application to Reinforcement Learning 2012 Bernardo Ávila Pires
Csaba Szepesvári
1
+ Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization 2012 Shai Shalev‐Shwartz
Tong Zhang
1
+ The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond 2011 Aurélien Garivier
Olivier Cappé
1