Hiteshi Sharma

All published works
Title Year Authors
+ Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning 2024 Nabil Omi
Hosein Hasanbeig
Hiteshi Sharma
Sriram K. Rajamani
Siddhartha Sen
+ Self-Exploring Language Models: Active Preference Elicitation for Online Alignment 2024 Shenao Zhang
Donghan Yu
Hiteshi Sharma
Ziyi Yang
Shuohang Wang
Hany Hassan
Zhaoran Wang
+ Fine-Tuning Language Models with Advantage-Induced Policy Alignment 2023 Banghua Zhu
Hiteshi Sharma
Felipe Vieira Frujeri
Shi Dong
Chenguang Zhu
Michael I. Jordan
Jiantao Jiao
+ ALLURE: Auditing and Improving LLM-based Evaluation of Text using Iterative In-Context-Learning 2023 Hosein Hasanbeig
Hiteshi Sharma
Leo Betthauser
Felipe Vieira Frujeri
Ida Momennejad
+ Evaluating Cognitive Maps and Planning in Large Language Models with CogEval 2023 Ida Momennejad
Hosein Hasanbeig
Felipe do Nascimento Vieira
Hiteshi Sharma
Robert Osazuwa Ness
Nebojša Jojić
Hamid Palangi
Jonathan Larson
+ Language Models can be Logical Solvers 2023 Jiazhan Feng
Ruochen Xu
Junheng Hao
Hiteshi Sharma
Yelong Shen
Dongyan Zhao
Weizhu Chen
+ Randomized Policy Learning for Continuous State and Action MDPs 2020 Hiteshi Sharma
Rahul Jain
+ An Empirical Relative Value Learning Algorithm for Non-parametric MDPs with Continuous State Space 2019 Hiteshi Sharma
Rahul Jain
Abhishek Gupta
+ A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs 2019 William B. Haskell
Rahul Jain
Hiteshi Sharma
P. L. Yu
+ Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes 2019 Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
Rahul Jain
+ An Empirical Dynamic Programming Algorithm for Continuous MDPs 2017 William B. Haskell
Rahul Jain
Hiteshi Sharma
P. L. Yu
+ A dynamical systems framework for stochastic iterative optimization 2016 William B. Haskell
Rahul Jain
Hiteshi Sharma
Commonly Cited References
Title Year Authors # of times referenced
+ Empirical Dynamic Programming 2016 William B. Haskell
Rahul Jain
Dileep Kalathil
5
+ Finite-Time Bounds for Fitted Value Iteration 2008 Rémi Munos
Csaba Szepesvári
4
+ Uniform approximation of functions with random bases 2008 Ali Rahimi
Benjamin Recht
3
+ Using Randomization to Break the Curse of Dimensionality 1997 John Rust
2
+ Nonlinear approximation 1998 Ronald DeVore
2
+ Shannon sampling II: Connections to learning theory 2005 Steve Smale
Ding‐Xuan Zhou
2
+ Modelling transition dynamics in MDPs with RKHS embeddings 2012 Steffen Grünewälder
Guy Lever
Luca Baldassarre
Massi Pontil
Arthur Gretton
2
+ Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension 1995 David Haussler
2
+ Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization 2014 Shai Shalev‐Shwartz
Tong Zhang
1
+ Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm 2015 Deanna Needell
Nathan Srebro
Rachel Ward
1
+ Parallelized Stochastic Gradient Descent 2010 Martin Zinkevich
Markus Weimer
Lihong Li
Alex Smola
1
+ Continuous control with deep reinforcement learning 2015 Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
Nicolas Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
1
+ An empirical algorithm for relative value iteration for average-cost MDPs 2015 Abhishek Gupta
Rahul Jain
Peter W. Glynn
1
+ Minimax Regret Bounds for Reinforcement Learning 2017 Mohammad Gheshlaghi Azar
Ian Osband
Rémi Munos
1
+ Proximal Policy Optimization Algorithms 2017 John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
1
+ Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control 2017 Riashat Islam
Peter Henderson
Maziar Gomrokchi
Doina Precup
1
+ Learning-based Control of Unknown Linear Systems with Thompson Sampling 2017 Yi Ouyang
Mukul Gagrani
Rahul Jain
1
+ An Empirical Dynamic Programming Algorithm for Continuous MDPs 2017 William B. Haskell
Rahul Jain
Hiteshi Sharma
P. L. Yu
1
+ Primal-Dual $π$ Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems 2017 Mengdi Wang
1
+ Simple random search provides a competitive approach to reinforcement learning 2018 Horia Mania
Aurelia Guy
Benjamin Recht
1
+ Regret Bounds for Reinforcement Learning via Markov Chain Concentration 2018 Ronald Ortner
1
+ Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP 2019 Kefan Dong
Yuanhao Wang
Xiaoyu Chen
Liwei Wang
1
+ Improved Path-length Regret Bounds for Bandits 2019 Sébastien Bubeck
Yuanzhi Li
Haipeng Luo
Chen-Yu Wei
1
+ Trust Region Policy Optimization 2015 John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
1
+ Fast Stochastic Alternating Direction Method of Multipliers 2013 Leon Wenliang Zhong
James T. Kwok
1
+ Is Q-learning Provably Efficient? 2018 Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
1
+ Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning 2018 Ronan Fruit
Matteo Pirotta
Alessandro Lazaric
Ronald Ortner
1
+ Asynchronous Methods for Deep Reinforcement Learning 2016 Volodymyr Mnih
Adrià Puigdomènech Badia
Mehdi Mirza
Alex Graves
Tim Harley
Timothy Lillicrap
David Silver
Koray Kavukcuoglu
1
+ Analysis and Design of Optimization Algorithms via Integral Quadratic Constraints 2016 Laurent Lessard
Benjamin Recht
Andrew Packard
1
+ Exploration-Enhanced POLITEX 2019 Yasin Abbasi-Yadkori
Nevena Lazic
Csaba Szepesvári
Gellért Weisz
1
+ Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function 2019 Zihan Zhang
Xiangyang Ji
1
+ Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs 2018 Mohammad Sadegh Talebi
Odalric-Ambrym Maillard
1
+ Deep Randomized Neural Networks 2020 Claudio Gallicchio
Simone Scardapane
1
+ Stochastic Dual Coordinate Ascent with Alternating Direction Method of Multipliers 2014 Taiji Suzuki
1
+ Generalization and Exploration via Randomized Value Functions 2014 Ian Osband
Benjamin Van Roy
Zheng Wen
1
+ Convex Optimization 2004 Stephen Boyd
Lieven Vandenberghe
1
+ REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs 2012 Peter L. Bartlett
Ambuj Tewari
1
+ Proximal Stochastic Dual Coordinate Ascent 2012 Shai Shalev‐Shwartz
Tong Zhang
1
+ Modelling transition dynamics in MDPs with RKHS embeddings 2012 Guy Lever
Luca Baldassarre
Arthur Gretton
Massimiliano Pontil
Steffen Grünewälder
1
+ Incremental Stochastic Subgradient Algorithms for Convex Optimization 2009 S. Sundhar Ram
A. Nedić
Venugopal V. Veeravalli
1
+ A Proximal Stochastic Gradient Method with Progressive Variance Reduction 2014 Lin Xiao
Tong Zhang
1
+ Convergence analysis of gradient descent stochastic algorithms 1996 Alexander Shapiro
Y. Wardi
1
+ The uniform convergence of nearest neighbor regression function estimators and their application in optimization 1978 Luc Devroye
1