Rahul Kidambi

I earned my PhD from the University of Washington, Seattle (advisor: Sham M. Kakade).

I then pursued post-doctoral research at Cornell University.

Contact: rkidambi AT uw DOT edu | Google Scholar.


Research

I study topics in Machine Learning, Deep Learning, and AI. My current interests include:

  • Reinforcement Learning, including exploration, offline RL, counterfactual reasoning, and learning with expert supervision (imitation learning).

  • Stochastic Gradient Methods/Stochastic Approximation for large-scale Machine Learning and Deep Learning.

  • I study these topics with an eye toward applications in human-facing systems, including recommendation systems and robotics.



    Research Threads

    Stochastic Gradient Descent for large-scale ML

  • Mini-Batch/parallel/distributed SGD [JMLR 2018]
  • Accelerated/Momentum based SGD [COLT 2018, ICLR 2018]
  • Learning Rate Scheduling for SGD [NeurIPS 2019] (a toy sketch follows this list)
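
    To make the scheduling thread concrete, here is a toy sketch of a step-decay schedule: plain SGD on a synthetic least-squares problem whose step size is halved every few epochs. This is my own illustration rather than code from the papers above; all constants are arbitrary.

        import numpy as np

        # Synthetic least-squares problem: recover w_star from noisy linear observations.
        rng = np.random.default_rng(0)
        n, d = 1000, 10
        X = rng.normal(size=(n, d))
        w_star = rng.normal(size=d)
        y = X @ w_star + 0.1 * rng.normal(size=n)

        w = np.zeros(d)
        lr0 = 0.01
        for epoch in range(20):
            lr = lr0 * 0.5 ** (epoch // 5)        # step decay: halve the rate every 5 epochs
            for i in rng.permutation(n):
                grad = (X[i] @ w - y[i]) * X[i]   # gradient of 0.5 * (x_i^T w - y_i)^2
                w -= lr * grad

        print("parameter error:", np.linalg.norm(w - w_star))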

    Algorithmic Frameworks for Model-Based Interactive Learning

  • MOReL - Offline RL [NeurIPS 2020] (a pessimism sketch follows this list)
  • MobILE - Third Person Imitation Learning [NeurIPS 2021]
  • MILO - Imitation Learning with offline behavior data [NeurIPS 2021]
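
    As a flavor of the pessimism principle behind this thread, here is a minimal sketch (my own illustration, not the papers' implementations; names and thresholds are hypothetical): an ensemble of learned dynamics models flags a state-action pair as "unknown" when its members disagree, and planning treats unknown pairs as heavily penalized.

        import numpy as np

        def ensemble_disagreement(models, s, a):
            # Each "model" is a callable predicting the next state from (s, a).
            preds = np.stack([m(s, a) for m in models])
            return float(np.max(np.linalg.norm(preds - preds.mean(axis=0), axis=1)))

        def pessimistic_reward(models, reward_fn, s, a, threshold=0.1, penalty=-100.0):
            # Where the ensemble disagrees (outside the offline data's support),
            # hand the planner a large negative reward; elsewhere, the usual reward.
            if ensemble_disagreement(models, s, a) > threshold:
                return penalty
            return reward_fn(s, a)

        # Two identical linear "models" agree everywhere, so the true reward is used.
        models = [lambda s, a: s + a, lambda s, a: s + a]
        print(pessimistic_reward(models, lambda s, a: 1.0, np.zeros(2), np.ones(2)))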

    PhD Thesis

    Stochastic Gradient Descent For Modern Machine Learning: Theory, Algorithms And Applications,
    Rahul Kidambi.
    PhD Thesis, University of Washington Seattle, June 2019.
    [Link]



    Publications

    Asterisk [*] indicates alphabetical ordering of authors.

    Conference/Journal Papers

  • Counterfactual Learning to Rank for Utility Maximizing Query Autocompletion,
    Adam Block, Rahul Kidambi, Daniel Hill, Thorsten Joachims, Inderjit Dhillon.
    To appear in Proc. ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022.
    ArXiv manuscript, abs/2204.10936, April 2022.

  • Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage,
    Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun.
    In Proc. Neural Information Processing Systems (NeurIPS), 2021.
    ArXiv manuscript, abs/2106.03207, May 2021.
    [Code]

  • MobILE: Model-based Imitation Learning from Observation Alone, [4]
    Rahul Kidambi, Jonathan Chang, Wen Sun.
    In Proc. Neural Information Processing Systems (NeurIPS), 2021.
    ArXiv manuscript, abs/2102.10769, February 2021.

  • Top-k eXtreme Contextual Bandits with Arm Hierarchy,
    Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon.
    In Proc. International Conference on Machine Learning (ICML), 2021.
    ArXiv manuscript, abs/2102.07800, February 2021.

  • Making Paper Reviewing Robust to Bid Manipulation Attacks,
    Ruihan Wu, Chuan Guo, Felix Wu, Rahul Kidambi, Laurens van der Maaten, Kilian Q. Weinberger.
    In Proc. International Conference on Machine Learning (ICML), 2021.
    ArXiv manuscript, abs/2102.06020, February 2021.

  • MOReL: Model-Based Offline Reinforcement Learning,
    Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims.
    In Proc. Neural Information Processing Systems (NeurIPS), 2020.
    ArXiv manuscript, abs/2005.05951, May 2020.
    [Project Page]

  • Leverage Score Sampling for Faster Accelerated Regression and ERM, [*]
    Naman Agarwal, Sham M. Kakade, Rahul Kidambi, Yin Tat Lee, Praneeth Netrapalli, Aaron Sidford.
    In Proc. Algorithmic Learning Theory (ALT), 2020.
    ArXiv manuscript, abs/1711.08426, November 2017.

  • The Step Decay Schedule: A Near Optimal Geometrically Decaying Learning Rate Procedure For Least Squares, [*] [3]
    Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli.
    In Proc. Neural Information Processing Systems (NeurIPS), 2019.
    ArXiv manuscript, abs/1904.12838, April 2019.
    [Slides]

  • Open Problem: Do Good Algorithms Necessarily Query Bad Points?, [*]
    Rong Ge, Prateek Jain, Sham M. Kakade, Rahul Kidambi, Dheeraj M. Nagaraj, Praneeth Netrapalli.
    In Proc. Conference on Learning Theory (COLT), 2019.
    [COLT Proceedings]

  • On the insufficiency of existing Momentum schemes for Stochastic Optimization,
    Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham M. Kakade.
    In Proc. International Conference on Learning Representations (ICLR), 2018. (Oral Presentation: 23/1002 submissions, ≈2% acceptance rate.)
    Also an invited paper at Information Theory and Applications (ITA) workshop, San Diego, 2018.
    ArXiv manuscript, abs/1803.05591, March 2018.
    [Open Review] [ITA version] [Code]

  • Accelerating Stochastic Gradient Descent for least squares regression, [*] [2]
    Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford.
    In Proc. Conference on Learning Theory (COLT), 2018.
    ArXiv manuscript, abs/1704.08227, April 2017.
    [COLT proceedings] [Video (Sham at MSR)]

  • Parallelizing Stochastic Gradient Descent for Least Squares Regression: mini-batching, averaging, and model misspecification, [*] [1]
    Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford.
    In Journal of Machine Learning Research (JMLR), Vol. 18 (223), July 2018.
    ArXiv manuscript, abs/1610.03774, October 2016. Updated, April 2018.
    [JMLR link]

  • Submodular Hamming Metrics,
    Jennifer Gillenwater, Rishabh K. Iyer, Bethany Lusch, Rahul Kidambi, Jeff A. Bilmes.
    In Proc. Neural Information Processing Systems (NeurIPS), December 2015. (Spotlight Presentation)
    ArXiv manuscript, abs/1511.02163, November 2015.
    [NeurIPS proceedings]

  • Deformable trellises on factor graphs for robust microtubule tracking in clutter,
    Rahul Kidambi, Min-Chi Shih, Kenneth Rose.
    In Proc. International Symposium on Biomedical Imaging (ISBI), May 2012.
    [ISBI proceedings]

    Invited/Workshop Papers

  • A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares), [*]
    Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, Aaron Sidford.
    Invited paper at FSTTCS 2017.
    ArXiv manuscript, abs/1710.09430, October 2017.

  • On Shannon capacity and causal estimation,
    Rahul Kidambi, Sreeram Kannan.
    Invited paper at Allerton Conference on Communication, Control, and Computing, 2015.
    [Allerton proceedings]

    Technical Reports

  • Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles,
    Dhruv Mahajan, Vivek Gupta, S. Sathiya Keerthi, Sundararajan Sellamanickam, Shravan Narayanamurthy, Rahul Kidambi.
    ArXiv manuscript, abs/1711.05482, November 2017.

  • A Structured Prediction Approach for Missing Value Imputation,
    Rahul Kidambi, Vinod Nair, Sundararajan Sellamanickam, S. Sathiya Keerthi.
    ArXiv manuscript, abs/1311.2137, November 2013.

  • A Quantitative Evaluation Framework for Missing Value Imputation Algorithms,
    Vinod Nair, Rahul Kidambi, Sundararajan Sellamanickam, S. Sathiya Keerthi, Johannes Gehrke, Vijay Narayanan.
    ArXiv manuscript, abs/1311.2276, November 2013.

    dblp maintains a listing of my papers.


    [1] Earlier version titled "Parallelizing Stochastic Approximation Through Mini-Batching and Tail Averaging."
    [2] Earlier version titled "Accelerating Stochastic Gradient Descent."
    [3] Earlier version titled "The Step Decay Schedule: A Near Optimal Geometrically Decaying Learning Rate Procedure."
    [4] Earlier version titled "Optimism is all you need: Model-based Imitation Learning from Observation Alone."

    Academic Service

  • Conference Reviewing: COLT, NeurIPS, ICML, ICLR, AISTATS, ALT, UAI, ISIT.

  • Journal Refereeing: Journal of Machine Learning Research (JMLR), Electronic Journal of Statistics, IEEE Transactions on Information Theory, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).


  • I am also a member of the JMLR editorial board.

    Teaching

    Some classes I have TA'd for include:

  • CSE 547/STAT 548: Machine Learning for Big Data (Spring 2018).
  • EE 514a: Information Theory-I (Autumn 2015).