Exploration–Exploitation Dilemma in Reinforcement Learning: Calibrating Optimism in the Face of Uncertainty
Date: Monday, 15th September 2025
Time: 4:00 – 5:00 PM
Venue: A007, R&D Block
Abstract:
In Reinforcement Learning (RL), we study sequential decision-making problems under uncertainty and partial information. The exploration–exploitation dilemma lies at the core of RL algorithms: whether to collect new information or to exploit existing information to maximise a desired objective. This talk focuses on the Markov Decision Process (MDP) formulation of RL problems and explores two algorithm design paradigms for addressing this dilemma: frequentist algorithms with optimistic indices and Bayesian algorithms with posterior sampling. Dr. Basu will present a historical overview of these approaches and their theoretical analyses, and will conclude with recent work on Langevin sampling-based RL algorithms (arXiv:2412.20824) that achieve both theoretical guarantees and practical efficiency, even in deep RL settings, thus bridging the theory-to-practice gap.
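To make the two paradigms concrete, here is a minimal sketch (not taken from the talk or the cited paper) of both on the simplest setting where the dilemma appears, a Bernoulli multi-armed bandit: a frequentist UCB-style optimistic index versus Bayesian Thompson (posterior) sampling with Beta priors. Function names and constants are illustrative assumptions.

```python
import math
import random

def ucb_index(mean, pulls, t, c=2.0):
    # Frequentist optimism: empirical mean plus a confidence bonus that
    # shrinks as an arm is pulled more often ("optimism in the face of
    # uncertainty"). c=2.0 is an illustrative exploration constant.
    return mean + math.sqrt(c * math.log(t) / pulls)

def run_bandit(true_means, horizon, policy, seed=0):
    """Run one bandit episode; policy is 'ucb' or 'ts' (Thompson sampling)."""
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k        # number of times each arm was chosen
    successes = [0.0] * k  # total reward observed per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if policy == "ucb":
            if t <= k:
                arm = t - 1  # initialise: pull each arm once
            else:
                arm = max(range(k), key=lambda a: ucb_index(
                    successes[a] / pulls[a], pulls[a], t))
        else:
            # Bayesian posterior sampling: draw from each arm's
            # Beta(1 + successes, 1 + failures) posterior, play the argmax.
            arm = max(range(k), key=lambda a: rng.betavariate(
                1 + successes[a], 1 + pulls[a] - successes[a]))
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        successes[arm] += reward
        total += reward
    return total, pulls
```

Both strategies resolve the dilemma the same way at a high level: arms that are under-sampled look attractive (large confidence bonus, or wide posterior), so they get explored, while well-estimated good arms get exploited.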
Bio:
Dr. Debabrota Basu is a faculty member at Inria, France, where he leads research on robust, private, and fair machine learning. He previously worked as a Postdoctoral Research Fellow at Chalmers University of Technology, Sweden, and as a Research Fellow at the National University of Singapore, where he also earned his PhD in Computer Science. His work lies at the intersection of machine learning, privacy, and algorithmic fairness.