RT1 Report: The Multi-Armed Bandit Problem and Thompson Sampling

The first of two reports I have written this year for STOR601 is on the multi-armed bandit problem. It was supervised by James Grant.

This report focuses on using Thompson sampling to minimise regret in the multi-armed bandit problem, and on approximations to Thompson sampling for cases where the method cannot be applied directly. The methods are compared empirically on simulated data.
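To give a flavour of the basic idea, here is a minimal sketch of Thompson sampling in Python. It is an illustration only and is not taken from the report: it assumes Bernoulli (0/1) rewards with independent uniform Beta(1, 1) priors on each arm, and the function name `thompson_sampling` and its parameters are hypothetical. In each round the algorithm draws one sample from every arm's posterior, plays the arm with the largest sample, and updates that arm's posterior with the observed reward.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling on simulated data.

    true_means: success probability of each arm (unknown to the algorithm,
                used only to simulate rewards and to measure regret).
    Returns (cumulative expected regret, number of pulls per arm).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards of 1 per arm
    failures = [0] * k   # observed rewards of 0 per arm
    pulls = [0] * k
    best = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Assumed Beta(1, 1) prior, so the posterior is Beta(1 + s, 1 + f).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=samples.__getitem__)
        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        # Regret is the gap between the best arm's mean and the chosen arm's mean.
        regret += best - true_means[arm]

    return regret, pulls


if __name__ == "__main__":
    total_regret, pull_counts = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
    print("cumulative regret:", total_regret)
    print("pulls per arm:", pull_counts)
```

The Beta prior is convenient here because it is conjugate to the Bernoulli likelihood, so the posterior can be sampled exactly. When no such closed form exists, exact posterior sampling is no longer available, which is one situation in which an approximation to Thompson sampling is needed.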

View the report here:
