This week, as part of the MRes course, we had to pick our next topic to write a report on. I was really stuck between two options but in the end had to choose one. I thought that if I’m not going to write a report on the other option, I can at least write a blog post on it, and here we are. So, as you may or may not have guessed from the title of this post, the option I didn’t choose was multi-armed bandits. At the end of the talk on this area the lecturer, Kevin Glazebrook, mentioned some areas of particular study. One that caught my eye was fast fashion. Before deciding to do a Maths degree I wanted to be a fashion designer, or to have just about any job in fashion really. While I gave up that dream for my love of Maths, it is still an interest of mine, so I was very excited by the combination of the two areas.
Previously, clothing companies had to decide which products to sell in a season with very little information on where demand might lie that season. As you can imagine, this led to them missing opportunities to sell popular goods as well as being left with excess supply of unwanted products. As technology has improved, especially manufacturing schemes and means of transport, companies have been able to delay some of the production for a season. This means they have more information on what is in demand that season and can then produce and sell goods accordingly.
Now you may be thinking that’s very nice, but what’s that got to do with maths? Well, I’ll tell you. I assume you remember that one of the focuses of this blog post was multi-armed bandits. If you are picturing something like this picture from this blog, do not worry: in this case we are talking about a class of problems known as multi-armed bandits. In these problems we have a series of time steps, and at each time step we have to make a decision (pull an arm). Before we pull an arm we are unsure whether it will help us achieve whatever it is we wish to achieve, but by pulling that arm we learn more about it. The aim is to minimize our regret, that is, the reward we miss out on by not pulling the best arm every time. To do this we have to balance two things: exploitation and exploration. We want to exploit the information we have from previous pulls in order to pull arms which give us successful results. However, we also want to explore all our options to make sure we have found the best arm.
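To make the trade-off between exploration and exploitation a bit more concrete, here is a minimal Python sketch. It is not taken from the talk or from any paper: it uses epsilon-greedy, a simple textbook strategy swapped in purely for illustration, and the arm success probabilities, the number of steps and the value of epsilon are all made-up assumptions.

```python
import random

def run_epsilon_greedy(true_means, steps, epsilon, seed=0):
    """Play a bandit with success/failure arms; return (pull counts, expected regret)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms        # how often each arm has been pulled
    rewards = [0.0] * n_arms    # total reward observed per arm
    best_mean = max(true_means)
    regret = 0.0
    for _ in range(steps):
        untried = [a for a in range(n_arms) if pulls[a] == 0]
        if untried:
            arm = untried[0]                 # pull every arm once to begin with
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)      # explore: pick an arm at random
        else:
            # exploit: pick the arm with the best observed average reward
            arm = max(range(n_arms), key=lambda a: rewards[a] / pulls[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward
        # expected reward lost by not pulling the best arm at this step
        regret += best_mean - true_means[arm]
    return pulls, regret

# Hypothetical arms with success probabilities 0.2, 0.5 and 0.8
pulls, regret = run_epsilon_greedy([0.2, 0.5, 0.8], steps=1000, epsilon=0.1)
print(pulls, round(regret, 1))
```

Setting epsilon to 0 gives a purely exploiting player, while setting it to 1 gives a purely exploring one, so it is a handy knob for seeing how the regret changes between the two extremes.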
So how does this relate to fast fashion? Well, suppose the company delays production so that it releases a new selection of goods at each of T time steps. Then at each time step, t, it has to choose which products to release in that selection. Picking a product to go in a selection is pulling an arm. By picking a product the company can see how well it sells and hence learn about its demand. It can then use this information to help make its decision at the next time step. To make sure we are making the best decisions with the information we have at each time step, we build a model.
So let’s look at this model. To start with we have a set of S different products to choose from. As there is limited space within a shop, we can only choose N of these products at each t. For this model we assume that customers buy one unit of product s at an unknown constant rate d_s. This rate is assumed to remain constant, but the actual demand for the product will only be observed at times when the product is in the selection.
To formulate this model we use some Bayesian statistics. If you are not clued up on Bayesian statistics I suggest you take a peek here. In Bayesian statistics we can incorporate prior beliefs or information about the parameters of our model. In this case our prior beliefs about d_s are represented as a Gamma distribution with a shape parameter m_s and scale parameter a_s. Both are assumed to be positive, and m_s is assumed to be an integer. (Since the mean of this distribution is the shape divided by a_s, a_s plays the role that many texts call the rate parameter.) We use the likelihood function of a Poisson distribution for any samples of data we have at a given time. As the Gamma distribution is a conjugate prior for this likelihood, our resulting distribution (the posterior) is also a Gamma distribution, with shape parameter (m_s + n_s) and scale parameter (a_s + 1), where n_s is the number of units of product s sold in a selection period. So, each time a product is selected, its posterior distribution is updated by adding the n_s sales for that selection period to the shape parameter and 1 to the scale parameter.
The intuition is that the shape parameter is the number of units that will be sold in a number of periods equal to the scale parameter, so the expected number of sales from a product in a period is the shape parameter divided by the scale parameter. This can be used to make decisions by choosing the options with the largest expected sales. This model balances exploration and exploitation. If a product has a lot of sales, n_s will be larger, so the shape parameter will be larger and hence so will the expectation.
This means that product will likely be picked again. However, the more times a product is picked, the larger its scale parameter gets. A larger scale parameter reduces the expectation, which lowers the chances of frequently picked options being picked yet again and opens up opportunities to explore other options.
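For anyone who wants to see where the update rule comes from, here is a short worked derivation. It is a sketch under two assumptions that go slightly beyond the description above: that a_s enters the Gamma density as a rate (so the mean is m_s / a_s), and that sales of product s in one selection period are Poisson with mean d_s.

```latex
\begin{align*}
\text{Prior:} \quad
  & p(d_s) \propto d_s^{\,m_s - 1} e^{-a_s d_s} \\
\text{Likelihood of } n_s \text{ sales in one period:} \quad
  & P(n_s \mid d_s) = \frac{d_s^{\,n_s} e^{-d_s}}{n_s!} \\
\text{Posterior:} \quad
  & p(d_s \mid n_s) \propto d_s^{\,m_s + n_s - 1} e^{-(a_s + 1)\, d_s} \\
\text{Posterior mean:} \quad
  & \mathbb{E}[d_s \mid n_s] = \frac{m_s + n_s}{a_s + 1}
\end{align*}
```

The posterior has the same form as the prior, with m_s replaced by m_s + n_s and a_s replaced by a_s + 1, which is exactly the update described above.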
If we simplify the problem so that we have to choose one of a pair of shorts, a skirt or a skort at each time step, our choices may go something like this:
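One way to get a feel for how such a sequence of choices could unfold is to simulate it. The Python sketch below is only an illustration, not the method from the paper: the true demand rates, the prior values and the number of periods are made up, one product is chosen per period, and the choice rule is simply "pick the largest expected sales" as described above.

```python
import math
import random

def poisson_sample(rng, rate):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate(true_rates, prior_shape, prior_rate, periods, seed=0):
    """Pick one product per period by largest posterior expected sales."""
    rng = random.Random(seed)
    shape = {p: prior_shape for p in true_rates}  # m_s, grows with sales
    rate = {p: prior_rate for p in true_rates}    # a_s, grows with each pick
    history = []
    for _ in range(periods):
        # Expected sales per period is shape / rate; ties go to the first product
        choice = max(true_rates, key=lambda p: shape[p] / rate[p])
        sold = poisson_sample(rng, true_rates[choice])
        shape[choice] += sold   # add the n_s observed sales
        rate[choice] += 1       # one more period observed
        history.append((choice, sold))
    return history, shape, rate

# Hypothetical true demand rates, unknown to the decision maker
true_rates = {"shorts": 2.0, "skirt": 4.0, "skort": 3.0}
history, shape, rate = simulate(true_rates, prior_shape=3, prior_rate=1, periods=10)
for period, (choice, sold) in enumerate(history, start=1):
    print(f"t={period}: chose {choice}, sold {sold}")
```

Note that a product’s expected sales only change in periods when it is in the selection, because that is the only time its demand is observed.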
To read the paper that formulated this model, as well as learn more about how maths is used to learn about demand within fast fashion, click here. I hope this blog post gave you a little insight into maths being used in our everyday lives in ways you may not have previously thought about.