“So, what do you actually do?” – The question that many postgraduate students are fed up of hearing. If, like me, you hail from a maths-averse family, you will be familiar with trying to explain what it is you actually do all day to your nearest and dearest. So, what is the answer? Well, in truth, a lot of tea drinking and biscuit eating, marvelling at lecturers’ cats over Teams calls and frequent discussions about Bake Off. But that’s not all we do.
In fact, entering the fields of statistics and operational research from my background in physics, I would be lying if I said that I had known exactly what to expect myself. My undergraduate degree is in Theoretical Physics, the key part here being theoretical – to be honest, I had very little idea what research I would be a part of when I joined STOR-i.
I plan to write a series of blog posts that cover some cool and unexpected areas of research and uses for statistics and operational research that you may not know about. Today’s blog post is the first of said posts, and we are going to be considering the question: what is the most likely path that a particle would take between two points on the surface of the ocean?
This research question was considered in a recent talk given to the MRes students by Dr Adam M. Sykulski, and covered the work of his PhD student Michael O’Malley. I’ll be using some of the figures from the presentation in this post, so be sure to check out the paper for more reading.
Let’s get started…
So, what do you suppose the most likely path is from point A to point B in the ocean? Perhaps you suppose that the most likely path is along a geodesic line (geodesics describe the shortest route between two points on the Earth’s surface, see here for more information); a perfectly reasonable guess. Or maybe you think that the most likely path would follow some ocean currents.
The Data …
The Global Drifter Program consists of approximately 1300 satellite-tracked drifting buoys, where the buoys have been designed to mimic the motion of a water particle in the ocean. This provides an ideal data set in order to carry out research on the most likely path!
But finding the most likely path between two specific points is not as simple as monitoring a buoy’s movement from one point to the other – in fact, it is unlikely that any single buoy will have travelled through both of those points. Instead, the ocean is discretised into spatial bins (this is just a fancy way of saying chopped up into small sections).
Most standard approaches use latitude and longitude binning to tessellate the globe (i.e., a one degree by one degree grid is taken). While this is an intuitive idea, there are problems. Because the Earth is (approximately) a sphere, one degree by one degree bins do not all have the same area – the bins near the poles are much smaller than those near the equator. In addition, in a grid of squares, diagonal neighbours share only a single vertex and no edge, while side neighbours share a full edge, so not all neighbours are treated equally.
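To get a feel for how much the bin area changes with latitude, here is a minimal sketch. It assumes a perfectly spherical Earth with a mean radius of 6371 km (an approximation), and the function name bin_area_km2 is my own, purely for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def bin_area_km2(lat_south_deg, size_deg=1.0):
    """Area of a size_deg x size_deg latitude-longitude bin on a sphere.

    lat_south_deg is the latitude of the bin's southern edge, in degrees.
    """
    lat1 = math.radians(lat_south_deg)
    lat2 = math.radians(lat_south_deg + size_deg)
    dlon = math.radians(size_deg)
    # Exact area of a latitude-longitude "rectangle" on a sphere.
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat2) - math.sin(lat1))


if __name__ == "__main__":
    for lat in (0, 30, 60, 80):
        print(f"southern edge at {lat:2d} degrees: {bin_area_km2(lat):8.0f} km^2")
```

Running this shows the area shrinking steadily as the bins move towards the pole, with the bin at 80 degrees coming out several times smaller than the one at the equator.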
O’Malley and Sykulski proposed a new method of tessellating the globe using Uber’s H3 index, where the globe is tessellated into hexagons, together with a small number of pentagons. This helps to eliminate some of the problems experienced when using standard degree binning, as now each hexagon (or pentagon) is approximately equal in area. In addition, each polygon now shares an edge and two vertices with each of its neighbours. It is then possible to use this tessellation to form Markov transition matrices, based on which cell each drifter is in from one time step to the next.
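As a rough illustration of that last step, the sketch below estimates transition probabilities by simple counting. It assumes the drifter positions have already been converted into cell labels, one per time step; the labels "a", "b" and "c", the toy trajectories and the function name are all made up for this example, and this is not necessarily how the paper implements it.

```python
from collections import Counter, defaultdict


def estimate_transition_probs(trajectories):
    """Estimate one-step transition probabilities from cell sequences.

    trajectories is a list of lists; each inner list holds the cell a
    drifter occupied at consecutive time steps.
    """
    counts = defaultdict(Counter)
    for cells in trajectories:
        # Count every observed move from one time step to the next.
        for origin, destination in zip(cells, cells[1:]):
            counts[origin][destination] += 1

    probs = {}
    for origin, row in counts.items():
        total = sum(row.values())
        # Normalise each row so that its probabilities sum to one.
        probs[origin] = {dest: n / total for dest, n in row.items()}
    return probs


# Hypothetical cell labels standing in for real tessellation cells.
toy_trajectories = [
    ["a", "a", "b", "c"],
    ["a", "b", "b", "c"],
    ["b", "c", "c"],
]
transition_probs = estimate_transition_probs(toy_trajectories)

if __name__ == "__main__":
    for origin, row in sorted(transition_probs.items()):
        print(origin, row)
```

Storing the result as a dictionary of dictionaries keeps only the transitions that were actually observed, which is convenient when most pairs of cells are never connected in a single time step.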
The Maths …
Imagine that we have a system where, at each positive integer time point, the system transitions into a new state. Such a system is Markov if the next state depends only on the current state, and not on how the system got there; this is commonly referred to as the lack of memory property.
If, in addition, the transition probabilities do not change over time, the system is what we call a time-homogeneous Markov chain. We can describe the stochastic dynamics of such a process using transition probabilities p_ij, which give the probability of moving from state i to state j in one time step. It is then possible to write a square matrix of one-step transition probabilities, with p_ij in row i and column j, so that p_01 denotes the probability of transitioning from state 0 to state 1 and so on. Each row of this matrix sums to one. This is what we call the Markov transition matrix.
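To make this concrete, here is a small sketch with a made-up three-state transition matrix (the numbers are invented for illustration only). It checks that each row sums to one, and uses the standard fact that multiplying the matrix by itself gives the two-step transition probabilities.

```python
# Hypothetical one-step transition matrix: P[i][j] is the probability of
# moving from state i to state j in one time step.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.75, 0.25],
    [0.0, 0.0, 1.0],
]


def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Each row is a probability distribution over the next state.
row_sums = [sum(row) for row in P]

# Two-step transition probabilities: P2[i][j] is the probability of being
# in state j two time steps after being in state i.
P2 = mat_mul(P, P)

if __name__ == "__main__":
    print("row sums:", row_sums)
    print("two-step probability of going from state 0 to state 2:", P2[0][2])
```

In this toy chain it is impossible to get from state 0 to state 2 in one step, but it becomes possible in two steps by passing through state 1.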
Back to our ocean path example. Perhaps you have noticed that in order to use the tessellation of the globe to form Markov transition matrices based on drifter locations from one time step to the next, we must choose an appropriate time step. This time step should be chosen so that the drifter motions obey the lack of memory property that we talked about before. In their research, O’Malley and Sykulski set this to 5 days.
If we now consider the points A and B as belonging to two different states, it is possible to then apply a path finding algorithm in order to find the most likely path between points. O’Malley and Sykulski’s research showed some interesting results! The most likely path from point 1 to point 2 seems to be travelling along the South Equatorial Current, which might have been what you would’ve expected after seeing the current map at the beginning of this post. Interestingly though, the path between 2 and 1 hugs the coastline quite tightly, which is due to the Equatorial Counter Current. This behaviour might not have been expected from the current map!
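For the curious, one standard way to find a most likely path through a Markov chain is to treat the states as a graph, give each transition a cost of minus the log of its probability, and run a shortest path algorithm such as Dijkstra’s. The sketch below does this on a tiny invented chain; the states "A", "X", "Y" and "B" and their probabilities are made up, and this is an illustration of the general idea rather than a reproduction of the code used in the paper.

```python
import heapq
import math


def most_likely_path(transition_probs, start, goal):
    """Most probable sequence of states from start to goal.

    Maximising a product of probabilities is the same as minimising the
    sum of -log(probability), so Dijkstra's algorithm applies.
    Returns (path, probability), or (None, 0.0) if goal is unreachable.
    """
    best_cost = {start: 0.0}
    previous = {}
    visited = set()
    queue = [(0.0, start)]
    while queue:
        cost, state = heapq.heappop(queue)
        if state in visited:
            continue
        visited.add(state)
        if state == goal:
            break
        for nxt, p in transition_probs.get(state, {}).items():
            if p <= 0.0 or nxt in visited:
                continue
            new_cost = cost - math.log(p)
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                previous[nxt] = state
                heapq.heappush(queue, (new_cost, nxt))
    if goal not in best_cost:
        return None, 0.0
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, math.exp(-best_cost[goal])


# Hypothetical four-state chain; each row sums to one.
toy_probs = {
    "A": {"A": 0.2, "X": 0.5, "Y": 0.3},
    "X": {"X": 0.6, "B": 0.1, "A": 0.3},
    "Y": {"Y": 0.1, "B": 0.9},
    "B": {"B": 0.5, "X": 0.5},
}

if __name__ == "__main__":
    print(most_likely_path(toy_probs, "A", "B"))
    print(most_likely_path(toy_probs, "B", "A"))
```

Even in this toy example the most likely route from A to B passes through Y, while the most likely route back from B to A passes through X, so the outward and return paths need not match.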
As you can see, the researchers at STOR-i are actually getting up to some pretty interesting things, alongside the regular cat marvelling and biscuit eating. In fact, this is not the only example of the tools of statistics and operational research being applied to oceanography at STOR-i, and oceanography is just one of many, many application areas!
I hope you enjoyed this first instalment on my blog; be sure to check out the paper and further reading below if you found this topic interesting, and don’t forget to check back on the blog over the coming weeks!
Further reading
Estimating the travel time and the most likely path from Lagrangian drifters – Michael O’Malley, Adam M. Sykulski, Romuald Laso-Jadart, Mohammed-Amin Madoui
Introduction to Markov Chains – Towards Data Science