A Lesson in Simplicity from Dijkstra
23th March 2016
In my last blog post Networking is Everything (Nearly), I discussed the basic ideas of networks based on
the Masterclass we were given by Professor Pitu Mirchandani, from
the School of Computing, Informatics, and Decision Systems Engineering at Arizona State University. I mentioned in that post was the
Shortest Path Problem (SPP), in which you would like to get from one node, $v_s$, to the destination node, $v_t$, with the minimum cost.
This cost could be allsorts, ranging from the minimum price to the minimum distance, to the shortest amount of time. It simply depends on
what you choose to your arc weights to be. Professor Pitu Mirchandani returned on Monday in order to "finish off" the Masterclass. The
purpose of this blog post will be to talk about what he said about the SPP in the lectures on Monday.
As I spoke about last time, the SPP can be formulated as a linear integer program. Great! Now we can just use the standard algorithms
such as simplex followed by branch and bound to solve the problem in a way that is well known and loved! Except that this actually gets
really slow as the size of the problems increase, i.e. as the network gets more nodes and more arcs. This is largely because the constraint
matrix has and awful lot of zeros in it and gets very large. For a person doing algebra, this is brilliant. For a computer, less so. The
zeros take up memory and serve little or no other purpose, slowing the whole process down. The problem is the way that the data is
represented, which is known as the data structure. The challenge of those in optimisation is what data structure to use in order that only
the necessary data has to be stored in the computer's memory. The same idea applies here.
Fortunately, a Dutch computer scientist called Edsger Wybe Dijkstra came up with a dynamic programming algorithm to help solve the SPP.
Suppose that we have a network with $n$ nodes and $m$ arcs. Let $A(v_i)$ be the set of arcs that are leave node $v_i$, so that
$$\sum_{i=1}^n |A(v_i)|=m.$$ We consider two sets, $S$, which contains all the nodes $v_j$ that we know the shortest distance, and $\bar{S}$
that contains all the other nodes. Initially, the only element of $S$ is $v_s$. Let $d(v_j)$ be the current belief about the distance to
node $v_j$ from $v_s$. Now, the idea behind Dijkstra’s algorithm is the look at all the nodes in $\bar{S}$ that are connected to nodes in
$S$ and choose the one that has the smallest value of $d(v_j)$ to bring into $S$:
- 1. Begin with $S=\{v_s\}$, all other arcs in $\bar{S}$, $d(v_s)=0$ and $d(v_i)=\infty$ for all other arcs and let $v_s$ be the current node
- 2. Update the distances of all nodes connected to the current node, $v_i$, using: $$d(v_j) = \min\{d(v_j),d(v_i)+c_{ij}, \text{ such that }v_j\in \bar{S}\text{ and }[i,j]\in A(v_i)\}$$
- 3. Select the node $v_k$ in $\bar{S}$ that has the smallest value of $d(v_j)$
- 4. Move $v_k$ from $\bar{S}$ to $S$
- 5. If $|S|=n$, stop, otherwise set $v_k$ to be the current node and return to step 2
At each iteration, the node that is closest to the set $S$ is chosen. Now, this actually calculates the shortest path from $v_s$ to all
other nodes in the network, but this is helps the algorithm to remain simple. Only the information about the distance to the nodes are
stored, rather than the whole matrix which reduces the storage amount required dramatically.
Now we consider how the algorithm performs in terms of its computational form. If one looks at the worst case scenario (which is often how
algorithmic performance is measured to see how it works on the most challenging examples), then we are considering a complete graph. A complete
graph is one in which every node has an arc leaving it that goes to every other node. So if there are $n$ nodes, there are $n^2$ arcs. Now, the
first time you do step 3, you have to check $(n-1)$ nodes. In the next, you have to check $(n-2)$ nodes, $(n-3)$ in the next,…. Therefore the
total number of nodes that need checking will be of the order $O(n^2)$.
Each time one does step 2 to update the distances, there are at most $|A(v_i)|$ nodes to update. I say at most because some of the other nodes
may be in $S$ so do not need to be updated. The algorithm will do step 2 up to $n$ times, in which case it will be done for every node, so the
total number of checks is at most $\sum_{i=1}^n |A(i)|=m$ which is of order $O(n^2)$. So, the total number of operations made by Dijkstra’s algorithm
is of order:
$$O(n^2) + O(n^2) = O(n^2).$$
A simple example of a complete network!
This is quite a good algorithm in terms of its efficiency and so much better than the linear integer programming methods. By being slightly more
intelligent about the order in which nodes are checked, even better performance can be achieved. It has taken advantage of the structure of the network to get
rid of all the unwanted information. An algorithm that does this is known as a network flow algorithms. Thankfully, many of these exist to help
make problems much easier to solve without having to resort to integer programming! The main thing I got out of the Masterclass was to make
sure that when I have to solve a problem, I think about how to use the information avaiable as effectively as possible.
References
[1] Dijkstra's Algorithm, Mellisa Yan, http://math.mit.edu/~rothvoss/18.304.3PM/Presentations/1-Melissa.pdf
[2] Network Flows: theory, algorithms, and applications, Ravindra K. Ahuja; Thomas L. Magnanti; James B. Orlin, Prentice Hall.