An example 3D probability distribution surface

In our case, the 3D probability distribution surface is represented by a matrix (table) in which each value is the height of the surface at that point. You can think of this distribution as a grayscale image where the gray value of each pixel represents the height of the point. We use a Local Hill Climbing type algorithm with 8-connected neighbors.
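As a minimal sketch of this representation, assuming the matrix is stored as a plain list of rows, the 8-connected neighbors of a cell can be enumerated as follows. The names `OFFSETS` and `neighbors8` are made up for illustration.

```python
# 8-connected neighbor offsets: every combination of -1/0/+1 except (0, 0).
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def neighbors8(grid, r, c):
    """Yield the (row, col) of every in-bounds 8-connected neighbor of (r, c)."""
    rows, cols = len(grid), len(grid[0])
    for dr, dc in OFFSETS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


# A tiny 3x3 "surface": each value is the height at that point.
grid = [
    [0.0, 0.1, 0.0],
    [0.1, 0.6, 0.1],
    [0.0, 0.1, 0.0],
]

print(len(list(neighbors8(grid, 1, 1))))  # interior cell: 8 neighbors
print(sorted(neighbors8(grid, 0, 0)))     # corner cell: only 3 neighbors
```

Cells on the border simply have fewer neighbors, so no padding of the matrix is needed.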

**1. Down sample the distribution**

If the distribution map is very large, it might be a good idea to downsample it to speed up the algorithm. We assume the surface is noise-free. If the surface is noisy, we can also smooth it with a Gaussian filter first (think image processing).
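The post does not say how to downsample, so the following is only one possible sketch: it averages non-overlapping square blocks, which also smooths the surface slightly. The function name `downsample` and the `factor` parameter are assumptions made for this example.

```python
def downsample(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D list.

    Blocks on the right and bottom edges may be smaller when the
    dimensions are not a multiple of factor.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            block = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols))
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


surface = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
print(downsample(surface, 2))  # [[1.0, 2.0], [3.0, 4.0]]
```

Keep in mind that aggressive downsampling can merge nearby peaks, so the mode count on the reduced map may differ from the count on the original.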

**2. Check for a uniform distribution (a flat surface)**

It is a good idea to first check whether the probability distribution is uniform. Just check whether all values in the matrix are identical. If the distribution is uniform, we know it has no modes and we are done.
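This check is a one-liner in most languages. A minimal sketch follows. The optional `tol` parameter is an addition for floating-point data and is not part of the original description.

```python
def is_uniform(grid, tol=0.0):
    """Return True if every value is within tol of the first value."""
    first = grid[0][0]
    return all(abs(v - first) <= tol for row in grid for v in row)


flat = [[0.25, 0.25], [0.25, 0.25]]
bumpy = [[0.25, 0.25], [0.25, 0.30]]
print(is_uniform(flat))   # True
print(is_uniform(bumpy))  # False
```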

**3. Local Hill Climbing with Memory**

Start from a point on the surface and check its 8-connected neighbors. As soon as a neighbor with the same or a better value is found, we "climb" to that point. The process is repeated until we reach a point (hilltop) where all the neighbors we check have smaller values. As we "climb" and check neighbors, we mark all the points we visit along the way, and when we check neighbors, we only check points we have not visited before. This way we avoid finding a mode we have already found. Once we find a "mode", we can start from another unvisited point on the surface and do another Local Hill Climbing. I put quotes around the word mode here because we are not yet sure that the "mode" we found is a real mode.
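A single climb can be sketched as a loop rather than a recursive function. This is a minimal sketch under one assumption the text leaves open: only the points we actually step onto are marked as visited. The names `climb`, `visited`, and `OFFSETS` are made up for the example.

```python
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def climb(grid, start, visited):
    """Climb from start until no unvisited neighbor is equal or higher.

    Marks every point stepped onto in `visited` and returns the (row, col)
    where the climb stops. That point is only a candidate "mode".
    """
    rows, cols = len(grid), len(grid[0])
    r, c = start
    visited[r][c] = True
    while True:
        for dr, dc in OFFSETS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not visited[nr][nc]
                    and grid[nr][nc] >= grid[r][c]):
                # Take the first acceptable neighbor and keep climbing.
                r, c = nr, nc
                visited[r][c] = True
                break
        else:
            # No unvisited neighbor with the same or a better value.
            return r, c


grid = [
    [0, 1, 0, 0, 0],
    [1, 3, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 2, 1],
    [0, 0, 0, 1, 0],
]
visited = [[False] * len(grid[0]) for _ in grid]
print(climb(grid, (0, 0), visited))  # (1, 1)
print(climb(grid, (4, 4), visited))  # (3, 3)
```

Because the same `visited` map is shared across climbs, a later climb never walks back over points an earlier climb already used.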

**4. Make sure the "mode" we found is a real mode**

An Even-Height Great Wall

Because we also climb to neighbors with the same value, a climb can end on a flat stretch that belongs to a mode we have already found. In the Even-Height Great Wall, every point on the wall has the same height, and the whole wall is a single mode. Therefore, we need to keep track of all the points leading to the final "mode" point that have identical values and check all the visited neighbors of these points, making sure this flat surface is not part of a previously found mode. If these points make up a real new mode, we mark them with a unique mode count id (e.g., mode 3). If they are only part of a previously found mode, we mark them as such (e.g., mode 2). If one of them is right next to a previously found mode but has a lower value, we mark these points as non-mode points. This step is almost like performing a Connected-Component Labeling operation in Computer Vision.
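The Connected-Component Labeling analogy can be made concrete with a simplified stand-in for this step. The sketch below does not reproduce the climb-then-verify bookkeeping described here. Instead it flood-fills each equal-height plateau and counts the plateau as a mode only if no point next to it is higher. It uses an explicit stack rather than recursion, and the name `label_modes` is made up for the example.

```python
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def label_modes(grid):
    """Return (mode_count, labels).

    labels[r][c] is the mode id (1, 2, ...) or 0 for non-mode points.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    first = grid[0][0]
    if all(v == first for row in grid for v in row):
        return 0, labels  # flat surface: treated as having no mode (step 2)

    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            height = grid[r0][c0]
            plateau, stack = [], [(r0, c0)]
            seen[r0][c0] = True
            is_mode = True
            while stack:  # flood fill over 8-connected equal-height points
                r, c = stack.pop()
                plateau.append((r, c))
                for dr, dc in OFFSETS:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if grid[nr][nc] > height:
                        is_mode = False  # a higher neighbor: not a hilltop
                    elif grid[nr][nc] == height and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if is_mode:
                count += 1
                for r, c in plateau:
                    labels[r][c] = count
    return count, labels


# A small ring-shaped "wall" of even height: one mode, not eight.
wall = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
count, labels = label_modes(wall)
print(count)  # 1
for row in labels:
    print(row)
```

This version visits every point once, so it gives up the early exits of hill climbing in exchange for simpler bookkeeping.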

At the end of the algorithm run, we will have a count of how many modes the probability distribution has and also a map with all the mode points marked. With the Even-Height Great Wall distribution, the map would look just like the image (white pixels marking mode points) with 1 mode. For the example 3D surface above, the algorithm identifies the 4 modes within milliseconds.

That's it! If you ever need to do this for your projects, you now know how!

Recursive functions work great for local hill climbing until you get a stack overflow.