The Female Perspective of Computer Science: Street Selection and Self-Organizing Maps

Maps have become an essential part of many of our lives, helping us to get from one place to another for centuries. With the widespread availability of relatively recent digital technologies, such as Google Maps, the need to display maps at different resolutions has never been greater.

Think of the kinds of features that are shown in this map of Canada:

View Larger Map

Not a lot is shown beyond a bit of water and some major political markings. When zooming into Ontario, bodies of water become more clear and major roads are now shown:

View Larger Map

Finally, much more detail, including local roads, is possible when looking at just the city of Ottawa:

View Larger Map

There is obviously some decision making happening behind the scenes in order to display the most important features for a particular level of zoom (or resolution, as it were).

Let's look specifically at road networks, ignoring the other features for this discussion. There are two levels of work to be done when determining what roads should be shown at a particular level of detail. This whole process is known as generalization, since we must decide what subset of all the available information will be shown.

The first step of generalization involves examining the model of the road network and using information about street connections, types, and so on to pick out the most important or relevant pieces to show.

Once we have the data that is to be visualized, the second step analyzes the geometric properties of the chosen roads and uses the results to simplify the graphical rendering of the roads. For example, a road with many small bends that can't be rendered onto the screen as such will be simplified to a straight line instead.

It is in the first step that self-organizing maps (or SOMs) can play a role. These maps are actually neural networks that are used to cluster and visualize data. To make a longish story short (and simplified), you create vectors with as many dimensions as you have numerical properties about each item you want to cluster. Using these numbers, the SOM sorts out the data so that items that are most alike end being "close" together. You might, for example, have a SOM that is represented by a two dimensional grid, and the input vectors will be organized into the cells based on their similarities.

In the case of selecting roads from a network, there are some obvious numerical values that can be used to describe each street for use in the SOM, such as length. There are, however, also some less obvious choices that seem to help categorize roads in a more useful way. A good list of attributes for each road is as follows:

Number of streets that intersect this one
A measure of closeness between this road and all others around it, using shortest path distances
A measure of betweenness, the proportion of shortest paths between all other roads that pass through this one
Length and width (i.e. number of lanes)
A numerical representation of the class of road (i.e. avenue, street, highway, freeway, etc)
Speed limit

Using these properties, roads can be clustered into a SOM. We can then pick a group of similar roads very easily. If we are careful about using the numerical properties so that when the value increases, so does the corresponding importance of the road, we can simply pick the SOM cells with the highest average values for the vectors stored within. These roads will be used in the map's final generalization.

For more information, please have a look at my final project for the GIS class I took in the fall semester. I implemented street selection using SOMs. The file includes my write-up of the project as well as the Java code used. Be sure to look at the README file if you plan to play with it as you will need to do a bit of work to set things up.

Thursday, January 3, 2008

Street Selection and Self-Organizing Maps

0 comments:

Post a Comment