Topological mapping

In comparison to grid-based representations, topological representations (such as those proposed by Chatila [17], Kuipers and Byun [38], Mataric [47] and Kortenkamp [36], among others) are computationally cheaper. They use graphs to represent the environment. Each node corresponds to an environment feature or landmark and arcs represent paths or motion instructions between them. Some approaches (Kuipers [38], Kortencamp and Weymouth [35]) also define the nodes as ``places'', where a ``place'' is defined as a location where a set of features or landmarks fulfill a given property (e.g. sonar readings matching, landmark visibility, etc.).

With this graph, the problem of navigation is reduced to the problem of finding a route from one node to another - the target one. This can be easily computed with many graph search algorithms (Dijkstra's shortest path, A*, dynamic programming). However, this simplicity of computing routes has the disadvantage that the routes are not always the optimal ones, since there is not an accurate geometric description of the environment, and path planning algorithms for metric worlds cannot be applied. Moreover, in topological graphs there is no explicit representation of the obstacles, as in a metric map. Therefore, when moving from one node to another, there is no way of planning an optimal path, since there may be some obstacles on the way.

The advantage of topological approaches is that they do not rely on odometry in order to build the map nor localize the robot on it. The only point in which odometry is sometimes used is to label the arcs between nodes. As already mentioned, the arcs contain information about how to get from one node to another. This information can be, depending on the approach, metric information (heading and distance to the next node). If this were the case, the odometry error would influence the precision of this information. However, since neighboring nodes are close to each other, this error is bounded and does not accumulate as the robot navigates through the environment.

The drawback of not using metric information is that topological approaches have difficulties in determining if two places that look similar are the same place, since they compute the position of the robot relative to the known landmarks. This problem can be tackled if a robust enough landmark recognition system is in place. Landmark recognition is a very active field of research in vision and very promising results are being obtained [46]. In this work we assume that the vision system can recognize landmarks. However, in the absence of a robust recognition system, a probabilistic approach, similar to the one described for metric maps, could be applied.

Topological approaches can also be combined with grid-based approaches. Thrun [66] combines both representations in his work on learning maps for navigation in indoor structured environments. The grid-based map is partitioned in coherent regions to generate a topological map on top of the grid. By combining both methods, his approach gains the advantages of both methods, resulting in an accurate, consistent and efficient mapping approach. This is indeed a good idea for indoor environments but for large-scale outdoor environments may not be worth the computational effort of maintaining a grid representation under a topological one.

In our work we use the approach where nodes represent regions defined by groups of three landmarks and that are connected by arcs if the regions are adjacent, that is, if they have two landmarks in common. The arcs, instead of containing motion information, represent the cost of going from one region to another. This graph is incrementally built while the robot is moving within the environment. This incremental map building approach is based on previous work by Prescott [55] that proposed a network model that used barycentric coordinates, also called beta-coefficients by Zipser [68], to compute the spatial relations between landmarks for robot navigation. By matching a perceived landmark with the network, the robot can find its way to a target provided it is represented in the network. Prescott's approach is quantitative whereas our approach uses a fuzzy extension of the beta-coefficient coding system in order to work with fuzzy qualitative information about distances and directions. Another difference with Prescott's approach is that his topological graph contains only adjacency information, thus, not maintaining any information about costs, as in ours. This cost information is very important when planning routes from one region to another, since it is the only way to know whether a path is blocked or free. One final point to mention is that in Prescott's experiments, carried out only on simulation, the robot was allowed a training period, while this period is not present in our approach.

Levitt and Lawton [39] also proposed a qualitative approach to the navigation problem. In this approach, landmark pairs divide the environment into two regions, one for each side of the line connecting the two landmarks. The combination of all such linear divisions generates a topological division of the environment, on which navigation can be performed. Navigation consists of crossing a series of landmark pairs in order to reach the region containing the target landmark. Our navigation method uses the same idea for computing and navigating to diverting targets. The difference between this approach and ours is that we use three landmarks for creating the region subdivision, instead of only two. This gives as result a better and more compact division of the environment. Moreover, this third landmark permits the robot to compute a relationship among the landmarks that is unique and invariant to viewpoint.

Another qualitative method for robot navigation was proposed by Escrig and Toledo [24], using constraint logic. However, they assume that the robot has some a priori knowledge of the spatial relationship of the landmarks, whereas our system builds these relationships while exploring the environment.

One of the drawbacks of most of the mapping approaches is that they are thought for static environments. That is, landmarks are not supposed to change their location while the robot is exploring the environment. Thus, research on vision systems capable of extracting robust (distinguishable, invariant to viewpoint and illumination, static) landmarks is very important. However, some mapping approaches are already able to cope with dynamic environments. In [1] landmarks have an existence state (using the principles of neural networks). This mechanism permits the removal of landmarks for which their existence is not certain enough. We have used a similar idea to devise a Visual Memory (see chapter 4), a short term memory of detected landmarks.