Features

After computing these state variables, we combine and discretize them to define a small number of features, each of which takes on a small number of values. These features define the state space. They are used to index the tables, including $P(s'\vert s,a)$, during the learning phase, and to index $\pi(s)$ for policy exploitation.

We employ the following features:

Target Distance: distance to the target, discretized to 5 intervals.
Target Location Imprecision: measure of the imprecision in the target's location, discretized to 7 intervals.
Landmark Count: average number of landmarks over the six sectors, $\overline{C} = \frac{1}{6} \sum_{s=0}^{5}{N(s)}$, discretized to 4 intervals.
Landmark Imprecision: average, over the six sectors, of the mean imprecision of the landmarks' locations in each sector, $\overline{I} = \frac{1}{6} \sum_{s=0}^{5}{\overline{I}(s)}$, discretized to 7 intervals.
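
As an illustration, the two landmark features can be computed from per-sector values as in the following minimal sketch. The function name `average_over_sectors` and the sample counts are hypothetical and chosen only for this example; the sketch assumes the per-sector quantities $N(s)$ and $\overline{I}(s)$ are already available as sequences of six numbers.

```python
from typing import Sequence

NUM_SECTORS = 6  # the text averages over six sectors, s = 0..5


def average_over_sectors(per_sector: Sequence[float]) -> float:
    """Return (1/6) * sum of the six per-sector values.

    Applies equally to landmark counts N(s) and to per-sector
    mean imprecisions I(s). Hypothetical helper, not the authors' code.
    """
    if len(per_sector) != NUM_SECTORS:
        raise ValueError(f"expected {NUM_SECTORS} sector values")
    return sum(per_sector) / NUM_SECTORS


# Made-up landmark counts for the six sectors.
sample_counts = [2, 0, 1, 3, 0, 0]
mean_count = average_over_sectors(sample_counts)
```

The resulting averages are continuous values; they still have to be discretized (to 4 and 7 intervals, respectively) before they can serve as features.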

This gives a total of $5 \times 7 \times 4 \times 7 = 980$ belief states.
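
One possible way to turn the four discretized features into a single table index is a mixed-radix encoding, sketched below. This is an assumption made for illustration, not a description of the original implementation: the interval boundaries are not given in the text, so `discretize` takes caller-supplied edges, and the names `BINS`, `discretize`, `state_index` and `num_states` are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence

# Number of intervals per feature, in the order listed in the text:
# target distance, target imprecision, landmark count, landmark imprecision.
BINS = (5, 7, 4, 7)


def discretize(value: float, edges: Sequence[float]) -> int:
    """Map a continuous value to an interval index.

    `edges` are the sorted interior boundaries; k edges give k + 1 intervals.
    The boundaries themselves are assumptions supplied by the caller.
    """
    return bisect_right(edges, value)


def state_index(levels: Sequence[int], bins: Sequence[int] = BINS) -> int:
    """Combine per-feature interval indices into one index (mixed radix)."""
    if len(levels) != len(bins):
        raise ValueError("one level per feature is required")
    index = 0
    for level, n in zip(levels, bins):
        if not 0 <= level < n:
            raise ValueError(f"level {level} outside 0..{n - 1}")
        index = index * n + level
    return index


def num_states(bins: Sequence[int] = BINS) -> int:
    """Return the size of the discretized state space."""
    total = 1
    for n in bins:
        total *= n
    return total
```

With this encoding, every combination of feature values maps to a distinct integer between 0 and `num_states() - 1`, which can then index a table row for $\pi(s)$ or $P(s'\vert s,a)$.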