Features

After computing these state variables, we combine and discretize them into a small number of features, each of which takes on a small number of values. These features define the state space. During the learning phase they index the tables $P(s'\vert s,a)$, $R(s,a,s')$ and $V(s)$, and during policy exploitation they index $\pi(s)$.

We employ the following features:

This gives a total of 980 belief states.
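As a rough illustration of how discretized features can index such tables, the sketch below maps a tuple of raw values to a single state index using a mixed-radix encoding. The feature names, thresholds and cardinalities are hypothetical, chosen only for the example; they are not the features used here and do not produce the 980 states mentioned above. The point is only that the number of states is the product of the feature cardinalities.

```python
import bisect
from math import prod

# Hypothetical features (name, bin thresholds). These are illustrative
# assumptions, not the features employed in the text.
FEATURES = [
    ("distance", [1.0, 5.0]),        # 3 discrete values
    ("angle", [-0.5, 0.0, 0.5]),     # 4 discrete values
    ("visible", [0.5]),              # 2 discrete values
]

# Each feature with k thresholds takes k + 1 discrete values.
CARDINALITIES = [len(thresholds) + 1 for _, thresholds in FEATURES]

# The size of the state space is the product of the cardinalities.
NUM_STATES = prod(CARDINALITIES)


def discretize(value, thresholds):
    """Return the index of the bin that `value` falls into."""
    return bisect.bisect_right(thresholds, value)


def state_index(raw_values):
    """Combine the discretized features into one table index (mixed radix)."""
    index = 0
    for value, (_, thresholds), card in zip(raw_values, FEATURES, CARDINALITIES):
        index = index * card + discretize(value, thresholds)
    return index


# Tables indexed by state, as V(s) and pi(s) would be.
V = [0.0] * NUM_STATES
policy = [None] * NUM_STATES

s = state_index([3.2, 0.1, 1.0])
V[s] = 1.0  # example write into the value table
```

With a flat index of this kind, $V(s)$ and $\pi(s)$ become plain array lookups, and $P(s'\vert s,a)$ and $R(s,a,s')$ can be stored as arrays indexed by the state index, the action, and the successor state index.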



© 2003 Dídac Busquets