Tutorial: Building a Domain

OO-MDPs

In the classic MDP formalism, each state is described only by its identity. The cell in the bottom left corner of the grid world would be state "0" and the one above it might be state "11." This is known as a flat state representation because the states carry no information other than their identity. Although many planning and learning algorithms work just fine with flat representations, a flat state representation makes defining transition dynamics and reward functions inconvenient. In fact, when we described the grid world in the previous section, we used words regarding spatial adjacency and direction to explain it. It would be nice to define the states, transitions, etc. using the same concepts. For these reasons (and others), it is often much easier to use a factored state representation, whose structure can be exploited when defining the MDP transition dynamics and other properties.

A classic way to define a factored state representation is with a set of state variables or attributes. In our grid world, for example, we would define the state by an x-position attribute and a y-position attribute. The bottom left cell of the world would be state (0, 0); the cell directly above it would be (0, 1); and so on.
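To make the contrast concrete, the following is a minimal plain-Java sketch, not BURLAP code. It assumes an 11x11 grid with no interior walls (so that the cell above flat state 0 is flat state 11) and a row-by-row numbering from the bottom left; the class and method names are hypothetical.

```java
/** Hypothetical helper, not part of BURLAP: flat vs. factored grid states. */
final class GridStateSketch {
    // Assumed 11x11 grid, so the cell above flat state 0 is flat state 11.
    static final int WIDTH = 11;
    static final int HEIGHT = 11;

    /** Factored (x, y) to flat identity, numbering cells row by row from the bottom left. */
    static int toFlat(int x, int y) {
        return y * WIDTH + x;
    }

    /** Flat identity back to factored {x, y}. */
    static int[] toFactored(int id) {
        return new int[] { id % WIDTH, id / WIDTH };
    }

    /** "North" on a factored state: increase y unless already in the top row. */
    static int[] north(int x, int y) {
        return new int[] { x, Math.min(y + 1, HEIGHT - 1) };
    }

    /** The same move on a flat state needs the grid width baked into the arithmetic. */
    static int northFlat(int id) {
        int next = id + WIDTH;
        return next < WIDTH * HEIGHT ? next : id;
    }

    public static void main(String[] args) {
        System.out.println(toFlat(0, 0)); // 0
        System.out.println(toFlat(0, 1)); // 11
        int[] moved = north(0, 0);
        System.out.println(moved[0] + "," + moved[1]); // 0,1
    }
}
```

With the factored form, "north" reads directly as a change to the y attribute, whereas the flat form only works if the numbering scheme is known.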

The factored representation that BURLAP uses is the object-oriented MDP (OO-MDP), which represents states by a set of objects rather than by a set of attributes. Each object belongs to an object class, and each object class has an associated set of attributes. Each attribute can be of a different type with its own value domain. An object in a state is simply a value assignment to its class's attributes. In our grid world, we can define an "agent" class that has two integer attributes, x and y, whose value domains span the width and height of the grid world, respectively. In this definition, a state would contain an object instance belonging to the agent class with a value assignment specifying the agent's x and y position.
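The relationship between object classes, attributes, objects, and states can be sketched in plain Java as follows. This is only an illustration of the concepts: none of these types are BURLAP's actual classes, every name ending in "Sketch" is hypothetical, and the 11x11 grid size is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical: an integer attribute with an inclusive value domain. */
final class IntAttributeSketch {
    final String name;
    final int min;
    final int max;

    IntAttributeSketch(String name, int min, int max) {
        this.name = name;
        this.min = min;
        this.max = max;
    }
}

/** Hypothetical: an object class is a name plus its set of attributes. */
final class ObjectClassSketch {
    final String name;
    final Map<String, IntAttributeSketch> attributes = new LinkedHashMap<>();

    ObjectClassSketch(String name, IntAttributeSketch... attrs) {
        this.name = name;
        for (IntAttributeSketch a : attrs) {
            attributes.put(a.name, a);
        }
    }
}

/** Hypothetical: an object is a value assignment to its class's attributes. */
final class ObjectInstanceSketch {
    final ObjectClassSketch objectClass;
    final String name;
    private final Map<String, Integer> values = new LinkedHashMap<>();

    ObjectInstanceSketch(ObjectClassSketch objectClass, String name) {
        this.objectClass = objectClass;
        this.name = name;
    }

    /** Assigns a value, rejecting unknown attributes and out-of-domain values. */
    ObjectInstanceSketch set(String attribute, int value) {
        IntAttributeSketch a = objectClass.attributes.get(attribute);
        if (a == null) {
            throw new IllegalArgumentException("Unknown attribute: " + attribute);
        }
        if (value < a.min || value > a.max) {
            throw new IllegalArgumentException(attribute + " out of domain: " + value);
        }
        values.put(attribute, value);
        return this;
    }

    int get(String attribute) {
        Integer v = values.get(attribute);
        if (v == null) {
            throw new IllegalStateException("Unset attribute: " + attribute);
        }
        return v;
    }
}

final class OOStateSketchDemo {
    /** A state is a list of objects; here, a single agent in an assumed 11x11 grid. */
    static List<ObjectInstanceSketch> agentState(int x, int y) {
        ObjectClassSketch agentClass = new ObjectClassSketch("agent",
                new IntAttributeSketch("x", 0, 10),
                new IntAttributeSketch("y", 0, 10));
        List<ObjectInstanceSketch> state = new ArrayList<>();
        state.add(new ObjectInstanceSketch(agentClass, "agent0").set("x", x).set("y", y));
        return state;
    }

    public static void main(String[] args) {
        ObjectInstanceSketch agent = agentState(0, 0).get(0);
        System.out.println(agent.objectClass.name + " at " + agent.get("x") + "," + agent.get("y"));
    }
}
```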

Although grid worlds are simple enough to describe without using an OO-MDP representation, there are a number of reasons why the OO-MDP representation is useful. For example, it's trivial to define transition dynamics that create new objects in the world or remove them, merely by adding objects to or removing them from the list of objects present in a state. If there are multiple objects belonging to the same class, states can also be defined so that they are invariant to the identifiers or order of the objects in the state. For instance, consider a state (s0) with two block objects defined by 2D spatial positions. Now imagine a new state (s1) that is the result of swapping the positions of the block objects, as shown in the image below.

Even though the object identifiers associated with the blocks differ between s0 and s1, the states are isomorphic (that is, if block0 were renamed to block1 and block1 to block0, the states would be the same); therefore, from the perspective of a decision-making algorithm, it may be useful to treat the states as equal rather than as distinct states that each require independent computation and reasoning. In the OO-MDP paradigm it is possible to treat these states as equal, and BURLAP will do that automatically (unless otherwise specified)!
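One way to picture identifier-invariant equality is to compare the multiset of attribute values while ignoring object names. The plain-Java sketch below illustrates that idea only; it is not how BURLAP implements state equality, and the class name, method names, and block coordinates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical state holding only "block" objects: object name -> {x, y}. */
final class BlockStateSketch {
    private final Map<String, int[]> blocks = new LinkedHashMap<>();

    BlockStateSketch put(String name, int x, int y) {
        blocks.put(name, new int[] { x, y });
        return this;
    }

    /** Identifier-dependent equality: each named block must match exactly. */
    boolean sameWithNames(BlockStateSketch other) {
        if (!blocks.keySet().equals(other.blocks.keySet())) {
            return false;
        }
        for (Map.Entry<String, int[]> e : blocks.entrySet()) {
            if (!Arrays.equals(e.getValue(), other.blocks.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Identifier-invariant equality: compare sorted positions, ignoring names. */
    boolean sameIgnoringNames(BlockStateSketch other) {
        return canonical().equals(other.canonical());
    }

    /** Sorted list of "x,y" strings, which is independent of names and insertion order. */
    private List<String> canonical() {
        List<String> keys = new ArrayList<>();
        for (int[] p : blocks.values()) {
            keys.add(p[0] + "," + p[1]);
        }
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        BlockStateSketch s0 = new BlockStateSketch().put("block0", 1, 1).put("block1", 3, 2);
        BlockStateSketch s1 = new BlockStateSketch().put("block0", 3, 2).put("block1", 1, 1);
        System.out.println(s0.sameWithNames(s1));     // false
        System.out.println(s0.sameIgnoringNames(s1)); // true
    }
}
```

Sorting a canonical key per object keeps the comparison simple for a single object class; a state with several classes would need the class name included in each key.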

Another advantage to the OO-MDP paradigm is that it leverages the object-oriented nature to provide additional high-level state features in the form of propositional functions that operate on objects in the world. In our grid world, we can introduce an additional object class for location objects (similarly defined by x,y position attributes) and then define a propositional function called "at" that operates on the agent object and a location object and evaluates to true when they are in the same location. Including propositional functions is useful for bridging the gap between MDPs and more classic AI approaches that are based on logical representations. In this tutorial we will implement the "at" propositional function in our grid world to demonstrate how to create them.
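Before turning to BURLAP's own classes, the idea behind "at" can be sketched in plain Java. The types below are hypothetical stand-ins and not BURLAP's propositional function API; the sketch assumes both the agent and the location carry integer x and y values.

```java
/** Hypothetical object with a class name, an identifier, and an x,y position. */
final class GridObjectSketch {
    final String className;
    final String name;
    final int x;
    final int y;

    GridObjectSketch(String className, String name, int x, int y) {
        this.className = className;
        this.name = name;
        this.x = x;
        this.y = y;
    }
}

/** Hypothetical: a propositional function maps a tuple of objects to true or false. */
interface PropositionalFunctionSketch {
    boolean isTrue(GridObjectSketch... params);
}

/** "at": true when the first object (agent) and second object (location) share a cell. */
final class AtSketch implements PropositionalFunctionSketch {
    @Override
    public boolean isTrue(GridObjectSketch... params) {
        if (params.length != 2) {
            throw new IllegalArgumentException("at expects an agent and a location");
        }
        GridObjectSketch agent = params[0];
        GridObjectSketch location = params[1];
        return agent.x == location.x && agent.y == location.y;
    }

    public static void main(String[] args) {
        GridObjectSketch agent = new GridObjectSketch("agent", "agent0", 10, 10);
        GridObjectSketch goal = new GridObjectSketch("location", "location0", 10, 10);
        System.out.println(new AtSketch().isTrue(agent, goal)); // true
    }
}
```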

BURLAP OO-MDP Java Class Overview

BURLAP implements the OO-MDP paradigm in Java with the following class structure, which can be found in the packages burlap.oomdp.core and burlap.oomdp.singleagent.