Tutorial: Building a Domain

Tutorials > Building a Domain > Part 3

Tutorial Contents

Defining a Grid World State

The first primary task we will complete in defining our Grid World MDP is to define what states look like. Before doing that though, lets first make a class that will use to generate our ultimate SADomain, and hold constants for various important values. We will then use some of these constants in the definition of our State.

Start by creating the below class, which implements DomainGenerator. DomainGenerator is an optional interface commonly used in BURLAP that constains a method for generating a Domain object. Below is the class with the constants we will use throughout this tutorial, as well as the imports it will ultimately be using.

import burlap.mdp.auxiliary.DomainGenerator;
import burlap.mdp.core.StateTransitionProb;
import burlap.mdp.core.TerminalFunction;
import burlap.mdp.core.action.Action;
import burlap.mdp.core.action.UniversalActionType;
import burlap.mdp.core.state.State;
import burlap.mdp.singleagent.SADomain;
import burlap.mdp.singleagent.environment.SimulatedEnvironment;
import burlap.mdp.singleagent.model.FactoredModel;
import burlap.mdp.singleagent.model.RewardFunction;
import burlap.mdp.singleagent.model.statemodel.FullStateModel;
import burlap.shell.visual.VisualExplorer;
import burlap.visualizer.StatePainter;
import burlap.visualizer.StateRenderLayer;
import burlap.visualizer.Visualizer;

import java.awt.*;
import java.awt.geom.Ellipse2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;


public static final String VAR_X = "x";
public static final String VAR_Y = "y";

public static final String ACTION_NORTH = "north";
public static final String ACTION_SOUTH = "south";
public static final String ACTION_EAST = "east";
public static final String ACTION_WEST = "west";

public class ExampleGridWorld implements DomainGenerator {

	public SADomain generateDomain() {
		return null; //we'll come back to this later
	}

}

With this class and its constants defined, lets now create a class for describing Grid World states (we'll place this in a separate file). To do that, we will want our class to implement the State interface. We will actually go one step further, and have it implement MutableState, an extension to the State interface that also provides a method for setting state variables. Start with the below code, which has stubs for each of the required methods and all the imports for what we'll eventually be using.

import burlap.mdp.core.state.MutableState;
import burlap.mdp.core.state.StateUtilities;
import burlap.mdp.core.state.UnknownKeyException;
import burlap.mdp.core.state.annotations.DeepCopyState;

import java.util.Arrays;
import java.util.List;

import static edu.brown.cs.burlap.tutorials.domain.simple.ExampleGridWorld.VAR_X;
import static edu.brown.cs.burlap.tutorials.domain.simple.ExampleGridWorld.VAR_Y;

public class EXGridState implements MutableState{


	@Override
	public MutableState set(Object variableKey, Object value) {
		return this;
	}

	@Override
	public List<Object> variableKeys() {
		return null;
	}

	@Override
	public Object get(Object variableKey) {
		return null;
	}

	@Override
	public EXGridState copy() {
		return null
	}

}

Our grid world state will be defined entirely by the agent's x and y location in the world. Lets add data members for that to our class. We'll also add relevant constructors.

Serialization

To support trivial serialization of states with something like Yaml (the default approach BURLAP uses), you should make sure your State objects are Java Beans, which means having a default constructor and get and set methods for all non-public data fields that follow standard Java getter and setter method name paradigms.

public int x;
public int y;

public EXGridState() {
}

public EXGridState(int x, int y) {
	this.x = x;
	this.y = y;
}

Although we've now defined our state variables, unless some client code knows exactly what kind of State object it is, it won't be able to access or modify (in the case of MutableState objects) these state variables. Most of the methods of the State and MutableState interface provide a general mechanism for client code to work with a State's variables. The variableKeys method returns a list of Object elements that specify variable keys that can be used to get or state variables; the get method takes a variable key, and returns the variable value for that key; and the setMethod takes a variable key and a value and sets the variable to that value (and is expected to return itself to support method chaining of variable sets). Note that the variable keys, being of type Object, can be of any data type that is most relevant to indexing/specifying a variable. Common choices include String or Integer keys, but it can really be any type. Variable values are also typed to Object, which similarly means that your States variables can be made up any conceivable data type that you want, allowing you to easily represent any kind of state!

To implement these three methods, lets first define a class constant list for the variable keys (we don't need a separate copy for each State instance, since the variable keys are always the same, and using a class constant will save on the memory overhead.)

private final static List<Object> keys = Arrays.<Object>asList(VAR_X, VAR_Y);

Note that for keys, we are using Strings for variable names, and the constants for these names were defined in our ExampleGridWorld domain generator.

Now lets implement the variableKeys, set, and get methods, which is done mostly as you would expect.

@Override
public MutableState set(Object variableKey, Object value) {
	if(variableKey.equals(VAR_X)){
		this.x = StateUtilities.stringOrNumber(value).intValue();
	}
	else if(variableKey.equals(VAR_Y)){
		this.y = StateUtilities.stringOrNumber(value).intValue();
	}
	else{
		throw new UnknownKeyException(variableKey);
	}
	return this;
}

@Override
public List<Object> variableKeys() {
	return keys;
}

@Override
public Object get(Object variableKey) {
	if(variableKey.equals(VAR_X)){
		return x;
	}
	else if(variableKey.equals(VAR_Y)){
		return y;
	}
	throw new UnknownKeyException(variableKey);
}

There are two things to note. First, if client code passes a variable key that is not the x or y key, then we throw a standard BURLAP UnknownKeyException. Second, you will notice that in the set method, we process the variable value through the StateUtilities stringOrNumber method. This method takes as input an Object. If that object is a String, then it returns a Java Number instance of the number that String represents. If it is a Number already, then it simply returns it back, cast as a Number. This method is useful because it allows client code to specify values with string representations of numbers or an actual number.

BURLAP Shell And MutableState

One piece of client code that benefits from setting state variables with string representations of the value is the BURLAP shell, which is a runtime shell that lets you interact with BURLAP environments with different commands (and you can make your own commands to add to the shell too). If the environment is a simulated BURLAP environment, then one of the shell commands lets you modify the state of the environment from the command line, provided your State's set method supports String representations of keys and values. Later in this tutorial, we will launch a visual BURLAP shell with the domain we create and you can try it out!

Next we will want to implement the copy method. The copy method returns, as the name suggests, a State object that is a copy of this state. The copy method is most commonly use when defining the MDP's transition dynamics/model. That is, when the outcome of executing an action in a specific input state is request, we will usually first make a copy of the input state, modify the values that need to be modified, and return the modified copied. We will see how to define transition dynamics that use this approach shortly.

We can implement the copy method by simply creating a new EXGridState instance and passing its constructor this object's current x and y values.

@Override
public EXGridState copy() {
	return new EXGridState(x, y);
}

Because our State's data fields are simple Java primitives, the copy we return is a deep copy. Consequently, modifying any data member of a copied state will not affect the values of the state from which it was copied. However, for State implementations with more complex data members, we might instead define our copy operation to do a shallow copy, which simply passes the references of the data members. To help indicate to clients what kind of copy operation a State performs, there are two optional class annotations you can add to your class; DeepCopyState and ShallowCopyState. Since our State is a DeepCopyState, lets add the optional annotation to our class.

@DeepCopyState
public class EXGridState implements MutableState{
	...
}

Why would you ever shallow copy?

Recall that the State copy method is typically used when generating the transition dynamics of an MDP (as you will see shortly). Using a shallow copy is often more memory efficient, because it means common data between states that are generated through transitions are shared, rather than replicated for each state. However, it does mean that the transition code needs to make sure that it manually makes a copy of the specific data members that it modifies of the copy, since the copy method itself won't do it. It is also usually a good idea to have the set method of shallow copied states perform a copy on write; that is, it automatically makes a copy of the value it's modifying first. If you are in doubt, then making your copy method always perform a deep copy will be safe, but if you're looking to improve memory overhead, you may want to consider shallow copies. Many of the included BURLAP domains use shallow copies, with the set method performing a copy on write.

We're just about done defining our Grid World State, but lets also add a toString method, so that we can have meaningful string representations of our State. To implement that method, we can simply use the corresponding StateUtilities method, which will iterate through the variable keys (or you can implement it manually yourself, if you wish).

@Override
public String toString() {
	return StateUtilities.stateToString(this);
}

And with that, we're finished defining our GridWorld State!

Next Part