RLGlueEnvironment

java.lang.Object
- burlap.oomdp.singleagent.interfaces.rlglue.RLGlueEnvironment

All Implemented Interfaces:

org.rlcommunity.rlglue.codec.EnvironmentInterface
```
public class RLGlueEnvironment
extends java.lang.Object
implements org.rlcommunity.rlglue.codec.EnvironmentInterface
```
This class takes a BURLAP domain and task with discrete actions and turns it into an RLGlue environment with which RLGlue agents can interact. Because of the nature of RLGlue, there are a few limitations:
(1) the actions available in one state must be available in every state;
(2) the environment cannot represent object identifier independence and will fill in RLGlue feature vectors by object class, in the order the objects appear for each class;
(3) while single-target relational domains can be used, multi-target relational domains cannot.
Because a fixed number of objects for each class is assumed, action parameterization is supported by multiplying out all the possible parameterizations. In order for action parameterization and relational domains to work consistently for RLGlue, the state generator should always add objects of each class to the state object in the same order. For instance, in a grid world, the state generator should always add the agent object instance to the state first and then all location objects (or always do it in the reverse order). Object instance names, however, can vary between generated states.
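
To make "multiplying out all the possible parameterizations" concrete, here is a minimal sketch in plain Java. It is not the BURLAP implementation: the class `ActionIndexSketch`, its methods, and the string labels standing in for grounded actions are all hypothetical, and it assumes each parameter slot has a fixed number of candidate objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these names come from BURLAP or RLGlue.
final class ActionIndexSketch {

    // Enumerates every assignment of object indices to an action's parameter slots.
    // objectsPerSlot[i] is the fixed number of objects that can fill slot i.
    static List<int[]> parameterizations(int[] objectsPerSlot) {
        List<int[]> out = new ArrayList<>();
        expand(objectsPerSlot, 0, new int[objectsPerSlot.length], out);
        return out;
    }

    private static void expand(int[] objectsPerSlot, int slot, int[] current, List<int[]> out) {
        if (slot == objectsPerSlot.length) {
            out.add(current.clone()); // one complete parameterization
            return;
        }
        for (int i = 0; i < objectsPerSlot[slot]; i++) {
            current[slot] = i;
            expand(objectsPerSlot, slot + 1, current, out);
        }
    }

    // Assigns consecutive integer ids to every (action, parameterization) pair.
    // The String value is a stand-in for a grounded action.
    static Map<Integer, String> buildActionMap(Map<String, int[]> actions) {
        Map<Integer, String> actionMap = new LinkedHashMap<>();
        int nextId = 0;
        for (Map.Entry<String, int[]> action : actions.entrySet()) {
            for (int[] params : parameterizations(action.getValue())) {
                actionMap.put(nextId++, action.getKey() + Arrays.toString(params));
            }
        }
        return actionMap;
    }

    public static void main(String[] args) {
        Map<String, int[]> actions = new LinkedHashMap<>();
        actions.put("north", new int[0]);      // unparameterized action
        actions.put("pickup", new int[] {3});  // one slot, three candidate objects
        buildActionMap(actions).forEach((id, a) -> System.out.println(id + " -> " + a));
    }
}
```

Because the parameters are stored as object indices rather than names, the mapping only stays meaningful if objects are added to each state in a consistent order, which is why the ordering requirement above matters.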

Note that RLGlue does not support observations of terminal states; it only gives the final reward upon entering a terminal state. Therefore, this class will not terminate upon reaching a state that the provided TerminalFunction marks as terminal. Instead, it allows one more transition from the terminal state, which leads back to the same state with zero reward. This is mathematically equivalent to transitioning to the terminal state and observing it.
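
The extra self-transition can be pictured with the following sketch. It is an assumption-laden toy, not the actual `env_step` code: states are plain integers, the dynamics (`curState += action`) and the reward of 1.0 on entering the terminal state are invented for illustration, and `TerminalStepSketch` and `StepResult` are hypothetical names.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration of delaying the terminal flag by one step.
final class TerminalStepSketch {

    static final class StepResult {
        final double reward;
        final int observation;
        final boolean terminal;

        StepResult(double reward, int observation, boolean terminal) {
            this.reward = reward;
            this.observation = observation;
            this.terminal = terminal;
        }
    }

    private final IntPredicate terminalFunction; // stand-in for a TerminalFunction
    private int curState;
    private int terminalVisits = 0;

    TerminalStepSketch(int startState, IntPredicate terminalFunction) {
        this.curState = startState;
        this.terminalFunction = terminalFunction;
    }

    StepResult step(int action) {
        if (terminalFunction.test(curState)) {
            // Already in a terminal state: self-loop with zero reward,
            // and only now report termination to the agent.
            terminalVisits++;
            return new StepResult(0.0, curState, true);
        }
        curState += action; // toy dynamics (assumption)
        double reward = terminalFunction.test(curState) ? 1.0 : 0.0; // toy reward (assumption)
        // Entering a terminal state is reported as an ordinary step, so the agent observes it.
        return new StepResult(reward, curState, false);
    }

    int terminalVisits() {
        return terminalVisits;
    }
}
```

In this sketch the agent receives the terminal state as a normal observation, takes one more action there, and only then sees the terminal flag along with a reward of zero, so the return of the episode is unchanged.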

Author:

James MacGlashan

Field Summary

Fields

| Modifier and Type | Field and Description |
| --- | --- |
| `protected java.util.Map<java.lang.Integer,GroundedAction>` | `actionMap` A mapping from action index identifiers (that RLGlue will use) to BURLAP actions and their parameterization, specified as the index of objects in a state. |
| `protected State` | `curState` The current state of the environment |
| `protected double` | `discount` The discount factor of the task |
| `protected Domain` | `domain` The BURLAP domain |
| `protected boolean` | `isEpisodic` Whether this task is episodic (false will indicate that it is continuing) |
| `protected int` | `numContinuousAtts` The number of RLGlue continuous attributes that will be used |
| `protected int` | `numDiscreteAtts` The number of RLGlue discrete attributes that will be used |
| `protected int` | `numObjects` The total number of objects that will appear in all states |
| `protected java.util.Map<java.lang.String,java.lang.Integer>` | `numObjectsOfEachClass` The number of objects of each object class that will appear in all states. |
| `protected org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange` | `rewardRange` The reward function value range |
| `protected RewardFunction` | `rf` The reward function |
| `protected StateGenerator` | `stateGenerator` The state generator for generating states for each episode |
| `protected int` | `terminalVisits` Indicates the number of times a terminal state has been visited by the agent within the same episode. |
| `protected TerminalFunction` | `tf` The terminal function |
| `protected boolean` | `usedConstructorState` Whether the state generated from the state generator to gather auxiliary information (like the number of objects of each class) has yet been used as a starting state for an RLGlue episode. |

Constructor Summary

Constructors
Constructor and Description
`RLGlueEnvironment(Domain domain, StateGenerator stateGenerator, RewardFunction rf, TerminalFunction tf, org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange rewardRange, boolean isEpisodic, double discount)` Constructs with all the BURLAP information necessary for generating an RLGlue Environment.

Method Summary

Methods

| Modifier and Type | Method and Description |
| --- | --- |
| `protected void` | `addAttribute(org.rlcommunity.rlglue.codec.taskspec.TaskSpecVRLGLUE3 theTaskSpecObject, Attribute att)` Adds a BURLAP attribute to the RLGlue task specification. |
| `protected org.rlcommunity.rlglue.codec.types.Observation` | `convertIntoObservation(State s)` Takes an OO-MDP state and converts it into an RLGlue observation |
| `void` | `env_cleanup()` |
| `java.lang.String` | `env_init()` |
| `java.lang.String` | `env_message(java.lang.String arg0)` |
| `org.rlcommunity.rlglue.codec.types.Observation` | `env_start()` |
| `org.rlcommunity.rlglue.codec.types.Reward_observation_terminal` | `env_step(org.rlcommunity.rlglue.codec.types.Action arg0)` |
| `void` | `load()` Loads this environment into RLGlue |
| `void` | `load(java.lang.String hostAddress, java.lang.String port)` Loads this environment into RLGlue with the specified host address and port |
| `protected int` | `objectIndex(State s, java.lang.String obName)` Returns the index of the object instance with name obName in state s. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - domain
```
protected Domain domain
```
    The BURLAP domain
  - stateGenerator
```
protected StateGenerator stateGenerator
```
    The state generator for generating states for each episode
  - rf
```
protected RewardFunction rf
```
    The reward function
  - tf
```
protected TerminalFunction tf
```
    The terminal function
  - terminalVisits
```
protected int terminalVisits
```
    Indicates the number of times a terminal state has been visited by the agent within the same episode. This variable is used because RLGlue does not support observations of terminal states, so the terminal flag is set only once the agent has taken one action in the terminal state, which transitions back to itself.
  - rewardRange
```
protected org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange rewardRange
```
    The reward function value range
  - isEpisodic
```
protected boolean isEpisodic
```
    Whether this task is episodic (false will indicate that it is continuing)
  - discount
```
protected double discount
```
    The discount factor of the task
  - numObjectsOfEachClass
```
protected java.util.Map<java.lang.String,java.lang.Integer> numObjectsOfEachClass
```
    The number of objects of each object class that will appear in all states.
  - numObjects
```
protected int numObjects
```
    The total number of objects that will appear in all states
  - curState
```
protected State curState
```
    The current state of the environment
  - actionMap
```
protected java.util.Map<java.lang.Integer,GroundedAction> actionMap
```
    A mapping from action index identifiers (that RLGlue will use) to BURLAP actions and their parameterization, specified as the index of objects in a state.
  - numDiscreteAtts
```
protected int numDiscreteAtts
```
    The number of RLGlue discrete attributes that will be used
  - numContinuousAtts
```
protected int numContinuousAtts
```
    The number of RLGlue continuous attributes that will be used
  - usedConstructorState
```
protected boolean usedConstructorState
```
    Whether the state generated from the state generator to gather auxiliary information (like the number of objects of each class) has yet been used as a starting state for an RLGlue episode. When this value is false, the state generated in the constructor will be passed as the initial state of a new episode. After that, this value is set to true and the states used for each RLGlue episode are generated fresh from the state generator.
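
The `usedConstructorState` behavior described above can be sketched as follows. This is a hedged illustration only: `StartStateSketch` and `nextStartState` are hypothetical names, and a `java.util.function.Supplier` stands in for BURLAP's `StateGenerator`.

```java
import java.util.function.Supplier;

// Hypothetical illustration of reusing the constructor-time state for the first episode.
final class StartStateSketch<S> {

    private final Supplier<S> stateGenerator; // stand-in for a StateGenerator
    private final S constructorState;
    private boolean usedConstructorState = false;

    StartStateSketch(Supplier<S> stateGenerator) {
        this.stateGenerator = stateGenerator;
        // One state is generated up front, e.g. to count the objects of each class.
        this.constructorState = stateGenerator.get();
    }

    // Returns the start state for the next episode.
    S nextStartState() {
        if (!usedConstructorState) {
            usedConstructorState = true;
            return constructorState; // first episode reuses the constructor-time state
        }
        return stateGenerator.get(); // later episodes get a fresh state
    }
}
```

Reusing the first generated state means no generated state is discarded, which can matter when the generator is stochastic or costly to call.
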
- Constructor Detail
  - RLGlueEnvironment
```
public RLGlueEnvironment(Domain domain,
                 StateGenerator stateGenerator,
                 RewardFunction rf,
                 TerminalFunction tf,
                 org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange rewardRange,
                 boolean isEpisodic,
                 double discount)
```
    Constructs with all the BURLAP information necessary for generating an RLGlue Environment.
    
    Parameters:
    domain - the BURLAP domain
    stateGenerator - a generator for generating states at the start of each episode
    rf - the reward function
    tf - the terminal function
    rewardRange - the reward function value range
    isEpisodic - whether the task is episodic or continuing
    discount - the discount factor to use for the task
- Method Detail
  - load
```
public void load()
```
    Loads this environment into RLGlue
  - load
```
public void load(java.lang.String hostAddress,
        java.lang.String port)
```
    Loads this environment into RLGlue with the specified host address and port
    
    Parameters:
    hostAddress - the RLGlue host address
    port - the RLGlue port
  - env_cleanup
```
public void env_cleanup()
```
    Specified by:
    
    env_cleanup in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
  - env_init
```
public java.lang.String env_init()
```
    Specified by:
    
    env_init in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
  - addAttribute
```
protected void addAttribute(org.rlcommunity.rlglue.codec.taskspec.TaskSpecVRLGLUE3 theTaskSpecObject,
                Attribute att)
```
    Adds a BURLAP attribute to the RLGlue task specification. BURLAP multi-target relational attributes are not supported and will cause a runtime exception to be thrown.
    
    Parameters:
    theTaskSpecObject - the RLGlue task specification
    att - the BURLAP attribute to add to the spec
  - env_message
```
public java.lang.String env_message(java.lang.String arg0)
```
    Specified by:
    
    env_message in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
  - env_start
```
public org.rlcommunity.rlglue.codec.types.Observation env_start()
```
    Specified by:
    
    env_start in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
  - env_step
```
public org.rlcommunity.rlglue.codec.types.Reward_observation_terminal env_step(org.rlcommunity.rlglue.codec.types.Action arg0)
```
    Specified by:
    
    env_step in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
  - convertIntoObservation
```
protected org.rlcommunity.rlglue.codec.types.Observation convertIntoObservation(State s)
```
    Takes an OO-MDP state and converts it into an RLGlue observation
    
    Parameters:
    s - the OO-MDP state
    
    Returns:
    an RLGlue Observation
  - objectIndex
```
protected int objectIndex(State s,
              java.lang.String obName)
```
    Returns the index of the object instance with name obName in state s.
    
    Parameters:
    s - the state holding the object
    obName - the name of the object
    
    Returns:
    the index of obName in state s
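
For `convertIntoObservation`, the class description says feature vectors are filled in by object class, in the order the objects appear for each class. The sketch below shows one way such a flattening could look. It is an assumption, not the BURLAP code: `ObservationSketch`, `Obj`, `Obs`, and the explicit `classOrder` argument are hypothetical, and plain arrays stand in for the RLGlue `Observation` type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of flattening an object-oriented state into fixed-order vectors.
final class ObservationSketch {

    // Stand-in for an object instance: a class name plus its attribute values.
    static final class Obj {
        final String className;
        final int[] discrete;
        final double[] continuous;

        Obj(String className, int[] discrete, double[] continuous) {
            this.className = className;
            this.discrete = discrete;
            this.continuous = continuous;
        }
    }

    // Stand-in for an observation: one int vector and one double vector.
    static final class Obs {
        final int[] ints;
        final double[] doubles;

        Obs(int[] ints, double[] doubles) {
            this.ints = ints;
            this.doubles = doubles;
        }
    }

    // Visits classes in a fixed order and, within a class, objects in the order they appear.
    static Obs flatten(List<Obj> state, List<String> classOrder) {
        List<Integer> ints = new ArrayList<>();
        List<Double> doubles = new ArrayList<>();
        for (String cls : classOrder) {
            for (Obj o : state) {
                if (!o.className.equals(cls)) {
                    continue;
                }
                for (int v : o.discrete) {
                    ints.add(v);
                }
                for (double v : o.continuous) {
                    doubles.add(v);
                }
            }
        }
        return new Obs(
                ints.stream().mapToInt(Integer::intValue).toArray(),
                doubles.stream().mapToDouble(Double::doubleValue).toArray());
    }
}
```

With a layout like this, swapping two objects of the same class changes the resulting vector, which illustrates why the environment cannot represent object identifier independence.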
