public class RLGlueEnvironment
extends java.lang.Object
implements org.rlcommunity.rlglue.codec.EnvironmentInterface
TerminalFunction
.
Instead, it allows one more transition from the terminal state, a self-transition with reward zero, which
is mathematically equivalent to transitioning to the terminal state and observing it.

Modifier and Type | Field and Description |
---|---|
protected java.util.Map<java.lang.Integer,GroundedAction> | actionMap: A mapping from action index identifiers (that RLGlue will use) to BURLAP actions and their parametrization, specified as the index of objects in a state. |
protected State | curState: The current state of the environment. |
protected double | discount: The discount factor of the task. |
protected Domain | domain: The BURLAP domain. |
protected boolean | isEpisodic: Whether this task is episodic (false indicates that it is continuing). |
protected int | numContinuousAtts: The number of RLGlue continuous attributes that will be used. |
protected int | numDiscreteAtts: The number of RLGlue discrete attributes that will be used. |
protected int | numObjects: The total number of objects that will appear in all states. |
protected java.util.Map<java.lang.String,java.lang.Integer> | numObjectsOfEachClass: The number of objects of each object class that will appear in all states. |
protected org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange | rewardRange: The reward function value range. |
protected RewardFunction | rf: The reward function. |
protected StateGenerator | stateGenerator: The state generator for generating states for each episode. |
protected int | terminalVisits: The number of times a terminal state has been visited by the agent within the same episode. |
protected TerminalFunction | tf: The terminal function. |
protected boolean | usedConstructorState: Whether the state generated from the state generator to gather auxiliary information (like the number of objects of each class) has yet been used as a starting state for an RLGlue episode. |
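
The terminal-state handling described in the class summary, together with the terminalVisits counter, can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the BURLAP implementation: the class TerminalVisitSketch, its integer chain of states, the fixed step reward of -1, and the StepResult type are all hypothetical stand-ins for State, TerminalFunction, RewardFunction, and Reward_observation_terminal.

```java
/** Hypothetical sketch of the terminal-state handling; not the BURLAP implementation. */
class TerminalVisitSketch {

    /** Stand-in for RLGlue's Reward_observation_terminal triple. */
    static final class StepResult {
        final double reward;
        final int state;
        final boolean terminal;

        StepResult(double reward, int state, boolean terminal) {
            this.reward = reward;
            this.state = state;
            this.terminal = terminal;
        }
    }

    private final int goal;      // assumption: a single terminal state on an integer chain
    private int curState;
    private int terminalVisits;  // visits to the terminal state within the current episode

    TerminalVisitSketch(int goal) {
        this.goal = goal;
    }

    /** Begins an episode at state 0 and resets the visit counter. */
    int start() {
        curState = 0;
        terminalVisits = 0;
        return curState;
    }

    /** Advances one step; termination is reported only on the step taken from the terminal state. */
    StepResult step() {
        if (curState == goal) {
            // Already in the terminal state: self-transition, zero reward, report termination.
            terminalVisits++;
            return new StepResult(0.0, curState, true);
        }
        curState++;
        if (curState == goal) {
            terminalVisits++; // first arrival; the agent still gets to observe this state
        }
        return new StepResult(-1.0, curState, false);
    }

    int getTerminalVisits() {
        return terminalVisits;
    }

    public static void main(String[] args) {
        TerminalVisitSketch env = new TerminalVisitSketch(2);
        env.start();
        for (int i = 0; i < 3; i++) {
            StepResult r = env.step();
            System.out.println(r.state + " " + r.reward + " " + r.terminal);
        }
    }
}
```

In this sketch the agent receives the terminal state as an ordinary observation first, and the terminal flag arrives one step later with reward zero, so the return of the episode is unchanged.
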
Constructor and Description |
---|
RLGlueEnvironment(Domain domain, StateGenerator stateGenerator, RewardFunction rf, TerminalFunction tf, org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange rewardRange, boolean isEpisodic, double discount): Constructs with all the BURLAP information necessary for generating an RLGlue Environment. |
Modifier and Type | Method and Description |
---|---|
protected void | addAttribute(org.rlcommunity.rlglue.codec.taskspec.TaskSpecVRLGLUE3 theTaskSpecObject, Attribute att): Adds a BURLAP attribute to the RLGlue task specification. |
protected org.rlcommunity.rlglue.codec.types.Observation | convertIntoObservation(State s): Takes an OO-MDP state and converts it into an RLGlue observation. |
void | env_cleanup() |
java.lang.String | env_init() |
java.lang.String | env_message(java.lang.String arg0) |
org.rlcommunity.rlglue.codec.types.Observation | env_start() |
org.rlcommunity.rlglue.codec.types.Reward_observation_terminal | env_step(org.rlcommunity.rlglue.codec.types.Action arg0) |
void | load(): Loads this environment into RLGlue. |
void | load(java.lang.String hostAddress, java.lang.String port): Loads this environment into RLGlue with the specified host address and port. |
protected int | objectIndex(State s, java.lang.String obName): Returns the index of the object instance with name obName in state s. |
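
As a rough picture of what convertIntoObservation has to accomplish, the sketch below flattens an object-oriented state into the two numeric arrays an RLGlue observation carries: integers for discrete attributes and doubles for continuous ones. Everything in it is a hypothetical stand-in (ObservationSketch, ObjectValues, FlatObservation, flatten). It does not use the BURLAP State or RLGlue Observation classes, and the real method may order values differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of flattening an OO-MDP state into int and double arrays. */
class ObservationSketch {

    /** Stand-in for one object instance: its discrete and continuous attribute values. */
    static final class ObjectValues {
        final int[] discrete;
        final double[] continuous;

        ObjectValues(int[] discrete, double[] continuous) {
            this.discrete = discrete;
            this.continuous = continuous;
        }
    }

    /** Stand-in for an RLGlue observation: one int array and one double array. */
    static final class FlatObservation {
        final int[] ints;
        final double[] doubles;

        FlatObservation(int[] ints, double[] doubles) {
            this.ints = ints;
            this.doubles = doubles;
        }
    }

    /** Concatenates attribute values object by object, keeping the object order of the list. */
    static FlatObservation flatten(List<ObjectValues> objects) {
        int numDiscreteAtts = 0;
        int numContinuousAtts = 0;
        for (ObjectValues o : objects) {
            numDiscreteAtts += o.discrete.length;
            numContinuousAtts += o.continuous.length;
        }
        int[] ints = new int[numDiscreteAtts];
        double[] doubles = new double[numContinuousAtts];
        int i = 0;
        int d = 0;
        for (ObjectValues o : objects) {
            for (int v : o.discrete) {
                ints[i++] = v;
            }
            for (double v : o.continuous) {
                doubles[d++] = v;
            }
        }
        return new FlatObservation(ints, doubles);
    }

    public static void main(String[] args) {
        List<ObjectValues> state = new ArrayList<>();
        state.add(new ObjectValues(new int[]{3, 1}, new double[]{}));
        state.add(new ObjectValues(new int[]{0}, new double[]{0.5}));
        FlatObservation obs = flatten(state);
        System.out.println(Arrays.toString(obs.ints) + " " + Arrays.toString(obs.doubles));
    }
}
```

A fixed object order matters here: the agent only sees positions in the arrays, so the same object must land at the same offsets in every state.
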
protected Domain domain
protected StateGenerator stateGenerator
protected RewardFunction rf
protected TerminalFunction tf
protected int terminalVisits
protected org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange rewardRange
protected boolean isEpisodic
protected double discount
protected java.util.Map<java.lang.String,java.lang.Integer> numObjectsOfEachClass
protected int numObjects
protected State curState
protected java.util.Map<java.lang.Integer,GroundedAction> actionMap
protected int numDiscreteAtts
protected int numContinuousAtts
protected boolean usedConstructorState
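
The idea behind actionMap, integer identifiers that stand for an action together with object-index parameters, can be sketched as below. This is a hypothetical illustration only: ActionMapSketch, Grounded, and build are invented names, the sketch assumes every action takes exactly one object parameter, and it does not reflect how BURLAP enumerates GroundedAction instances.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an index-to-action map; not the BURLAP implementation. */
class ActionMapSketch {

    /** Stand-in for a grounded action: an action name plus object-index parameters. */
    static final class Grounded {
        final String actionName;
        final int[] objectIndices;

        Grounded(String actionName, int[] objectIndices) {
            this.actionName = actionName;
            this.objectIndices = objectIndices;
        }
    }

    /**
     * Assigns consecutive integer ids to every (action, object index) pair.
     * Assumption: each action is parameterized by exactly one object.
     */
    static Map<Integer, Grounded> build(String[] actionNames, int numObjects) {
        Map<Integer, Grounded> map = new LinkedHashMap<>();
        int id = 0;
        for (String name : actionNames) {
            for (int ob = 0; ob < numObjects; ob++) {
                map.put(id++, new Grounded(name, new int[]{ob}));
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, Grounded> map = build(new String[]{"pickup", "drop"}, 3);
        for (Map.Entry<Integer, Grounded> e : map.entrySet()) {
            Grounded g = e.getValue();
            System.out.println(e.getKey() + " -> " + g.actionName + "(" + g.objectIndices[0] + ")");
        }
    }
}
```

Referring to parameters by object index rather than by object name keeps the mapping valid across episodes, provided the number of objects of each class stays constant.
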
public RLGlueEnvironment(Domain domain, StateGenerator stateGenerator, RewardFunction rf, TerminalFunction tf, org.rlcommunity.rlglue.codec.taskspec.ranges.DoubleRange rewardRange, boolean isEpisodic, double discount)
domain
- the BURLAP domain
stateGenerator
- a generator for generating states at the start of each episode
rf
- the reward function
tf
- the terminal function
rewardRange
- the reward function value range
isEpisodic
- whether the task is episodic or continuing
discount
- the discount factor to use for the task
public void load()
public void load(java.lang.String hostAddress, java.lang.String port)
hostAddress
- the RLGlue host address
port
- the RLGlue port
public void env_cleanup()
env_cleanup
in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
public java.lang.String env_init()
env_init
in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
protected void addAttribute(org.rlcommunity.rlglue.codec.taskspec.TaskSpecVRLGLUE3 theTaskSpecObject, Attribute att)
theTaskSpecObject
- the RLGlue task specification
att
- the BURLAP attribute to add to the spec
public java.lang.String env_message(java.lang.String arg0)
env_message
in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
public org.rlcommunity.rlglue.codec.types.Observation env_start()
env_start
in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
public org.rlcommunity.rlglue.codec.types.Reward_observation_terminal env_step(org.rlcommunity.rlglue.codec.types.Action arg0)
env_step
in interface org.rlcommunity.rlglue.codec.EnvironmentInterface
protected org.rlcommunity.rlglue.codec.types.Observation convertIntoObservation(State s)
s
- the OO-MDP state
protected int objectIndex(State s, java.lang.String obName)
s
- the state holding the object
obName
- the name of the object
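
A name-to-index lookup of the kind objectIndex describes might look like the sketch below. It is hypothetical: the state is reduced to an ordered list of object names, and returning -1 for a missing name is an assumption of this sketch, not documented behavior of the real method.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of looking up an object's index by name. */
class ObjectIndexSketch {

    /**
     * Returns the position of obName in the ordered object list.
     * Assumption: -1 signals that no object with that name exists.
     */
    static int objectIndex(List<String> objectNames, String obName) {
        for (int i = 0; i < objectNames.size(); i++) {
            if (objectNames.get(i).equals(obName)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("agent0", "block0", "block1");
        System.out.println(objectIndex(names, "block1"));
        System.out.println(objectIndex(names, "missing"));
    }
}
```
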