public class Episode
extends java.lang.Object
Create an Episode with the Episode(State) constructor, or use the default constructor and then call the initializeInState(State) method to set the initial state of the episode, before recording any transitions. Transitions should then be recorded with the transition(Action, State, double) method, which takes as input the action applied in the last recorded state, the next state to which the agent transitioned, and the reward received for the transition.

To query the state, action, and reward sequences, use the state(int), action(int), and reward(int) methods. These methods take as input the time step of the element you want. Note that t = 0 refers to the initial time step, so calling state(0) and action(0) will return the initial state and the action taken in the initial state, respectively. However, rewards are always received in the time step after the state and action that produced them. Therefore, reward(0) is undefined; the first reward received is at time step 1: reward(1).
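The time-step convention described above can be illustrated with a small, dependency-free sketch. The class below is a hypothetical stand-in written for this example (it is not the library's Episode class); states and actions are plain strings, and throwing an exception for reward(0) is an assumption of the sketch, since the text only says that value is undefined.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the library's Episode class.
class EpisodeIndexingSketch {
    final List<String> states = new ArrayList<>();
    final List<String> actions = new ArrayList<>();
    final List<Double> rewards = new ArrayList<>();

    EpisodeIndexingSketch(String initialState) {
        states.add(initialState); // the initial state sits at time step 0
    }

    // Mirrors the argument order of transition(Action, State, double).
    void transition(String usingAction, String nextState, double r) {
        actions.add(usingAction);
        states.add(nextState);
        rewards.add(r);
    }

    String state(int t) { return states.get(t); }

    String action(int t) { return actions.get(t); }

    // Rewards arrive one step after the state and action that produced them,
    // so the reward for time step t is stored at list index t - 1.
    double reward(int t) {
        if (t < 1 || t > rewards.size()) {
            // Assumption of this sketch: an out-of-range time step is an error.
            throw new IndexOutOfBoundsException("no reward at time step " + t);
        }
        return rewards.get(t - 1);
    }

    int numTimeSteps() { return states.size(); }

    int maxTimeStep() { return states.size() - 1; }

    int numActions() { return actions.size(); }

    public static void main(String[] args) {
        EpisodeIndexingSketch e = new EpisodeIndexingSketch("s0");
        e.transition("north", "s1", -1.0);
        e.transition("east", "s2", 10.0);
        // Initial state, the action taken in it, and the first reward (time step 1).
        System.out.println(e.state(0) + " " + e.action(0) + " " + e.reward(1));
        // Number of time steps, maximum time step index, number of actions.
        System.out.println(e.numTimeSteps() + " " + e.maxTimeStep() + " " + e.numActions());
    }
}
```

After two recorded transitions the sketch holds three states, two actions, and two rewards, which matches the relationships stated for numTimeSteps(), maxTimeStep(), and numActions().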
Modifier and Type | Field and Description |
---|---|
java.util.List<Action> | actionSequence: The sequence of actions taken. |
java.util.List<java.lang.Double> | rewardSequence: The sequence of rewards received. |
java.util.List<State> | stateSequence: The sequence of states observed. |
Constructor and Description |
---|
Episode(): Creates a new Episode object. |
Episode(State initialState): Initializes a new Episode object with the initial state in which the episode started. |
Modifier and Type | Method and Description |
---|---|
Action | action(int t): Returns the action taken in the state at time step t. |
java.lang.String | actionString(): Returns a string representing the actions taken in this episode. |
java.lang.String | actionString(java.lang.String delimiter): Returns a string representing the actions taken in this episode, separated by the given delimiter. |
void | addAction(Action ga): Adds an action to the action sequence. |
void | addReward(double r): Adds a reward to the reward sequence. |
void | addState(State s): Adds a state to the state sequence. |
void | appendAndMergeEpisodeAnalysis(Episode e): Appends the execution results in e to this object's results. |
Episode | copy(): Returns a copy of this Episode. |
double | discountedReturn(double discountFactor): Returns the discounted return received from the first state in the episode to the last state in the episode. |
void | initializeInState(State initialState): Initializes this object with the initial state in which the episode started. |
static void | main(java.lang.String[] args) |
int | maxTimeStep(): Returns the maximum time step index in this episode, which is numTimeSteps() - 1. |
int | numActions(): Returns the number of actions, which is 1 less than the number of states. |
int | numTimeSteps(): Returns the number of time steps in this episode, which is equivalent to the number of states. |
static Episode | parseEpisode(java.lang.String episodeString) |
static Episode | read(java.lang.String path): Reads an episode that was written to a file and turns it into an Episode object. |
static java.util.List<Episode> | readEpisodes(java.lang.String directoryPath): Takes a path to a directory containing .episode files and reads them all into a List of Episode objects. |
double | reward(int t): Returns the reward received at time step t. |
java.lang.String | serialize() |
State | state(int t): Returns the state observed at time step t. |
void | transition(Action usingAction, State nextState, double r): Records a transition event where the agent applied the usingAction action in the last state in this object's state sequence, transitioned to state nextState, and received reward r. |
void | transition(EnvironmentOutcome eo): Records a transition event from the EnvironmentOutcome. |
void | write(java.lang.String path): Writes this episode to a file. |
static void | writeEpisodes(java.util.List<Episode> episodes, java.lang.String directoryPath, java.lang.String baseFileName): Takes a List of Episode objects and writes them to a directory. |
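The discountedReturn(double) entry above does not spell out the formula. A minimal sketch, assuming the conventional definition in which the first reward is undiscounted and each later reward is multiplied by one more factor of the discount, might look like this. The class and method names are hypothetical and operate on a plain list of rewards rather than on an Episode.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the documented class.
class DiscountedReturnSketch {
    // Sums discountFactor^(t-1) * r_t over rewards r_1..r_T,
    // where rewards.get(0) holds r_1, the reward received at time step 1.
    static double discountedReturn(List<Double> rewards, double discountFactor) {
        double discount = 1.0; // assumption: the first reward is not discounted
        double sum = 0.0;
        for (double r : rewards) {
            sum += discount * r;
            discount *= discountFactor;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Double> rewards = Arrays.asList(1.0, 2.0, 4.0);
        // 1.0 + 0.5 * 2.0 + 0.25 * 4.0
        System.out.println(discountedReturn(rewards, 0.5));
    }
}
```

With a discount factor of 1 this reduces to the plain sum of rewards, and with a discount factor of 0 only the first reward counts.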
public java.util.List<State> stateSequence
public java.util.List<Action> actionSequence
public java.util.List<java.lang.Double> rewardSequence
public Episode()
Creates a new Episode object. The initializeInState(State) method should be called to set the initial state of the episode.
public Episode(State initialState)
Initializes a new Episode object with the initial state in which the episode started.
initialState - the initial state of the episode
public void initializeInState(State initialState)
Initializes this object with the initial state in which the episode started.
initialState - the initial state of the episode
public void addState(State s)
Adds a state to the state sequence. It is recommended that the initializeInState(State) method, along with subsequent calls to the transition(Action, State, double) method, be used instead, but this method can be used to manually add a state.
s - the state to add
public void addAction(Action ga)
Adds an action to the action sequence. It is recommended that the initializeInState(State) method, along with subsequent calls to the transition(Action, State, double) method, be used instead, but this method can be used to manually add an action.
ga - the action to add
public void addReward(double r)
Adds a reward to the reward sequence. It is recommended that the initializeInState(State) method, along with subsequent calls to the transition(Action, State, double) method, be used instead, but this method can be used to manually add a reward.
r - the reward to add
public void transition(Action usingAction, State nextState, double r)
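Records a transition event where the agent applied the usingAction action in the last state in this object's state sequence, transitioned to state nextState, and received reward r.
usingAction - the action the agent used that caused the transition
nextState - the next state to which the agent transitioned
r - the reward the agent received for this transition.
public void transition(EnvironmentOutcome eo)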
Records a transition event from the EnvironmentOutcome. Assumes that the last state recorded in this Episode is the same as the previous state (EnvironmentOutcome.o) in the EnvironmentOutcome.
eo - an EnvironmentOutcome specifying a new transition for this episode.
public State state(int t)
Returns the state observed at time step t.
t - the time step of the episode
public Action action(int t)
Returns the action taken in the state at time step t.
t - the time step of the episode
public double reward(int t)
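Returns the reward received at time step t.
t - the time step of the episode
public int numTimeSteps()
Returns the number of time steps in this episode, which is equivalent to the number of states.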
public int maxTimeStep()
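Returns the maximum time step index in this episode, which is numTimeSteps() - 1. Note that there will be no action in the last time step.
public int numActions()
Returns the number of actions, which is 1 less than the number of states.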
public double discountedReturn(double discountFactor)
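Returns the discounted return received from the first state in the episode to the last state in the episode.
discountFactor - the discount factor to compute the discounted return; should be in [0, 1]
public void appendAndMergeEpisodeAnalysis(Episode e)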
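Appends the execution results in e to this object's results.
e - the execution results to append to this episode.
public java.lang.String actionString()
Returns a string representing the actions taken in this episode.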
public java.lang.String actionString(java.lang.String delimiter)
Returns a string representing the actions taken in this episode.
delimiter - the delimiter to separate actions in the string.
public static void writeEpisodes(java.util.List<Episode> episodes, java.lang.String directoryPath, java.lang.String baseFileName)
Takes a List of Episode objects and writes them to a directory. The format of the file names will be "baseFileName{index}.episode" where {index} represents the index of the episode in the list. States must be serializable.
episodes - the list of episodes to write to disk
directoryPath - the directory path in which the episodes will be written
baseFileName - the base file name to use for the episode files
public void write(java.lang.String path)
Writes this episode to a file.
path - the path to the file in which to write this episode.
public static java.util.List<Episode> readEpisodes(java.lang.String directoryPath)
Takes a path to a directory containing .episode files and reads them all into a List of Episode objects.
directoryPath - the path to the directory containing the episode files
Returns a List of Episode objects.
public static Episode read(java.lang.String path)
Reads an episode that was written to a file and turns it into an Episode object.
path - the path to the episode file.
public java.lang.String serialize()
public static Episode parseEpisode(java.lang.String episodeString)
public static void main(java.lang.String[] args)
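The file naming pattern documented for writeEpisodes, "baseFileName{index}.episode", can be sketched as follows. This is only an illustration of the pattern: the class and method names are hypothetical, no episode data is written, and the sketch assumes that {index} is the zero-based position of the episode in the list, which the text does not state explicitly.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the documented class.
class EpisodeFileNameSketch {
    // Builds one path per episode following the pattern baseFileName{index}.episode.
    // Assumption: {index} is the zero-based position of the episode in the list.
    static List<Path> episodePaths(String directoryPath, String baseFileName, int numEpisodes) {
        List<Path> paths = new ArrayList<>();
        for (int i = 0; i < numEpisodes; i++) {
            paths.add(Paths.get(directoryPath, baseFileName + i + ".episode"));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Print only the file names so the output does not depend on the path separator.
        for (Path p : episodePaths("output", "trial", 3)) {
            System.out.println(p.getFileName());
        }
    }
}
```

Under that assumption, a list of three episodes with base file name "trial" would map to trial0.episode, trial1.episode, and trial2.episode inside the target directory.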