public class EnvironmentOptionOutcome extends EnvironmentOutcome
An EnvironmentOutcome class for reporting the effects of applying an Option in a given Environment. This class extends the standard EnvironmentOutcome to include the discount to apply to the value of time steps following the application of an Option and the number of steps taken by the Option in the Environment. The discount is therefore gamma^t, where gamma is the MDP discount factor and t is the number of time steps taken by the option. The saved reward value (EnvironmentOutcome.r) for this object also represents the cumulative discounted reward.

Modifier and Type | Field and Description |
---|---|
double | discount: The discount factor to apply to the value of time steps immediately following the application of an Option. |
Episode | episode: The executed episode from this execution. |
Fields inherited from class EnvironmentOutcome: a, o, op, r, terminated
Constructor and Description |
---|
EnvironmentOptionOutcome(State s, Action a, State sp, double r, boolean terminated, double discountFactor, Episode episode): Initializes. |
Modifier and Type | Method and Description |
---|---|
int | numSteps() |
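public double discount
The discount factor to apply to the value of time steps immediately following the application of an Option. Specifically, this value is gamma^t, where gamma is the discount factor of the MDP and t is the number of time steps taken by the option.

public Episode episode
The executed episode from this execution.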
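The relationship between the discount field, the number of option steps, and the cumulative discounted reward can be illustrated with a small standalone sketch. It does not use any BURLAP classes; the class name OptionDiscountSketch and its methods are hypothetical, and it assumes the cumulative reward weights the first step's reward by gamma^0 = 1, which the documentation above does not spell out.

```java
// Illustrative sketch only; not the library's implementation.
final class OptionDiscountSketch {

    // gamma^t: the factor applied to values that follow an option which ran for t steps.
    static double discount(double gamma, int numSteps) {
        return Math.pow(gamma, numSteps);
    }

    // Sum of gamma^k * stepRewards[k] over the primitive steps the option executed.
    // Assumption: the first step's reward is weighted by gamma^0 = 1.
    static double cumulativeDiscountedReward(double gamma, double[] stepRewards) {
        double total = 0.0;
        double weight = 1.0; // gamma^0 for the first step
        for (double reward : stepRewards) {
            total += weight * reward;
            weight *= gamma; // advance to the next power of gamma
        }
        return total;
    }

    public static void main(String[] args) {
        double gamma = 0.9;
        double[] stepRewards = {1.0, 1.0, 1.0}; // a hypothetical 3-step option execution
        System.out.println("discount = " + discount(gamma, stepRewards.length));
        System.out.println("cumulative reward = " + cumulativeDiscountedReward(gamma, stepRewards));
    }
}
```

An option that runs for zero steps would yield a discount of 1 under this sketch, and longer executions shrink the discount geometrically.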
public EnvironmentOptionOutcome(State s, Action a, State sp, double r, boolean terminated, double discountFactor, Episode episode)
Initializes. The discount of this object will be set to discountFactor^numSteps, because discountFactor is the discount factor of the MDP and discount represents the amount by which values in the time step following the option application should be discounted.

s - The previous state of the environment when the action was taken.
a - The action taken in the environment.
sp - The next state to which the environment transitioned.
r - The reward received.
terminated - Whether the next state to which the environment transitioned is a terminal state (true if so, false otherwise).
discountFactor - The discount factor of the MDP.
episode - The episode of execution.
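As a rough illustration of why the outcome carries both a cumulative reward and a discount, the following standalone sketch shows how a value-learning update might combine r, terminated, and discount after an option finishes. It uses no BURLAP classes; OptionBackupSketch, backupTarget, and updatedQ are hypothetical names, and the update rule is a generic temporal-difference backup, not something this class prescribes.

```java
// Illustrative sketch only; a generic backup, not a BURLAP learning algorithm.
final class OptionBackupSketch {

    // Target value for the state in which the option was started:
    // the cumulative discounted reward, plus the discounted value of the
    // resulting state unless that state is terminal.
    static double backupTarget(double r, double discount, boolean terminated, double nextStateValue) {
        return terminated ? r : r + discount * nextStateValue;
    }

    // Moves the old estimate a fraction learningRate of the way toward the target.
    static double updatedQ(double oldQ, double learningRate, double target) {
        return oldQ + learningRate * (target - oldQ);
    }

    public static void main(String[] args) {
        double r = 2.71;          // hypothetical cumulative discounted reward of the option
        double discount = 0.729;  // hypothetical gamma^t, here 0.9^3
        double target = backupTarget(r, discount, false, 10.0);
        System.out.println("target = " + target);
        System.out.println("new Q = " + updatedQ(0.0, 0.5, target));
    }
}
```

Because the discount is already raised to the number of steps taken, the caller in this sketch never needs to know how long the option ran.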