public interface Option extends Action

An interface for defining options [1], extending the Action interface. It requires additional methods for defining the option's initiation set, termination conditions, and policy; for indicating whether the option is Markov; and for giving the option control in an environment.
The policy methods policy(State, Episode) and policyDistribution(State, Episode), and the termination-condition method probabilityOfTermination(State, Episode), take as input a history (provided as an Episode object) so that non-Markov options can be supported. If the option is Markov, these history parameters may be null.
The control(Environment, double) method can generally be implemented using the static helper in the Option.Helper class, but you can also implement it your own way if desired.
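As a sketch of what a Markov option implementing this interface looks like, the snippet below uses simplified stand-in types (the State, Episode, and action types here are assumptions for illustration, not BURLAP's real classes, and the policy returns a plain String rather than an Action). Because the option is Markov, it ignores the Episode history entirely, and callers may pass null for it:

```java
// Simplified stand-ins for BURLAP's State and Episode types (assumptions
// for this sketch, not the real burlap classes).
interface State { }

class GridState implements State {
    final int x, y;
    GridState(int x, int y) { this.x = x; this.y = y; }
}

class Episode { /* history of states/actions; unused by a Markov option */ }

// A hypothetical Markov option that walks toward a doorway at (5, 5).
class DoorwayOption {
    // Initiation set: the option may only be started in the lower-left room.
    boolean inInitiationSet(State s) {
        GridState g = (GridState) s;
        return g.x <= 5 && g.y <= 5;
    }

    // Markov policy: the Episode history parameter is ignored and may be null.
    String policy(State s, Episode history) {
        GridState g = (GridState) s;
        return g.x < 5 ? "east" : "north";
    }

    // Deterministic termination at the doorway; 0 probability elsewhere.
    double probabilityOfTermination(State s, Episode history) {
        GridState g = (GridState) s;
        return (g.x == 5 && g.y == 5) ? 1.0 : 0.0;
    }

    boolean markov() { return true; }  // history-independent
}
```

A non-Markov option would instead inspect the Episode object inside policy and probabilityOfTermination, and return false from markov().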
1. Sutton, Richard S., Doina Precup, and Satinder Singh. "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning." Artificial intelligence 112.1 (1999): 181-211.
Nested Class Summary

Modifier and Type | Class and Description |
---|---|
static class | Option.Helper |
Method Summary

Modifier and Type | Method and Description |
---|---|
EnvironmentOptionOutcome | control(Environment env, double discount) |
boolean | inInitiationSet(State s): Returns true if the input state is in the initiation set of the Option |
boolean | markov() |
Action | policy(State s, Episode history) |
java.util.List&lt;ActionProb&gt; | policyDistribution(State s, Episode history) |
double | probabilityOfTermination(State s, Episode history) |
Methods inherited from interface Action: actionName, copy
Method Detail

boolean inInitiationSet(State s)
Returns true if the input state is in the initiation set of the Option.
Parameters: s - the State to test.

java.util.List&lt;ActionProb&gt; policyDistribution(State s, Episode history)
Returns this option's policy distribution for the given state; the history may be null if the option is Markov.

EnvironmentOptionOutcome control(Environment env, double discount)
Gives the option control in the provided Environment, using the given discount factor.

boolean markov()
Returns whether this option is Markov.
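To illustrate what control(Environment, double) generally does, the sketch below rolls an option's policy forward in a toy environment until termination is sampled, accumulating the discounted return. Everything here (ChainEnv, the "step" action, the termination state) is a made-up stand-in for illustration, not BURLAP's Environment or EnvironmentOptionOutcome API:

```java
import java.util.Random;

// Minimal stand-in environment (an assumption for this sketch): states are
// integers starting at 0, and the only action moves right with reward -1.
class ChainEnv {
    int state = 0;
    double lastReward = 0.0;
    void execute(String action) { state += 1; lastReward = -1.0; }
}

// Sketch of the discounted rollout that control(Environment, double)
// typically performs: execute the option's policy step by step, accumulate
// gamma^t-discounted reward, and stop when termination is sampled.
class OptionRollout {
    static double control(ChainEnv env, double gamma) {
        double totalReturn = 0.0;
        double discount = 1.0;  // gamma^t for the current step
        Random rng = new Random(0);
        while (true) {
            env.execute("step");                 // option policy: always "step"
            totalReturn += discount * env.lastReward;
            discount *= gamma;
            // Hypothetical termination condition: terminate at state 3.
            double termProb = env.state >= 3 ? 1.0 : 0.0;
            if (rng.nextDouble() < termProb) break;  // sample termination
        }
        return totalReturn;
    }
}
```

With gamma = 0.9 this rollout takes three steps and returns -1 - 0.9 - 0.81 = -2.71; a real implementation would additionally package the resulting state, total reward, and compounded discount into an EnvironmentOptionOutcome.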