public class CustomRewardModel extends java.lang.Object implements FullModel
Nested classes/interfaces inherited from interface FullModel: FullModel.Helper
Modifier and Type | Field and Description |
---|---|
protected SampleModel | model |
protected RewardFunction | rewardFunction |
Constructor and Description |
---|
CustomRewardModel(SampleModel model, RewardFunction rewardFunction) |
Modifier and Type | Method and Description |
---|---|
protected EnvironmentOutcome | modifyOutcome(EnvironmentOutcome eo) |
EnvironmentOutcome | sample(State s, Action a): Samples a transition from the transition distribution and returns it. |
boolean | terminal(State s): Indicates whether a state is a terminal state (i.e., no more actions occur and zero reward is received from there on out). |
java.util.List<TransitionProb> | transitions(State s, Action a) |
protected SampleModel model
protected RewardFunction rewardFunction
public CustomRewardModel(SampleModel model, RewardFunction rewardFunction)
public java.util.List<TransitionProb> transitions(State s, Action a)
Description copied from interface: FullModel
Returns the possible transitions when Action a is applied in State s. The returned list only needs to include transitions that have non-zero probability of occurring.
Specified by: transitions in interface FullModel
Parameters:
s - the source State
a - the Action applied in the source state
public EnvironmentOutcome sample(State s, Action a)
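Description copied from interface: SampleModel
Samples a transition from the transition distribution and returns it.
Specified by: sample in interface SampleModel
Parameters:
s - the source state
a - the action taken in the source state
Returns: EnvironmentOutcome describing the sampled transition
public boolean terminal(State s)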
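Description copied from interface: SampleModel
Indicates whether a state is a terminal state (i.e., no more actions occur and zero reward is received from there on out).
Specified by: terminal in interface SampleModel
Parameters:
s - the input state to test
protected EnvironmentOutcome modifyOutcome(EnvironmentOutcome eo)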
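This page gives no description for modifyOutcome. The sketch below shows one plausible reading of the class as a whole: the wrapper delegates dynamics to the wrapped model and passes every outcome through modifyOutcome, which is assumed here to replace the reward with the value from rewardFunction. None of the types are the library's own: Outcome, Transition, Reward, Model, and RewardOverridingModel are simplified, hypothetical stand-ins that use strings for states and actions, and Model collapses SampleModel and FullModel into a single interface for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Self-contained sketch with hypothetical stand-in types (not the library's classes).
public class CustomRewardSketch {

    /** Stand-in for EnvironmentOutcome: source state, action, next state, reward, terminal flag. */
    static final class Outcome {
        final String s, a, sPrime; final double r; final boolean terminated;
        Outcome(String s, String a, String sPrime, double r, boolean terminated) {
            this.s = s; this.a = a; this.sPrime = sPrime; this.r = r; this.terminated = terminated;
        }
    }

    /** Stand-in for TransitionProb: a probability paired with an outcome. */
    static final class Transition {
        final double p; final Outcome outcome;
        Transition(double p, Outcome outcome) { this.p = p; this.outcome = outcome; }
    }

    /** Stand-in for RewardFunction. */
    interface Reward { double reward(String s, String a, String sPrime); }

    /** Stand-in merging the sampling and full-transition interfaces. */
    interface Model {
        Outcome sample(String s, String a);
        boolean terminal(String s);
        List<Transition> transitions(String s, String a);
    }

    /** Wrapper in the spirit of CustomRewardModel: same dynamics, different reward. */
    static class RewardOverridingModel implements Model {
        protected final Model model;
        protected final Reward rewardFunction;

        RewardOverridingModel(Model model, Reward rewardFunction) {
            this.model = model;
            this.rewardFunction = rewardFunction;
        }

        public Outcome sample(String s, String a) { return modifyOutcome(model.sample(s, a)); }

        public boolean terminal(String s) { return model.terminal(s); }

        public List<Transition> transitions(String s, String a) {
            List<Transition> out = new ArrayList<>();
            for (Transition t : model.transitions(s, a)) {
                // Probabilities are kept; only the outcome's reward changes.
                out.add(new Transition(t.p, modifyOutcome(t.outcome)));
            }
            return out;
        }

        // ASSUMPTION: the reward is replaced and every other field is copied unchanged.
        protected Outcome modifyOutcome(Outcome eo) {
            double r = rewardFunction.reward(eo.s, eo.a, eo.sPrime);
            return new Outcome(eo.s, eo.a, eo.sPrime, r, eo.terminated);
        }
    }

    /** Toy deterministic model: any action leads to "goal" with reward -1. */
    static Model chain() {
        return new Model() {
            public Outcome sample(String s, String a) { return new Outcome(s, a, "goal", -1.0, true); }
            public boolean terminal(String s) { return s.equals("goal"); }
            public List<Transition> transitions(String s, String a) {
                return Collections.singletonList(new Transition(1.0, sample(s, a)));
            }
        };
    }

    public static void main(String[] args) {
        Reward goalBonus = (s, a, sp) -> sp.equals("goal") ? 10.0 : 0.0;
        Model wrapped = new RewardOverridingModel(chain(), goalBonus);
        System.out.println(wrapped.sample("s0", "go").r);
        System.out.println(wrapped.transitions("s0", "go").get(0).p);
        System.out.println(wrapped.terminal("goal"));
    }
}
```

Under these assumptions the wrapped model's own reward is discarded, while next states, transition probabilities, and terminal tests all come from the wrapped model unchanged.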