public class CustomRewardModel extends java.lang.Object implements FullModel
Nested classes inherited from interface FullModel: FullModel.Helper

| Modifier and Type | Field and Description |
|---|---|
| protected SampleModel | model |
| protected RewardFunction | rewardFunction |

| Constructor and Description |
|---|
| CustomRewardModel(SampleModel model, RewardFunction rewardFunction) |

| Modifier and Type | Method and Description |
|---|---|
| protected EnvironmentOutcome | modifyOutcome(EnvironmentOutcome eo) |
| EnvironmentOutcome | sample(State s, Action a) Samples a transition from the transition distribution and returns it. |
| boolean | terminal(State s) Indicates whether a state is a terminal state (i.e., no further actions occur and zero reward is received from then on). |
| java.util.List<TransitionProb> | transitions(State s, Action a) |

protected SampleModel model
protected RewardFunction rewardFunction
public CustomRewardModel(SampleModel model, RewardFunction rewardFunction)
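
The constructor signature suggests a wrapper: the class holds an existing SampleModel and a RewardFunction, and the presence of modifyOutcome suggests that outcomes from the wrapped model are adjusted before being returned. This page does not document what modifyOutcome does, so the following is only a minimal sketch of that pattern under the assumption that it swaps in the reward computed by the supplied reward function. All types ending in `Sketch`, the `String`-based states and actions, and the demo class are hypothetical stand-ins defined so the example compiles without the library.

```java
// Hypothetical stand-ins for the library types, simplified so the sketch
// compiles on its own; the real State, Action and EnvironmentOutcome differ.
final class OutcomeSketch {
    final String state;      // source state
    final String action;     // action taken
    final String nextState;  // resulting state
    final double reward;     // reward attached to the transition

    OutcomeSketch(String state, String action, String nextState, double reward) {
        this.state = state;
        this.action = action;
        this.nextState = nextState;
        this.reward = reward;
    }
}

interface SampleModelSketch {
    OutcomeSketch sample(String s, String a);
    boolean terminal(String s);
}

interface RewardFunctionSketch {
    double reward(String s, String a, String sPrime);
}

class CustomRewardModelSketch implements SampleModelSketch {
    protected final SampleModelSketch model;
    protected final RewardFunctionSketch rewardFunction;

    CustomRewardModelSketch(SampleModelSketch model, RewardFunctionSketch rewardFunction) {
        this.model = model;
        this.rewardFunction = rewardFunction;
    }

    @Override
    public OutcomeSketch sample(String s, String a) {
        // Delegate sampling to the wrapped model, then adjust the result.
        return modifyOutcome(model.sample(s, a));
    }

    @Override
    public boolean terminal(String s) {
        // Termination is decided by the wrapped model alone.
        return model.terminal(s);
    }

    // ASSUMPTION: the outcome keeps its states and action, and only the
    // reward is replaced by the value from the custom reward function.
    protected OutcomeSketch modifyOutcome(OutcomeSketch eo) {
        double r = rewardFunction.reward(eo.state, eo.action, eo.nextState);
        return new OutcomeSketch(eo.state, eo.action, eo.nextState, r);
    }
}

class CustomRewardDemo {
    public static void main(String[] args) {
        // A toy base model that always moves to "goal" with reward -1.
        SampleModelSketch base = new SampleModelSketch() {
            @Override public OutcomeSketch sample(String s, String a) {
                return new OutcomeSketch(s, a, "goal", -1.0);
            }
            @Override public boolean terminal(String s) {
                return "goal".equals(s);
            }
        };
        RewardFunctionSketch bonus =
                (s, a, sPrime) -> "goal".equals(sPrime) ? 10.0 : 0.0;
        SampleModelSketch wrapped = new CustomRewardModelSketch(base, bonus);
        OutcomeSketch eo = wrapped.sample("start", "east");
        System.out.println(eo.nextState + " " + eo.reward);
    }
}
```

Keeping the dynamics in the wrapped model and the reward in a separate function lets the same transition model be reused across tasks that differ only in their reward.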
public java.util.List<TransitionProb> transitions(State s, Action a)
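Description copied from interface: FullModel
Returns the set of possible transitions when Action a is applied in State s. The returned list only needs to include transitions that have a non-zero probability of occurring.
Specified by: transitions in interface FullModel
Parameters:
s - the source State
a - the Action applied in the source state

public EnvironmentOutcome sample(State s, Action a)
Description copied from interface: SampleModel
Samples a transition from the transition distribution and returns it.
Specified by: sample in interface SampleModel
Parameters:
s - the source state
a - the action taken in the source state
Returns: an EnvironmentOutcome describing the sampled transition

public boolean terminal(State s)
Description copied from interface: SampleModel
Indicates whether a state is a terminal state (i.e., no further actions occur and zero reward is received from then on).
Specified by: terminal in interface SampleModel
Parameters:
s - the input state to test

protected EnvironmentOutcome modifyOutcome(EnvironmentOutcome eo)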
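
The contract for transitions says the returned list only needs to contain outcomes with non-zero probability. The sketch below illustrates that contract on a toy action that succeeds with a given probability and otherwise leaves the state unchanged. `TransitionSketch`, `TransitionsSketch`, the `String` states, and the reward values are hypothetical stand-ins, not the library's TransitionProb or State types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TransitionProb: a probability paired with
// the next state and reward of one possible outcome.
final class TransitionSketch {
    final double p;
    final String nextState;
    final double reward;

    TransitionSketch(double p, String nextState, double reward) {
        this.p = p;
        this.nextState = nextState;
        this.reward = reward;
    }
}

final class TransitionsSketch {
    // Enumerates the outcomes of a move that reaches `target` with
    // probability pSuccess (reward 1) and otherwise stays in `s` (reward 0).
    // Outcomes whose probability is zero are left out of the list.
    static List<TransitionSketch> transitions(String s, String target, double pSuccess) {
        List<TransitionSketch> out = new ArrayList<>();
        if (pSuccess > 0.0) {
            out.add(new TransitionSketch(pSuccess, target, 1.0));
        }
        if (pSuccess < 1.0) {
            out.add(new TransitionSketch(1.0 - pSuccess, s, 0.0));
        }
        return out;
    }

    // Expected one-step reward: sum of probability times reward.
    static double expectedReward(List<TransitionSketch> ts) {
        double sum = 0.0;
        for (TransitionSketch t : ts) {
            sum += t.p * t.reward;
        }
        return sum;
    }
}
```

A planner that iterates over such a list can compute expectations directly, whereas sample only yields one outcome per call.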