public class BoltzmannActor extends Actor
An Actor whose policy is a Boltzmann distribution over its learned action preferences; it uses a HashableStateFactory to perform state lookups.

Nested classes/interfaces inherited from class Policy: Policy.ActionProb, Policy.GroundedAnnotatedAction, Policy.PolicyUndefinedException
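Because the policy is a Boltzmann (softmax) distribution, each action's selection probability is proportional to the exponential of its stored preference. A minimal, self-contained sketch of that computation (illustrative arithmetic only, not BURLAP's internal code):

```java
// Illustrative only: how a Boltzmann (softmax) distribution converts
// stored action preferences into selection probabilities.
public class BoltzmannSketch {
    public static void main(String[] args) {
        double[] preferences = {1.0, 2.0, 0.5}; // hypothetical preferences for three actions
        double[] probs = new double[preferences.length];
        double normalizer = 0;
        for (int i = 0; i < preferences.length; i++) {
            probs[i] = Math.exp(preferences[i]); // exponentiate each preference
            normalizer += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= normalizer; // normalize so the probabilities sum to 1
            System.out.printf("action %d: p = %.3f%n", i, probs[i]);
        }
    }
}
```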
Modifier and Type | Field and Description
---|---
protected java.util.List<Action> | actions - The actions the agent can perform.
protected boolean | containsParameterizedActions - Indicates whether the actions that this agent can perform are parameterized.
protected Domain | domain - The domain in which this agent will act.
protected HashableStateFactory | hashingFactory - The hashing factory used to hash states and evaluate state equality.
protected LearningRate | learningRate - The learning rate used to update action preferences in response to critiques.
protected java.util.Map<HashableState,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> | preferences - A map from (hashed) states to policy nodes; each node contains the action preferences for the applicable actions in that state.
protected int | totalNumberOfSteps - The total number of learning steps performed by this agent.
Fields inherited from class Policy: annotateOptionDecomposition, evaluateDecomposesOptions
| Constructor and Description |
|---|
| BoltzmannActor(Domain domain, HashableStateFactory hashingFactory, double learningRate) - Initializes the Actor. |
Modifier and Type | Method and Description
---|---
void | addNonDomainReferencedAction(Action a) - Allows the actor to use actions that are not part of the domain definition.
AbstractGroundedAction | getAction(State s) - Returns an action sampled by the policy for the given state.
java.util.List<Policy.ActionProb> | getActionDistributionForState(State s) - Returns the action probability distribution defined by the policy for the given state.
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference | getMatchingPreference(HashableState sh, GroundedAction ga, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node) - Returns the BoltzmannActor.ActionPreference stored in a policy node.
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode | getNode(HashableState sh) - Returns the policy node that stores the action preferences for the given state.
boolean | isDefinedFor(State s) - Specifies whether this policy is defined for the input state.
boolean | isStochastic() - Indicates whether the policy is stochastic or deterministic.
void | resetData() - Resets any data created or modified during learning so that learning can begin anew.
void | setLearningRate(LearningRate lr) - Sets the learning rate function to use.
void | updateFromCritqique(CritiqueResult critqiue) - Causes this object to update its behavior in response to a critique of its behavior.
Methods inherited from class Policy: evaluateBehavior, evaluateBehavior, evaluateBehavior, evaluateBehavior, evaluateBehavior, evaluateMethodsShouldAnnotateOptionDecomposition, evaluateMethodsShouldDecomposeOption, followAndRecordPolicy, followAndRecordPolicy, getDeterministicPolicy, getProbOfAction, getProbOfActionGivenDistribution, getProbOfActionGivenDistribution, sampleFromActionDistribution
protected Domain domain
protected java.util.List<Action> actions
protected HashableStateFactory hashingFactory
protected LearningRate learningRate
protected java.util.Map<HashableState,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> preferences
protected boolean containsParameterizedActions
protected int totalNumberOfSteps
public BoltzmannActor(Domain domain, HashableStateFactory hashingFactory, double learningRate)
Initializes the Actor.
Parameters:
domain - the domain in which the agent will act
hashingFactory - the state hashing factory to use for state hashing and equality checks
learningRate - the learning rate that affects how quickly the agent adjusts its action preferences
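A minimal construction sketch; the domain source and the SimpleHashableStateFactory class name are assumptions that vary by BURLAP version:

```java
// Sketch: constructing a BoltzmannActor. The domain generator and
// SimpleHashableStateFactory are assumptions; substitute the Domain and
// HashableStateFactory implementations your BURLAP version provides.
Domain domain = myDomainGenerator.generateDomain(); // hypothetical domain source
HashableStateFactory hashingFactory = new SimpleHashableStateFactory(); // assumed implementation
BoltzmannActor actor = new BoltzmannActor(domain, hashingFactory, 0.1); // 0.1 = initial learning rate
```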
public void setLearningRate(LearningRate lr)
Sets the learning rate function to use.
Parameters:
lr - the learning rate function to use
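The double passed to the constructor presumably acts as a fixed rate; setLearningRate lets you substitute a schedule such as BURLAP's ExponentialDecayLR, though the constructor arguments below are an assumption to verify against your version:

```java
// Sketch: substituting a decaying learning rate schedule.
// ExponentialDecayLR(initialRate, decayRate) is assumed; verify against your version.
actor.setLearningRate(new ExponentialDecayLR(0.1, 0.999));
```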
public void updateFromCritqique(CritiqueResult critqiue)
Description copied from class: Actor
Causes this object to update its behavior in response to a critique of its behavior.
Specified by:
updateFromCritqique in class Actor
Parameters:
critqiue - the critique of the agent's behavior, represented by a CritiqueResult object
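Conceptually, the critique is a scalar signal (e.g., a TD error from the critic) that nudges the preference of the taken action up or down; the sketch below shows that update rule in isolation and is illustrative arithmetic, not BURLAP's exact internals:

```java
// Illustrative only: the shape of a critique-driven preference update.
double alpha = 0.1;    // learning rate
double pref = 0.5;     // stored preference for the action that was taken
double critique = 1.2; // critic's scalar evaluation, e.g., a TD error
pref += alpha * critique; // positive critiques raise the preference, negative ones lower it
```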
public void addNonDomainReferencedAction(Action a)
Description copied from class: Actor
This method allows the actor to utilize actions that are not part of the domain definition.
Specified by:
addNonDomainReferencedAction in class Actor
Parameters:
a - an action not part of the domain definition that this actor should be able to use
public AbstractGroundedAction getAction(State s)
Description copied from class: Policy
This method will return an action sampled by the policy for the given state.
Specified by:
getAction in class Policy
public java.util.List<Policy.ActionProb> getActionDistributionForState(State s)
Description copied from class: Policy
This method will return the action probability distribution defined by the policy.
Specified by:
getActionDistributionForState in class Policy
Parameters:
s - the state for which an action distribution should be returned
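The returned list can be inspected directly; this sketch assumes the actor and state from the earlier snippets and the public ga and pSelection fields of Policy.ActionProb:

```java
// Sketch: inspecting the Boltzmann action distribution for a state.
// Assumes Policy.ActionProb exposes public ga and pSelection fields.
java.util.List<Policy.ActionProb> dist = actor.getActionDistributionForState(s);
for (Policy.ActionProb ap : dist) {
    System.out.println(ap.ga + ": " + ap.pSelection);
}
```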
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode getNode(HashableState sh)
Returns the policy node that stores the action preferences for the given state.
Parameters:
sh - the (hashed) state of the BoltzmannActor.PolicyNode to return
Returns:
the BoltzmannActor.PolicyNode object for the given input state
public boolean isStochastic()
Description copied from class: Policy
Indicates whether the policy is stochastic or deterministic.
Specified by:
isStochastic in class Policy
public boolean isDefinedFor(State s)
Description copied from class: Policy
Specifies whether this policy is defined for the input state.
Specified by:
isDefinedFor in class Policy
Parameters:
s - the input state to test for whether this policy is defined
Returns:
true if this policy is defined for State s, false otherwise
public void resetData()
Description copied from class: Actor
Used to reset any data that was created/modified during learning so that learning can begin anew.
Specified by:
resetData in class Actor
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference getMatchingPreference(HashableState sh, GroundedAction ga, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node)
Returns the BoltzmannActor.ActionPreference stored in a policy node. If actions are parameterized and the domain is not name dependent, then a matching between the input state and the stored state is first found and used to match the input action parameters to the stored action parameters.
Parameters:
sh - the input state on which the input action was applied
ga - the input action for which the BoltzmannActor.ActionPreference object should be returned
node - the BoltzmannActor.PolicyNode object that contains the action preference
Returns:
the BoltzmannActor.ActionPreference object for the given action stored in the given BoltzmannActor.PolicyNode; null if it does not exist
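Putting the pieces together, one hand-rolled actor-critic step samples an action, lets a critic score the resulting transition, and feeds the critique back to the actor. The critic call and the CritiqueResult constructor arguments are assumptions; in practice BURLAP's ActorCritic learning agent drives this loop:

```java
// Sketch of a single hand-rolled actor-critic step. The critic call and the
// CritiqueResult constructor (state, action, next state, critique) are assumptions;
// normally BURLAP's ActorCritic learning agent manages this loop for you.
GroundedAction ga = (GroundedAction)actor.getAction(s);  // sample from the Boltzmann policy
State sPrime = ga.executeIn(s);                          // executeIn assumed from GroundedAction
double critiqueValue = critic.getCritique(s, ga, sPrime); // hypothetical critic API
actor.updateFromCritqique(new CritiqueResult(s, ga, sPrime, critiqueValue));
```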