public class BoltzmannActor extends Actor
StateHashFactory
to perform lookups.
Policy.ActionProb, Policy.PolicyUndefinedException, Policy.RandomPolicy
Modifier and Type | Field and Description |
---|---|
protected java.util.List<Action> |
actions
The actions the agent can perform
|
protected boolean |
containsParameterizedActions
Indicates whether the actions that this agent can perform are parameterized
|
protected Domain |
domain
The domain in which this agent will act
|
protected StateHashFactory |
hashingFactory
The hashing factory used to hash states and evaluate state equality
|
protected LearningRate |
learningRate
The learning rate used to update action preferences in response to critiques.
|
protected java.util.Map<StateHashTuple,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> |
preferences
A map from (hashed) states to policy nodes, each of which contains the action preferences
for each applicable action in the state.
|
protected int |
totalNumberOfSteps
The total number of learning steps performed by this agent.
|
annotateOptionDecomposition, evaluateDecomposesOptions
Constructor and Description |
---|
BoltzmannActor(Domain domain,
StateHashFactory hashingFactory,
double learningRate)
Initializes the Actor
|
Modifier and Type | Method and Description |
---|---|
void |
addNonDomainReferencedAction(Action a)
This method allows the actor to utilize actions that are not a part of the domain definition.
|
AbstractGroundedAction |
getAction(State s)
This method will return an action sampled by the policy for the given state.
|
java.util.List<Policy.ActionProb> |
getActionDistributionForState(State s)
This method will return the action probability distribution defined by the policy.
|
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference |
getMatchingPreference(StateHashTuple sh,
GroundedAction ga,
burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node)
Returns the
BoltzmannActor.ActionPreference that is stored in a policy node. |
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode |
getNode(StateHashTuple sh)
Returns the policy node that stores the action preferences for the given state.
|
boolean |
isDefinedFor(State s)
Specifies whether this policy is defined for the input state.
|
boolean |
isStochastic()
Indicates whether the policy is stochastic or deterministic.
|
void |
resetData()
Used to reset any data that was created/modified during learning so that learning can begin anew.
|
void |
setLearningRate(LearningRate lr)
Sets the learning rate function to use.
|
protected GroundedAction |
translateAction(GroundedAction a,
java.util.Map<java.lang.String,java.lang.String> matching)
Takes a parameterized GroundedAction and returns an action with its parameters shifted according to a provided object matching between the state in
which the action was applied and some other state's object name identifiers.
|
void |
updateFromCritqique(CritiqueResult critqiue)
Causes this object to update its behavior in response to a critique of its behavior.
|
evaluateBehavior, evaluateBehavior, evaluateBehavior, evaluateMethodsShouldAnnotateOptionDecomposition, evaluateMethodsShouldDecomposeOption, getDeterministicPolicy, getProbOfAction, getProbOfActionGivenDistribution, getProbOfActionGivenDistribution, sampleFromActionDistribution
protected Domain domain
protected java.util.List<Action> actions
protected StateHashFactory hashingFactory
protected LearningRate learningRate
protected java.util.Map<StateHashTuple,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> preferences
protected boolean containsParameterizedActions
protected int totalNumberOfSteps
public BoltzmannActor(Domain domain, StateHashFactory hashingFactory, double learningRate)
domain
- the domain in which the agent will act
hashingFactory
- the state hashing factory to use for state hashing and equality checks
learningRate
- the learning rate that affects how quickly the agent adjusts its action preferences.
public void setLearningRate(LearningRate lr)
lr
- the learning rate function to use.
public void updateFromCritqique(CritiqueResult critqiue)
Actor
updateFromCritqique
in class Actor
critqiue
- the critique of the agent's behavior represented by a CritiqueResult object
public void addNonDomainReferencedAction(Action a)
Actor
addNonDomainReferencedAction
in class Actor
a
- an action not a part of the domain definition that this actor should be able to use.
public AbstractGroundedAction getAction(State s)
Policy
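The role of the learning rate in updateFromCritqique can be illustrated with a minimal, self-contained sketch. It does not use BURLAP classes: PreferenceUpdateSketch, its string action names, and the rule "preference += learningRate * critique" are assumptions made for illustration, and the library's actual update may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a tabular actor; this is not BURLAP code. */
class PreferenceUpdateSketch {
    // Action name -> preference for a single state (assumed layout).
    private final Map<String, Double> preferences = new HashMap<>();
    private final double learningRate;

    PreferenceUpdateSketch(double learningRate) {
        this.learningRate = learningRate;
    }

    /** Assumed update rule: move the preference by learningRate * critique. */
    void update(String action, double critique) {
        preferences.merge(action, learningRate * critique, Double::sum);
    }

    /** Preferences that were never updated are treated as 0.0. */
    double preference(String action) {
        return preferences.getOrDefault(action, 0.0);
    }

    public static void main(String[] args) {
        PreferenceUpdateSketch actor = new PreferenceUpdateSketch(0.5);
        actor.update("north", 2.0);   // positive critique raises the preference
        actor.update("north", -1.0);  // negative critique lowers it again
        System.out.println(actor.preference("north"));
    }
}
```

A larger learning rate makes each critique shift the preference further, which matches the constructor's description of how quickly the agent adjusts its action preferences.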
public java.util.List<Policy.ActionProb> getActionDistributionForState(State s)
Policy
getActionDistributionForState
in class Policy
s
- the state for which an action distribution should be returned
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode getNode(StateHashTuple sh)
sh
- The (hashed) state of the BoltzmannActor.PolicyNode to return
the BoltzmannActor.PolicyNode object for the given input state.
public boolean isStochastic()
Policy
isStochastic
in class Policy
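As a rough illustration of what getActionDistributionForState and getAction compute for a Boltzmann policy, the sketch below turns a list of action preferences into probabilities with a softmax and then samples an action index. It is plain Java and not the library's code: the class name, the array representation, and a temperature of 1 are assumptions.

```java
import java.util.Random;

/** Hypothetical sketch of a Boltzmann (softmax) policy over action preferences. */
class BoltzmannDistributionSketch {

    /** Returns exp(p_i) / sum_j exp(p_j); the maximum is subtracted for numerical stability. */
    static double[] actionProbabilities(double[] preferences) {
        double max = Double.NEGATIVE_INFINITY;
        for (double p : preferences) {
            max = Math.max(max, p);
        }
        double[] probs = new double[preferences.length];
        double sum = 0.0;
        for (int i = 0; i < preferences.length; i++) {
            probs[i] = Math.exp(preferences[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    /** Samples an action index by walking the cumulative distribution. */
    static int sample(double[] probs, Random rng) {
        double roll = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (roll < cumulative) {
                return i;
            }
        }
        return probs.length - 1; // guards against floating-point rounding
    }

    public static void main(String[] args) {
        double[] probs = actionProbabilities(new double[] {1.0, 0.0, -1.0});
        int chosen = sample(probs, new Random(42));
        System.out.println("sampled action index: " + chosen);
    }
}
```

Because every action keeps a nonzero probability under a softmax, such a policy is stochastic rather than deterministic.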
public boolean isDefinedFor(State s)
Policy
isDefinedFor
in class Policy
s
- the input state to test for whether this policy is defined
true if this policy is defined for State s, false otherwise.
public void resetData()
Actor
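The tabular storage behind the preferences field, getNode, and resetData can be pictured with the following sketch. It is only an assumption-laden illustration: string keys stand in for StateHashTuple, a double array stands in for a policy node, and creating a zero-initialized node on first lookup is a guess about the behavior, not something this page states.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a preference table keyed by hashed states. */
class PreferenceTableSketch {
    private final List<String> actions;
    // The String key stands in for a StateHashTuple; the array holds one preference per action.
    private final Map<String, double[]> preferences = new HashMap<>();

    PreferenceTableSketch(List<String> actions) {
        this.actions = actions;
    }

    /** Returns the node for a state, creating it with zero preferences if absent (assumed behavior). */
    double[] getNode(String stateKey) {
        return preferences.computeIfAbsent(stateKey, k -> new double[actions.size()]);
    }

    /** Number of states that currently have a stored node. */
    int size() {
        return preferences.size();
    }

    /** Discards everything learned so far so that learning can begin anew. */
    void resetData() {
        preferences.clear();
    }

    public static void main(String[] args) {
        PreferenceTableSketch table = new PreferenceTableSketch(List.of("north", "south"));
        table.getNode("s0")[0] += 0.1;
        System.out.println("states stored: " + table.size());
        table.resetData();
        System.out.println("states stored after reset: " + table.size());
    }
}
```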
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference getMatchingPreference(StateHashTuple sh, GroundedAction ga, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node)
Returns the BoltzmannActor.ActionPreference
that is stored in a policy node. If actions are parameterized and the domain is not name dependent,
then a matching between the input state and stored state is first found and used to match the input action parameters to the stored action parameters.
sh
- the input state on which the input action was applied
ga
- the input action for which the BoltzmannActor.ActionPreference object should be returned.
node
- the BoltzmannActor.PolicyNode object that contains the action preference.
the BoltzmannActor.ActionPreference object for the given action stored in the given BoltzmannActor.PolicyNode; null if it does not exist.
protected GroundedAction translateAction(GroundedAction a, java.util.Map<java.lang.String,java.lang.String> matching)
a
- the source action
matching
- a matching from objects in the state in which the source action was applied to corresponding objects in some other state's object name identifiers
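The parameter remapping that translateAction describes can be sketched without BURLAP types. In the example below, an action's parameters are a plain String array and the matching is a Map from object names in the source state to object names in the other state. The class and method names are hypothetical, and leaving unmatched names unchanged is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of renaming action parameters through an object matching. */
class ActionTranslationSketch {

    /** Returns a new parameter array with each name replaced by its match, if one exists. */
    static String[] translateParams(String[] params, Map<String, String> matching) {
        String[] translated = new String[params.length];
        for (int i = 0; i < params.length; i++) {
            // Assumption: names without a match are carried over unchanged.
            translated[i] = matching.getOrDefault(params[i], params[i]);
        }
        return translated;
    }

    public static void main(String[] args) {
        Map<String, String> matching = new HashMap<>();
        matching.put("block0", "blockA");
        matching.put("block1", "blockB");
        String[] result = translateParams(new String[] {"block1", "block0"}, matching);
        System.out.println(Arrays.toString(result));
    }
}
```

The source array is not modified, so the original action can still be used with the state in which it was applied.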