public class BoltzmannActor extends Actor implements EnumerablePolicy

An Actor that selects actions according to a Boltzmann distribution over learned action preferences. It uses a HashableStateFactory to perform state lookups.

Field Summary

Modifier and Type | Field and Description
---|---
protected java.util.List<ActionType> | actionTypes - The actions the agent can perform.
protected boolean | containsParameterizedActions - Indicates whether the actions that this agent can perform are parameterized.
protected Domain | domain - The domain in which this agent will act.
protected HashableStateFactory | hashingFactory - The hashing factory used to hash states and evaluate state equality.
protected LearningRate | learningRate - The learning rate used to update action preferences in response to critiques.
protected java.util.Map<HashableState,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> | preferences - A map from (hashed) states to policy nodes, each of which contains the action preferences for the applicable actions in that state.
protected int | totalNumberOfSteps - The total number of learning steps performed by this agent.
Constructor and Description
---|
BoltzmannActor(SADomain domain, HashableStateFactory hashingFactory, double learningRate) - Initializes the Actor.
Modifier and Type | Method and Description
---|---
Action | action(State s) - Returns an action sampled from the policy for the given state.
double | actionProb(State s, Action a) - Returns the probability/probability density that the given action will be taken in the given state.
void | addNonDomainReferencedAction(ActionType a) - Allows the actor to use actions that are not part of the domain definition.
boolean | definedFor(State s) - Specifies whether this policy is defined for the input state.
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference | getMatchingPreference(HashableState sh, Action a, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node) - Returns the BoltzmannActor.ActionPreference stored in a policy node.
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode | getNode(HashableState sh) - Returns the policy node that stores the action preferences for a state.
java.util.List<ActionProb> | policyDistribution(State s) - Returns the action probability distribution defined by the policy.
void | resetData() - Resets any data created or modified during learning so that learning can begin anew.
void | setLearningRate(LearningRate lr) - Sets the learning rate function to use.
void | updateFromCritique(CritiqueResult critique) - Causes this object to update its behavior in response to a critique of its behavior.
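The class name describes the policy itself: action probabilities are obtained by applying a Boltzmann (softmax) transform to the stored action preferences, so an action's probability is proportional to exp of its preference. The sketch below is illustrative only; `boltzmannDistribution` is a hypothetical helper, not part of the BURLAP API, showing the relationship that `policyDistribution(State)` and `actionProb(State, Action)` expose.

```java
import java.util.Arrays;

public class BoltzmannSketch {

    // Hypothetical helper (not in BURLAP): converts raw action preferences
    // into a Boltzmann (softmax) probability distribution.
    static double[] boltzmannDistribution(double[] preferences) {
        // Subtracting the max preference is a standard numerical-stability trick;
        // it does not change the resulting distribution.
        double max = Arrays.stream(preferences).max().orElse(0.0);
        double[] probs = new double[preferences.length];
        double sum = 0.0;
        for (int i = 0; i < preferences.length; i++) {
            probs[i] = Math.exp(preferences[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Equal preferences yield a uniform distribution.
        System.out.println(Arrays.toString(
                boltzmannDistribution(new double[]{1.0, 1.0, 1.0})));
        // A higher preference yields an exponentially higher probability.
        System.out.println(Arrays.toString(
                boltzmannDistribution(new double[]{2.0, 0.0})));
    }
}
```

Because the transform only depends on preference differences, uniformly shifting all preferences leaves the policy unchanged; only relative preferences matter.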
protected Domain domain
protected java.util.List<ActionType> actionTypes
protected HashableStateFactory hashingFactory
protected LearningRate learningRate
protected java.util.Map<HashableState,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> preferences
protected boolean containsParameterizedActions
protected int totalNumberOfSteps
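The preferences map above suggests a lazily grown table: a state's PolicyNode is created with default preferences for each applicable action the first time that state is queried, which is the pattern getNode(HashableState) serves. A minimal sketch of that lazy-creation pattern, using String keys as stand-ins for HashableState and a hypothetical PolicyNode stand-in (neither is the BURLAP implementation):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LazyNodeSketch {

    // Hypothetical stand-in for BoltzmannActor.PolicyNode: the per-state
    // container of action preferences.
    static class PolicyNode {
        final Map<String, Double> actionPreferences = new HashMap<>();
    }

    // String keys stand in for hashed states produced by a HashableStateFactory.
    final Map<String, PolicyNode> preferences = new HashMap<>();
    final List<String> actionTypes;

    LazyNodeSketch(List<String> actionTypes) {
        this.actionTypes = actionTypes;
    }

    // Lazy lookup: on first query for a state, create its node and give
    // every applicable action a default preference of 0.
    PolicyNode getNode(String hashedState) {
        return preferences.computeIfAbsent(hashedState, s -> {
            PolicyNode node = new PolicyNode();
            for (String a : actionTypes) {
                node.actionPreferences.put(a, 0.0);
            }
            return node;
        });
    }

    public static void main(String[] args) {
        LazyNodeSketch actor = new LazyNodeSketch(List.of("north", "south"));
        System.out.println(actor.preferences.size()); // 0: nothing stored yet
        actor.getNode("s0");
        System.out.println(actor.preferences.size()); // 1: node created on demand
    }
}
```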
public BoltzmannActor(SADomain domain, HashableStateFactory hashingFactory, double learningRate)
Parameters:
domain - the domain in which the agent will act
hashingFactory - the state hashing factory to use for state hashing and equality checks
learningRate - the learning rate that affects how quickly the agent adjusts its action preferences

public void setLearningRate(LearningRate lr)
Parameters:
lr - the learning rate function to use

public void updateFromCritique(CritiqueResult critique)
Specified by:
updateFromCritique in class Actor
Parameters:
critique - the critique of the agent's behavior, represented by a CritiqueResult object

public void addNonDomainReferencedAction(ActionType a)
Specified by:
addNonDomainReferencedAction in class Actor
Parameters:
a - an action that is not part of the domain definition that this actor should be able to use

public Action action(State s)
Specified by:
action in interface Policy
public double actionProb(State s, Action a)
Specified by:
actionProb in interface Policy
Parameters:
s - the state of interest
a - the action that may be taken in the state

public java.util.List<ActionProb> policyDistribution(State s)
Specified by:
policyDistribution in interface EnumerablePolicy
Parameters:
s - the state for which an action distribution should be returned

protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode getNode(HashableState sh)
Parameters:
sh - the (hashed) state of the BoltzmannActor.PolicyNode to return
Returns:
the BoltzmannActor.PolicyNode object for the given input state

public boolean definedFor(State s)
Specified by:
definedFor in interface Policy
Parameters:
s - the input state to test for whether this policy is defined
Returns:
true if this policy is defined for State s; false otherwise

public void resetData()
Specified by:
resetData in class Actor
protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference getMatchingPreference(HashableState sh, Action a, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node)
Returns the BoltzmannActor.ActionPreference that is stored in a policy node. If actions are parameterized and the domain is not name dependent, then a matching between the input state and the stored state is first found and used to match the input action parameters to the stored action parameters.
Parameters:
sh - the input state on which the input action was applied
a - the input action for which the BoltzmannActor.ActionPreference object should be returned
node - the BoltzmannActor.PolicyNode object that contains the action preference
Returns:
the BoltzmannActor.ActionPreference object for the given action stored in the given BoltzmannActor.PolicyNode; null if it does not exist
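In actor-critic learning, updateFromCritique is where the actor moves the critiqued action's preference in the direction of the critic's evaluation, scaled by the learning rate; the Boltzmann policy then turns those adjusted preferences into new action probabilities. The sketch below shows that standard preference update only; the names and the fixed scalar learning rate are illustrative assumptions, whereas the real class stores preferences per hashed state and supports LearningRate schedules.

```java
import java.util.HashMap;
import java.util.Map;

public class PreferenceUpdateSketch {

    // Hypothetical flat store of preferences keyed by a state-action label;
    // the real class keeps these inside per-state PolicyNode objects.
    final Map<String, Double> preferences = new HashMap<>();
    final double learningRate; // stand-in for a LearningRate schedule

    PreferenceUpdateSketch(double learningRate) {
        this.learningRate = learningRate;
    }

    // Standard actor-critic preference update: shift the critiqued
    // action's preference by learningRate * critique.
    void updateFromCritique(String stateAction, double critique) {
        double p = preferences.getOrDefault(stateAction, 0.0);
        preferences.put(stateAction, p + learningRate * critique);
    }

    public static void main(String[] args) {
        PreferenceUpdateSketch actor = new PreferenceUpdateSketch(0.1);
        actor.updateFromCritique("s0:east", 1.0);  // positive critique raises preference
        actor.updateFromCritique("s0:west", -1.0); // negative critique lowers it
        System.out.println(actor.preferences);
    }
}
```

Repeated positive critiques steadily raise an action's preference, and through the Boltzmann transform its selection probability, which is how exploration concentrates on well-critiqued actions over time.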