public class BoltzmannActor extends Actor implements EnumerablePolicy
An Actor whose policy is a Boltzmann distribution over learned action preferences; it uses a HashableStateFactory to perform lookups.

| Modifier and Type | Field and Description |
|---|---|
| protected java.util.List<ActionType> | actionTypes: The actions the agent can perform. |
| protected boolean | containsParameterizedActions: Indicates whether the actions this agent can perform are parameterized. |
| protected Domain | domain: The domain in which this agent will act. |
| protected HashableStateFactory | hashingFactory: The hashing factory used to hash states and evaluate state equality. |
| protected LearningRate | learningRate: The learning rate used to update action preferences in response to critiques. |
| protected java.util.Map<HashableState,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> | preferences: A map from (hashed) states to policy nodes, each of which contains the action preferences for the applicable actions in that state. |
| protected int | totalNumberOfSteps: The total number of learning steps performed by this agent. |
| Constructor and Description |
|---|
| BoltzmannActor(SADomain domain, HashableStateFactory hashingFactory, double learningRate): Initializes the Actor. |
| Modifier and Type | Method and Description | 
|---|---|
| Action | action(State s): Returns an action sampled from the policy for the given state. |
| double | actionProb(State s, Action a): Returns the probability/probability density that the given action will be taken in the given state. |
| void | addNonDomainReferencedAction(ActionType a): Allows the actor to use actions that are not part of the domain definition. |
| boolean | definedFor(State s): Specifies whether this policy is defined for the input state. |
| protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference | getMatchingPreference(HashableState sh, Action a, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node): Returns the BoltzmannActor.ActionPreference stored in a policy node. |
| protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode | getNode(HashableState sh): Returns the policy node that stores the action preferences for a state. |
| java.util.List<ActionProb> | policyDistribution(State s): Returns the action probability distribution defined by the policy. |
| void | resetData(): Resets any data created or modified during learning so that learning can begin anew. |
| void | setLearningRate(LearningRate lr): Sets the learning rate function to use. |
| void | updateFromCritique(CritiqueResult critique): Causes this object to update its behavior in response to a critique of its behavior. |
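The policy methods above (action, actionProb, policyDistribution) follow a Boltzmann (softmax) distribution over the stored action preferences. The following is a minimal, self-contained sketch in plain Java of how such a distribution can be computed; the class and method names are hypothetical and this is not the BURLAP implementation itself:

```java
import java.util.Arrays;

// Hypothetical sketch of a Boltzmann (softmax) distribution over action
// preferences, independent of BURLAP.
class BoltzmannSketch {

    // Converts raw action preferences into a probability distribution.
    // Larger preferences receive exponentially more probability mass.
    static double[] boltzmannDistribution(double[] preferences) {
        double max = Arrays.stream(preferences).max().orElse(0.0);
        double[] probs = new double[preferences.length];
        double sum = 0.0;
        for (int i = 0; i < preferences.length; i++) {
            // Subtracting the max is a standard trick for numerical stability.
            probs[i] = Math.exp(preferences[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] probs = boltzmannDistribution(new double[]{1.0, 2.0, 3.0});
        System.out.println(Arrays.toString(probs));
    }
}
```

Sampling an action (as the action(State s) method does) then amounts to drawing an index from this distribution, while actionProb returns a single entry of it.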
protected Domain domain
protected java.util.List<ActionType> actionTypes
protected HashableStateFactory hashingFactory
protected LearningRate learningRate
protected java.util.Map<HashableState,burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode> preferences
protected boolean containsParameterizedActions
protected int totalNumberOfSteps
public BoltzmannActor(SADomain domain, HashableStateFactory hashingFactory, double learningRate)
domain - the domain in which the agent will act
hashingFactory - the state hashing factory to use for state hashing and equality checks
learningRate - the learning rate that affects how quickly the agent adjusts its action preferences

public void setLearningRate(LearningRate lr)
lr - the learning rate function to use

public void updateFromCritique(CritiqueResult critique)
Overrides: updateFromCritique in class Actor
critique - the critique of the agent's behavior represented by a CritiqueResult object

public void addNonDomainReferencedAction(ActionType a)
Overrides: addNonDomainReferencedAction in class Actor
a - an action not part of the domain definition that this actor should be able to use

public Action action(State s)
Specified by: action in interface Policy

public double actionProb(State s, Action a)
Specified by: actionProb in interface Policy
s - the state of interest
a - the action that may be taken in the state

public java.util.List<ActionProb> policyDistribution(State s)
Specified by: policyDistribution in interface EnumerablePolicy
s - the state for which an action distribution should be returned

protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode getNode(HashableState sh)
sh - the (hashed) state of the BoltzmannActor.PolicyNode to return
Returns: the BoltzmannActor.PolicyNode object for the given input state

public boolean definedFor(State s)
Specified by: definedFor in interface Policy
s - the input state to test for whether this policy is defined
Returns: true if this policy is defined for State s, false otherwise

public void resetData()
Overrides: resetData in class Actor

protected burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.ActionPreference getMatchingPreference(HashableState sh, Action a, burlap.behavior.singleagent.learning.actorcritic.actor.BoltzmannActor.PolicyNode node)
Returns the BoltzmannActor.ActionPreference that is stored in a policy node. If actions are parameterized and the domain is not name dependent, then a matching between the input state and stored state is first found and used to match the input action parameters to the stored action parameters.
sh - the input state on which the input action was applied
a - the input action for which the BoltzmannActor.ActionPreference object should be returned
node - the BoltzmannActor.PolicyNode object that contains the action preference
Returns: the BoltzmannActor.ActionPreference object for the given action stored in the given BoltzmannActor.PolicyNode; null if it does not exist
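The heart of updateFromCritique is adjusting the critiqued action's stored preference in proportion to the learning rate. The following is a hedged, self-contained sketch of that actor-critic preference update in plain Java; the class, field, and state/action representations here are hypothetical stand-ins, not BURLAP's own types:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an actor-critic preference update: the critic's
// critique value, scaled by the learning rate, moves the preference of the
// action taken in a state. Not the BURLAP implementation.
class PreferenceUpdateSketch {

    // Stand-in for the preferences map: state -> (action -> preference).
    private final Map<String, Map<String, Double>> preferences = new HashMap<>();
    private final double learningRate;

    PreferenceUpdateSketch(double learningRate) {
        this.learningRate = learningRate;
    }

    // Increases (or decreases) the preference for taking `action` in `state`
    // according to the sign and magnitude of the critique.
    void updateFromCritique(String state, String action, double critique) {
        Map<String, Double> node =
                preferences.computeIfAbsent(state, s -> new HashMap<>());
        double p = node.getOrDefault(action, 0.0);
        node.put(action, p + learningRate * critique);
    }

    double preference(String state, String action) {
        return preferences.getOrDefault(state, new HashMap<>())
                          .getOrDefault(action, 0.0);
    }

    public static void main(String[] args) {
        PreferenceUpdateSketch actor = new PreferenceUpdateSketch(0.1);
        actor.updateFromCritique("s0", "north", 1.0);
        System.out.println(actor.preference("s0", "north"));
    }
}
```

Combined with a Boltzmann distribution over these preferences, positively critiqued actions become more probable over time, which is the core actor-critic learning loop this class participates in.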