public class UCTTreeWalkPolicy extends java.lang.Object implements SolverDerivedPolicy, EnumerablePolicy
Constructor and Description 

UCTTreeWalkPolicy(UCT planner)
Initializes the policy with the UCT valueFunction.

Modifier and Type  Method and Description 

Action 
action(State s)
This method will return an action sampled by the policy for the given state.

double 
actionProb(State s, Action a)
Returns the probability/probability density that the given action will be taken in the given state.

void 
computePolicyFromTree()
Computes a hash-backed policy for every state visited along the greedy path of the UCT tree.

boolean 
definedFor(State s)
Specifies whether this policy is defined for the input state.

protected UCTActionNode 
getQGreedyNode(UCTStateNode snode)
Returns the UCTActionNode with the highest average sample return.

java.util.List<ActionProb> 
policyDistribution(State s)
This method will return the action probability distribution defined by the policy.

void 
setSolver(MDPSolverInterface solver)
Sets the valueFunction whose results affect this policy.

public UCTTreeWalkPolicy(UCT planner)
Parameters:
planner - the UCT valueFunction whose tree should be walked.

public void setSolver(MDPSolverInterface solver)
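Description copied from interface: SolverDerivedPolicy
Sets the valueFunction whose results affect this policy.
Specified by:
setSolver in interface SolverDerivedPolicy
Parameters:
solver - the solver from which this policy is derived

public void computePolicyFromTree()
Computes a hash-backed policy for every state visited along the greedy path of the UCT tree.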
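
The greedy-path walk that computePolicyFromTree() is described as performing can be illustrated with a minimal sketch. This is not BURLAP's implementation: StateNode, ActionNode, qGreedy, and the String keys are hypothetical stand-ins for UCTStateNode, UCTActionNode, getQGreedyNode, and hashed states, and each action node is assumed to have a single successor, which a real UCT tree need not satisfy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the UCT tree nodes; these are not BURLAP's classes.
final class GreedyPathSketch {

    static final class StateNode {
        final String state;                                  // stand-in for a hashed State
        final List<ActionNode> actions = new ArrayList<>();
        StateNode(String state) { this.state = state; }
    }

    static final class ActionNode {
        final String action;
        final double sumReturn;   // total of the sampled returns
        final int n;              // number of times the action was sampled
        final StateNode next;     // single successor: a simplifying assumption

        ActionNode(String action, double sumReturn, int n, StateNode next) {
            this.action = action;
            this.sumReturn = sumReturn;
            this.n = n;
            this.next = next;
        }

        // Untried actions rank last so they are never preferred.
        double averageReturn() {
            return n == 0 ? Double.NEGATIVE_INFINITY : sumReturn / n;
        }
    }

    // Picks the action node with the highest average sample return (no exploration bonus).
    static ActionNode qGreedy(StateNode s) {
        ActionNode best = null;
        for (ActionNode a : s.actions) {
            if (best == null || a.averageReturn() > best.averageReturn()) {
                best = a;
            }
        }
        return best;
    }

    // Walks the greedy path from the root and records state -> action in a hash map.
    static Map<String, String> computePolicyFromTree(StateNode root) {
        Map<String, String> policy = new HashMap<>();
        StateNode cur = root;
        // Stop at a leaf, or when a state repeats (guards against cycles).
        while (cur != null && !policy.containsKey(cur.state)) {
            ActionNode best = qGreedy(cur);
            if (best == null) {
                break; // no expanded actions for this state
            }
            policy.put(cur.state, best.action);
            cur = best.next;
        }
        return policy;
    }
}
```

Only states on the greedy path receive an entry; states reachable solely through non-greedy actions are left out of the map in this sketch.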

protected UCTActionNode getQGreedyNode(UCTStateNode snode)
Returns the UCTActionNode with the highest average sample return. Note that this does not use the upper confidence since planning is completed.
Parameters:
snode - the UCTStateNode for which to get the best UCTActionNode.
Returns:
the UCTActionNode with the highest average sample return.

public Action action(State s)
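Description copied from interface: Policy
This method will return an action sampled by the policy for the given state.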
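
The note on getQGreedyNode, that the upper confidence is not used once planning is completed, can be made concrete with a small sketch. It contrasts selection by average sample return with selection by a UCB1-style score. The ActionStats class, the method names, and the exact form of the exploration bonus are assumptions made for illustration and are not taken from BURLAP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the names and the UCB1-style bonus are assumptions.
final class QGreedySketch {

    static final class ActionStats {
        final String name;
        final double sumReturn; // total of the sampled returns
        final int n;            // number of samples; assumed > 0 in this sketch

        ActionStats(String name, double sumReturn, int n) {
            this.name = name;
            this.sumReturn = sumReturn;
            this.n = n;
        }

        double averageReturn() {
            return sumReturn / n;
        }

        // Average return plus an exploration bonus that shrinks as n grows.
        double upperConfidence(int parentVisits, double c) {
            return averageReturn() + c * Math.sqrt(Math.log(parentVisits) / n);
        }
    }

    // Selection after planning: highest average sample return only.
    static ActionStats qGreedy(List<ActionStats> actions) {
        ActionStats best = null;
        for (ActionStats a : actions) {
            if (best == null || a.averageReturn() > best.averageReturn()) {
                best = a;
            }
        }
        return best;
    }

    // Selection during planning: highest upper confidence score.
    static ActionStats ucbChoice(List<ActionStats> actions, double c) {
        int parentVisits = 0;
        for (ActionStats a : actions) {
            parentVisits += a.n;
        }
        ActionStats best = null;
        for (ActionStats a : actions) {
            if (best == null
                    || a.upperConfidence(parentVisits, c) > best.upperConfidence(parentVisits, c)) {
                best = a;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<ActionStats> actions = new ArrayList<>();
        actions.add(new ActionStats("a", 9.0, 10)); // average 0.9, well sampled
        actions.add(new ActionStats("b", 1.0, 2));  // average 0.5, rarely sampled
        System.out.println("greedy: " + qGreedy(actions).name);
        System.out.println("ucb:    " + ucbChoice(actions, 2.0).name);
    }
}
```

With these numbers the rarely sampled action wins under the upper confidence score while the well-sampled action wins on average return, which is why a policy extracted after planning would prefer the latter criterion.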
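
public double actionProb(State s, Action a)
Description copied from interface: Policy
Returns the probability/probability density that the given action will be taken in the given state.
Specified by:
actionProb in interface Policy
Parameters:
s - the state of interest
a - the action that may be taken in the state

public java.util.List<ActionProb> policyDistribution(State s)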
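Description copied from interface: EnumerablePolicy
This method will return the action probability distribution defined by the policy.
Specified by:
policyDistribution in interface EnumerablePolicy
Parameters:
s - the state for which an action distribution should be returned

public boolean definedFor(State s)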
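Description copied from interface: Policy
Specifies whether this policy is defined for the input state.
Specified by:
definedFor in interface Policy
Parameters:
s - the input state to test for whether this policy is defined
Returns:
true if this policy is defined for State s, false otherwise.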
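
How definedFor, policyDistribution, and actionProb might relate for a hash-backed policy is sketched below. The sketch assumes the policy is deterministic (probability 1 on the stored action) and that an undefined state yields an empty distribution; neither behavior is stated on this page, and the class, the ActionProbEntry type, and the String keys are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a deterministic, hash-backed policy; not BURLAP's class.
final class HashBackedPolicySketch {

    // Stand-in for an (action, probability) pair.
    static final class ActionProbEntry {
        final String action;
        final double prob;

        ActionProbEntry(String action, double prob) {
            this.action = action;
            this.prob = prob;
        }
    }

    private final Map<String, String> policy = new HashMap<>();

    void put(String state, String action) {
        policy.put(state, action);
    }

    // Defined only for states that have an entry in the map.
    boolean definedFor(String s) {
        return policy.containsKey(s);
    }

    // Assumption: all probability mass sits on the single stored action.
    List<ActionProbEntry> policyDistribution(String s) {
        String a = policy.get(s);
        if (a == null) {
            return Collections.emptyList(); // assumption: empty for undefined states
        }
        return Collections.singletonList(new ActionProbEntry(a, 1.0));
    }

    // Looks the action up in the distribution; absent actions get probability 0.
    double actionProb(String s, String a) {
        for (ActionProbEntry e : policyDistribution(s)) {
            if (e.action.equals(a)) {
                return e.prob;
            }
        }
        return 0.0;
    }
}
```

Deriving actionProb from policyDistribution keeps the two methods consistent by construction, at the cost of building a small list on each call.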