public class CartPoleDomain extends java.lang.Object implements DomainGenerator
Setting the isFiniteTrack parameter of physParams to false makes the track infinite. The infinite track is handled by never changing the position value of the cart. All model/physics parameters are stored in the CartPoleDomain.CPPhysicsParams object physParams. Modifying this generator's model parameters
will not affect previously generated domains, so the same generator can be used to generate different domains without affecting others.
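The isolation between a generator and the domains it has already produced can be illustrated with a small sketch. This is not the library's code: Params, GeneratedDomain, and Generator are hypothetical stand-ins, and copying the parameters at generation time is only one way such isolation might be achieved.

```java
/** Hypothetical sketch: each generated domain keeps its own snapshot of the parameters. */
final class GeneratorIsolationSketch {

    /** Hypothetical stand-in for a physics parameter object. */
    static final class Params {
        double gravity = 9.8;
        boolean isFiniteTrack = true;

        /** Returns an independent copy of the current parameter values. */
        Params copy() {
            Params p = new Params();
            p.gravity = this.gravity;
            p.isFiniteTrack = this.isFiniteTrack;
            return p;
        }
    }

    /** Hypothetical stand-in for a generated domain. */
    static final class GeneratedDomain {
        final Params params;

        GeneratedDomain(Params snapshot) {
            this.params = snapshot;
        }
    }

    /** Hypothetical generator with a publicly modifiable parameter object. */
    static final class Generator {
        final Params physParams = new Params();

        /** Copies the parameters so later edits to physParams do not reach this domain. */
        GeneratedDomain generateDomain() {
            return new GeneratedDomain(physParams.copy());
        }
    }
}
```

With this arrangement, a domain generated before physParams is changed keeps the old values, while a domain generated afterwards sees the new ones.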
By default, this implementation will use the simulation described by Florian, which corrects two problems in the classic Barto, Sutton, and Anderson paper.
The two problems were (1) gravity was specified as negative in the equations when it should have been positive and (2) friction was not calculated
correctly. However, this domain may also be set to use the classic incorrect mechanics or the classic mechanics with correct gravity for comparison
purposes. To do so, use the methods setToIncorrectClassicModel() and setToIncorrectClassicModelWithCorrectGravity(). Note that
when incorrect gravity is used, the pole will "bounce" once it reaches about 90 degrees (though in most tasks the pole is never allowed to fall this far).
This domain consists of a single object with 4 real-valued attributes: the x position of the cart, the x velocity of the cart, the angle between the pole
and the vertical axis, and the angular velocity of the pole. When the corrected physics are used, a 5th hidden attribute is also included
that maintains the sign of the normal force in the last step. If the classic mechanics are used instead,
this hidden attribute is not included.
The physics are simulated using a non-linear differential equation
that is estimated using Euler's Method. All system parameters default to those used in the
original paper, but they may be modified as desired.
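As an illustration of what an Euler's Method estimate of these dynamics looks like, here is a minimal sketch of a single integration step. It is not this class's implementation: it uses the frictionless cart-pole equations with the corrected gravity sign, so it omits the friction terms and the hidden normal force sign of the corrected model. The class name, the constants (values commonly cited for the classic task), and the array layout of the state are all assumptions made for the example.

```java
/** Hypothetical sketch of one forward-Euler step for a frictionless cart-pole. */
final class EulerCartPoleSketch {
    // Assumed values, commonly cited for the classic task; not read from the library.
    static final double GRAVITY = 9.8;          // m/s^2, positive
    static final double CART_MASS = 1.0;        // kg
    static final double POLE_MASS = 0.1;        // kg
    static final double HALF_POLE_LENGTH = 0.5; // m
    static final double TIME_DELTA = 0.02;      // s per simulation step

    /**
     * Advances the state by one time step.
     * The state layout {x, xv, a, av} is an assumption for this sketch:
     * cart position, cart velocity, pole angle from vertical, pole angular velocity.
     */
    static double[] eulerStep(double[] s, double force) {
        double x = s[0], xv = s[1], a = s[2], av = s[3];
        double totalMass = CART_MASS + POLE_MASS;
        double sin = Math.sin(a);
        double cos = Math.cos(a);

        // Second derivative of the angle (frictionless form).
        double temp = (-force - POLE_MASS * HALF_POLE_LENGTH * av * av * sin) / totalMass;
        double a2 = (GRAVITY * sin + cos * temp)
                / (HALF_POLE_LENGTH * (4.0 / 3.0 - POLE_MASS * cos * cos / totalMass));

        // Second derivative of the cart position.
        double x2 = (force + POLE_MASS * HALF_POLE_LENGTH * (av * av * sin - a2 * cos)) / totalMass;

        // Forward Euler: every quantity advances using the derivatives at the start of the step.
        return new double[] {
            x + TIME_DELTA * xv,
            xv + TIME_DELTA * x2,
            a + TIME_DELTA * av,
            av + TIME_DELTA * a2
        };
    }
}
```

Because forward Euler uses the derivatives from the start of the step, pushing a resting, upright cart-pole changes only the two velocities after one step; the position and angle move on the following step.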
Also included with this class are default reward function and terminal function classes for this domain.
Running the main method of this class will launch an interactive visualizer in which the 'a' and 'd' keys apply force to the left and to the right,
respectively.
1. Florian, Razvan V. "Correct equations for the dynamics of the cart-pole system." Center for Cognitive and Neural Studies (Coneural), Romania (2007).
2. Barto, Andrew G., Richard S. Sutton, and Charles W. Anderson. "Neuronlike adaptive elements that can solve difficult learning control problems."
Systems, Man and Cybernetics, IEEE Transactions on 5 (1983): 834-846.

Modifier and Type | Class and Description |
---|---|
static class | CartPoleDomain.CartPoleRewardFunction: A default reward function for this task. |
static class | CartPoleDomain.CartPoleTerminalFunction: A default terminal function for this domain. |
static class | CartPoleDomain.CPPhysicsParams |
protected static class | CartPoleDomain.MovementAction: A movement action which applies force in the specified direction. |

Modifier and Type | Field and Description |
---|---|
static java.lang.String | ACTIONLEFT: A constant for the name of the left action |
static java.lang.String | ACTIONRIGHT: A constant for the name of the right action |
static java.lang.String | ATTANGLE: A constant for the name of the angle attribute |
static java.lang.String | ATTANGLEV: A constant for the name of the angle velocity |
static java.lang.String | ATTNORMSGN: Attribute name for maintaining the direction sign of the force normal. |
static java.lang.String | ATTV: A constant for the name of the position velocity |
static java.lang.String | ATTX: A constant for the name of the position attribute |
static java.lang.String | CLASSCARTPOLE: A constant for the name of the cart and pole object to be moved |
CartPoleDomain.CPPhysicsParams | physParams: An object specifying the physics parameters for the cart pole domain. |

Constructor and Description |
---|
CartPoleDomain() |

Modifier and Type | Method and Description |
---|---|
Domain | generateDomain(): Returns a newly instanced Domain object |
protected static double | getAngle2ndDeriv(double xv0, double a0, double av0, double nsign, double f, CartPoleDomain.CPPhysicsParams physParams): Computes the 2nd order derivative of the angle for a given normal force sign using the corrected model. |
static State | getInitialState(Domain domain): Returns the default initial state: the cart centered on the track, not moving, with the pole perfectly vertical. |
static State | getInitialState(Domain domain, double x, double xv, double a, double av): Returns an initial state with the given initial values for the cart and pole. |
protected static double | getNormForce(double a0, double av0, double a_2, CartPoleDomain.CPPhysicsParams physParams): Computes the normal force for the corrected model. |
protected static double | getX2ndDeriv(double xv0, double a0, double av0, double n, double f, double a2, CartPoleDomain.CPPhysicsParams physParams): Returns the second order x position derivative for the corrected model. |
static void | main(java.lang.String[] args): Launches an interactive visualizer in which key 'a' applies a force in the left direction and key 'd' applies a force in the right direction. |
static State | moveClassicModel(State s, double dir, CartPoleDomain.CPPhysicsParams physParams): Simulates the physics for one time step given the input state s and the direction of force applied. |
static State | moveCorrectModel(State s, double dir, CartPoleDomain.CPPhysicsParams physParams): Simulates the physics for one time step given the input state s and the direction of force applied. |
double | setMaxCartSpeedToMaxWithMovementFromOneSideToOther(): Given the current action force, track length, and masses, sets the max cart speed to an upper bound of what is possible when moving from one side of the track to the other. |
void | setToCorrectModel(): Sets the generator to use the correct physics model by Florian. |
void | setToIncorrectClassicModel(): Sets the generator to use the classic model by Barto, Sutton, and Anderson, which has incorrect friction forces and gravity in the wrong direction. |
void | setToIncorrectClassicModelWithCorrectGravity(): Sets the generator to use the classic model by Barto, Sutton, and Anderson, which has incorrect friction forces but uses correct gravity. |

public static final java.lang.String ATTX
public static final java.lang.String ATTV
public static final java.lang.String ATTANGLE
public static final java.lang.String ATTANGLEV
public static final java.lang.String ATTNORMSGN
public static final java.lang.String CLASSCARTPOLE
public static final java.lang.String ACTIONLEFT
public static final java.lang.String ACTIONRIGHT
public CartPoleDomain.CPPhysicsParams physParams
public Domain generateDomain()
Specified by: generateDomain in interface DomainGenerator
public void setToIncorrectClassicModelWithCorrectGravity()
public void setToIncorrectClassicModel()
public void setToCorrectModel()
public double setMaxCartSpeedToMaxWithMovementFromOneSideToOther()
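The summary for this method says the bound is derived from the action force, track length, and masses, but the page does not give the formula. The sketch below shows one plausible kinematic estimate and should be read as an assumption rather than the library's calculation: it treats the cart and pole as a single mass under constant force with no friction, starting at rest at one end of the track. The class name, method name, and parameter names are hypothetical.

```java
/** Hypothetical sketch of a kinematic speed estimate for crossing the whole track. */
final class MaxCartSpeedBoundSketch {

    /**
     * Estimates the speed reached after accelerating from rest across the full track.
     * Assumes constant force, no friction, and the pole treated as dead weight.
     */
    static double maxSpeedBound(double forceMagnitude, double cartMass,
                                double poleMass, double trackLength) {
        double acceleration = forceMagnitude / (cartMass + poleMass);
        // Constant acceleration from rest over distance d gives v^2 = 2 * a * d.
        return Math.sqrt(2.0 * acceleration * trackLength);
    }
}
```

For example, calling maxSpeedBound(10.0, 1.0, 0.1, 4.8) uses illustrative values only; the track length of 4.8 is an assumption, not a value taken from this page.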
public static State getInitialState(Domain domain)
domain - the domain object to which the state will be associated.
public static State getInitialState(Domain domain, double x, double xv, double a, double av)
domain - the domain object to which the state will be associated.
x - the position of the cart.
xv - the velocity of the cart.
a - the angle between the pole and the vertical axis.
av - the angular velocity of the pole.
public static State moveClassicModel(State s, double dir, CartPoleDomain.CPPhysicsParams physParams)
s - the current state from which one time step of physics will be simulated.
dir - the direction of the applied force; should be -1 or 1, and is multiplied by the movementForceMag parameter. A value of 0 applies no force.
physParams - the CartPoleDomain.CPPhysicsParams object specifying the physics to use for movement.
public static State moveCorrectModel(State s, double dir, CartPoleDomain.CPPhysicsParams physParams)
s - the current state from which one time step of physics will be simulated.
dir - the direction of the applied force; should be -1 or 1, and is multiplied by the movementForceMag parameter. A value of 0 applies no force.
physParams - the CartPoleDomain.CPPhysicsParams object specifying the physics to use for movement.
protected static double getAngle2ndDeriv(double xv0, double a0, double av0, double nsign, double f, CartPoleDomain.CPPhysicsParams physParams)
xv0 - the cart velocity.
a0 - the pole angle.
av0 - the pole angle velocity.
nsign - the normal force sign.
f - the force applied to the cart.
physParams - the CartPoleDomain.CPPhysicsParams object specifying the physics to use for movement.
protected static double getNormForce(double a0, double av0, double a_2, CartPoleDomain.CPPhysicsParams physParams)
a0 - the pole angle.
av0 - the pole angle velocity.
a_2 - the 2nd order derivative of the pole angle.
physParams - the CartPoleDomain.CPPhysicsParams object specifying the physics to use for movement.
protected static double getX2ndDeriv(double xv0, double a0, double av0, double n, double f, double a2, CartPoleDomain.CPPhysicsParams physParams)
xv0 - the cart velocity.
a0 - the pole angle.
av0 - the pole angle velocity.
n - the normal force.
f - the force applied to the cart.
a2 - the second order angle derivative.
physParams - the CartPoleDomain.CPPhysicsParams object specifying the physics to use for movement.
public static void main(java.lang.String[] args)
args - ignored.