A modified version of Real-time Dynamic Programming (RTDP) in which a breadth-first-search-like pass first seeds the value function, after which planning continues with the usual RTDP rollouts.
An implementation of Bounded RTDP.
A tuple class holding a hashed state and the expected value function margin/gap of the source transition.
An implementation of Real-time Dynamic Programming (RTDP).
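To make the rollout idea behind these planners concrete, here is a minimal, self-contained sketch of RTDP on a toy five-state chain. It does not use this library's classes: `RtdpSketch`, the chain layout, the 0.8 success probability, and the -1 step reward are all assumptions made up for illustration. Each rollout starts at state 0, performs a Bellman backup on the current state, follows the greedy action, and samples the next state until the goal is reached or a depth limit is hit.

```java
import java.util.Random;

// Illustrative only: a toy chain MDP, not the library's domain or planner API.
final class RtdpSketch {
    static final int N = 5;              // states 0..4; state 4 is the terminal goal
    static final double SUCCESS = 0.8;   // a move succeeds with this probability, else the agent stays
    static final int[] MOVES = {+1, -1}; // right, left (ties go to the first entry)

    // State the move aims for, clipped to the ends of the chain.
    static int target(int s, int move) {
        return Math.max(0, Math.min(N - 1, s + move));
    }

    // Undiscounted Q-value of a move under the current value estimates, with -1 reward per step.
    static double q(double[] v, int s, int move) {
        return -1.0 + SUCCESS * v[target(s, move)] + (1.0 - SUCCESS) * v[s];
    }

    static double[] run(int rollouts, int maxDepth, long seed) {
        Random rng = new Random(seed);
        double[] v = new double[N]; // zero initialisation is optimistic because all rewards are negative
        for (int i = 0; i < rollouts; i++) {
            int s = 0; // every rollout starts from the initial state
            for (int d = 0; d < maxDepth && s != N - 1; d++) {
                int best = MOVES[0];
                for (int m : MOVES) {
                    if (q(v, s, m) > q(v, s, best)) {
                        best = m;
                    }
                }
                v[s] = q(v, s, best); // Bellman backup on the visited state only
                if (rng.nextDouble() < SUCCESS) {
                    s = target(s, best); // sample the outcome of the greedy action
                }
            }
        }
        return v;
    }

    public static void main(String[] args) {
        double[] v = run(2000, 100, 42L);
        for (int s = 0; s < N; s++) {
            System.out.println("V(" + s + ") = " + v[s]);
        }
    }
}
```

In this toy problem each step toward the goal costs 1.25 expected moves, so the estimates should approach -1.25 × (4 − s) for state s. Only states actually visited by rollouts are backed up, which is the property that distinguishes RTDP from a full value iteration sweep.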
The different ways that states can be selected for expansion.
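As a rough illustration of what such selection modes can look like, the sketch below chooses the next state of a rollout from a set of successor candidates, each carrying a transition probability and the gap between its upper and lower value bounds. The names `GapSelectionSketch`, `Mode`, `Candidate`, and the three mode constants are hypothetical and are not taken from this library; the weighting by probability times gap follows the general Bounded RTDP idea of steering rollouts toward successors whose value is still uncertain.

```java
import java.util.Random;

// Illustrative only: hypothetical names, not the library's enum or tuple class.
final class GapSelectionSketch {
    enum Mode { MODEL_BASED, WEIGHTED_GAP, MAX_GAP }

    // A successor state with its transition probability and bound gap (upper minus lower).
    static final class Candidate {
        final int state;
        final double prob;
        final double gap;

        Candidate(int state, double prob, double gap) {
            this.state = state;
            this.prob = prob;
            this.gap = gap;
        }

        double expectedGap() {
            return prob * gap;
        }
    }

    // Sampling weight of a candidate: plain probability, or probability times gap.
    static double weight(Candidate c, Mode mode) {
        return mode == Mode.MODEL_BASED ? c.prob : c.expectedGap();
    }

    static int select(Candidate[] cs, Mode mode, Random rng) {
        if (mode == Mode.MAX_GAP) {
            // Deterministic: take the successor with the largest expected gap.
            Candidate best = cs[0];
            for (Candidate c : cs) {
                if (c.expectedGap() > best.expectedGap()) {
                    best = c;
                }
            }
            return best.state;
        }
        // Roulette-wheel sampling in proportion to the mode's weight.
        double total = 0.0;
        for (Candidate c : cs) {
            total += weight(c, mode);
        }
        double r = rng.nextDouble() * total;
        for (Candidate c : cs) {
            r -= weight(c, mode);
            if (r < 0.0) {
                return c.state;
            }
        }
        return cs[cs.length - 1].state; // fallback when all weights are zero or on rounding error
    }
}
```

Sampling by the transition model alone reproduces ordinary RTDP rollouts, while the gap-based modes spend more rollouts on successors whose bounds have not yet closed.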