Sarsa Lambda Fourier Basis (Java)
The Fourier Basis is a generic basis for linear function approximation in continuous-state reinforcement learning domains. It is simple, easy to use, and seems to work reliably.
In the simplest case, specifying a Fourier Basis for an n-dimensional continuous domain requires only one parameter: k, the order of the basis. Every term in which each variable's coefficient lies between 0 and k inclusive is included, covering all combinations of variables. This parameter is necessary because the Fourier Basis has infinitely many terms, so it must be truncated somewhere.
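As a rough illustration of what a single basis function looks like, here is a minimal sketch assuming the form given in the Konidaris et al. paper, phi_c(x) = cos(pi * c . x), where x has been scaled to [0, 1]^n and c is a vector of integers between 0 and k. The class and method names are hypothetical and are not the ones used in this download.

```java
// Hypothetical illustration; names are not taken from this download.
final class FourierFeatureSketch {

    // phi_c(x) = cos(pi * (c . x)), with x already scaled to [0, 1]^n.
    static double feature(int[] c, double[] x) {
        double dot = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += c[i] * x[i];
        }
        return Math.cos(Math.PI * dot);
    }

    // Linear value estimate: sum over j of w[j] * phi_{c_j}(x).
    static double value(double[] w, int[][] coefficients, double[] x) {
        double v = 0.0;
        for (int j = 0; j < coefficients.length; j++) {
            v += w[j] * feature(coefficients[j], x);
        }
        return v;
    }

    public static void main(String[] args) {
        // Order k = 1 in n = 2 dimensions gives (k + 1)^n = 4 coefficient vectors.
        int[][] coefficients = { {0, 0}, {1, 0}, {0, 1}, {1, 1} };
        double[] x = {0.5, 1.0};
        for (int[] c : coefficients) {
            System.out.println(c[0] + "," + c[1] + " -> " + feature(c, x));
        }
    }
}
```

The all-zero coefficient vector always yields the constant feature 1, which plays the role of a bias term in the linear value estimate.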
Since the number of basis functions scales exponentially with the number of state variables n (as is true for all fixed bases), in high-dimensional problems the basis function coefficient ranges will probably need to be constrained. This code includes a FourierCoefficientGenerator class which can be used to write constrained coefficient generators. Two example derived classes are included: FullFourierCoefficientGenerator (which generates all coefficients) and IndependentFourierCoefficientGenerator (which only generates terms using individual variables). Both classes take k as a parameter. The Fourier Basis is also well suited as a fixed basis for use in feature selection, since it has infinitely many basis functions.
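To make the difference between the two generators concrete, the following sketch enumerates coefficient vectors both ways and compares the counts. It is written under assumptions: the class and method names are hypothetical, the real FourierCoefficientGenerator interface may look different, and whether the independent generator in this download includes the constant (all-zero) term is also assumed here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the download's FourierCoefficientGenerator API may differ.
final class CoefficientSketch {

    // All vectors in {0..k}^n: (k + 1)^n terms.
    static List<int[]> full(int n, int k) {
        List<int[]> out = new ArrayList<>();
        int[] c = new int[n];
        while (true) {
            out.add(c.clone());
            // Odometer-style increment in base (k + 1).
            int i = 0;
            while (i < n && c[i] == k) {
                c[i] = 0;
                i++;
            }
            if (i == n) {
                return out; // wrapped around: every vector has been emitted
            }
            c[i]++;
        }
    }

    // Constant term plus terms that involve a single variable: n * k + 1 terms.
    static List<int[]> independent(int n, int k) {
        List<int[]> out = new ArrayList<>();
        out.add(new int[n]); // constant term (all zeros), assumed to be included
        for (int i = 0; i < n; i++) {
            for (int order = 1; order <= k; order++) {
                int[] c = new int[n];
                c[i] = order;
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 4, k = 3;
        System.out.println("full: " + full(n, k).size());               // 256
        System.out.println("independent: " + independent(n, k).size()); // 13
    }
}
```

With n = 4 and k = 3 the full set already has 256 terms against 13 for the independent set, which is why constraining the coefficients matters as n grows.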
The Fourier Basis is described in the following paper:
- G.D. Konidaris, S. Osentoski and P.S. Thomas. Value Function Approximation in Reinforcement Learning using the Fourier Basis. In Proceedings of the Twenty-Fifth Conference on Artificial Intelligence, pages 380-385, August 2011.
Please note that, as of October 2012, the code has been updated to include Dabney and Barto's method for automatically setting alpha. This is detailed in the following paper:
- W. Dabney and A.G. Barto. Adaptive Step-Size for Online Temporal Difference Learning. In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence, July 2012.
To disable the alpha bounds method and use a standard fixed alpha instead, set the "auto-alpha" parameter to "false" in EpsilonGreedyFourierBasisSarsaLambda.java, and set the "alpha" parameter to whichever value of alpha you would like to use.
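For readers curious about what the alpha bounds method does, here is a minimal sketch. It assumes the step-size upper bound as I understand it from the Dabney and Barto paper, alpha = min(alpha, 1 / |e . (gamma * phi' - phi)|), where e is the eligibility trace vector and phi, phi' are the feature vectors of the current and next state-action pair. The names are hypothetical and this is not the code in EpsilonGreedyFourierBasisSarsaLambda.java, which may apply the bound differently.

```java
// Hypothetical sketch; not the code shipped in this download.
final class AlphaBoundSketch {

    // Returns min(alpha, 1 / |e . (gamma * phiNext - phi)|).
    static double boundAlpha(double alpha, double gamma,
                             double[] traces, double[] phi, double[] phiNext) {
        double dot = 0.0;
        for (int i = 0; i < traces.length; i++) {
            dot += traces[i] * (gamma * phiNext[i] - phi[i]);
        }
        double magnitude = Math.abs(dot);
        if (magnitude == 0.0) {
            return alpha; // this transition places no constraint on alpha
        }
        return Math.min(alpha, 1.0 / magnitude);
    }

    public static void main(String[] args) {
        double alpha = 1.0; // start high and let the bound shrink it
        double[] traces = {2.0, 0.0};
        double[] phi = {1.0, 0.0};
        double[] phiNext = {0.0, 0.0};
        alpha = boundAlpha(alpha, 0.9, traces, phi, phiNext);
        System.out.println("alpha after one transition: " + alpha);
    }
}
```

Because the bound only ever lowers alpha, the step size is non-increasing over the course of learning in this sketch.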
Here are example learning curves for the Fourier Basis, using Dabney's auto-alpha, for various orders in Mountain Car and Acrobot:
State Representation/Function Approximation
- State variables should be continuous. The function approximation code scales them to between 0 and 1 internally.
- This implementation handles only discrete actions.
- RL-Viz Compatible
- Value function visualizable
- Language: Java
- License: Apache 2.0
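The internal scaling of state variables mentioned in the list above can be pictured with the following sketch. It assumes each variable's minimum and maximum are known in advance (for example from the environment's task specification); the names are hypothetical, and the download's own scaling code may handle edge cases differently.

```java
// Hypothetical sketch of min-max scaling; not the code shipped in this download.
final class StateScalingSketch {

    // Maps each x[i] from [min[i], max[i]] to [0, 1], clamping values outside the range.
    static double[] scale(double[] x, double[] min, double[] max) {
        double[] scaled = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double range = max[i] - min[i];
            // A degenerate range carries no information; map it to 0.
            double s = (range == 0.0) ? 0.0 : (x[i] - min[i]) / range;
            scaled[i] = Math.max(0.0, Math.min(1.0, s));
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Illustrative ranges only: a position in [-1.2, 0.6] and a velocity in [-0.07, 0.07].
        double[] min = {-1.2, -0.07};
        double[] max = {0.6, 0.07};
        double[] scaled = scale(new double[]{-0.3, 0.0}, min, max);
        System.out.println(scaled[0] + ", " + scaled[1]);
    }
}
```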
Download and Installation Instructions
This download contains the source code that can be used to change/rebuild the project as well as a pre-built JAR file that can be used immediately.
Using This Download
Before diving into this, you may want to check out the getting started guide.
This download can be used to augment your existing local RL-Library (if you have one), or as the basis to start a new one.
This Is Your First Project
#Create a directory for your rl-library. Call it whatever you like.
mkdir rl-library
That is the only extra step needed the first time you download an rl-library component. Now continue to the next section.
Adding To An Existing RL-Library Download
#First, download the file. Depending on your platform, you might have to do this manually with a web browser.
#If you are on Linux, you can use wget, which will download FourierSarsaAgent-Java-R1337.tar.gz for you
wget http://rl-library.googlecode.com/files/FourierSarsaAgent-Java-R1337.tar.gz
#Copy the download to your local rl-library folder (whatever it is called)
cp FourierSarsaAgent-Java-R1337.tar.gz rl-library/
cd rl-library
#This will add any project-specific things necessary to system and products folders
#It will also create a folder for this particular project
tar -zxf FourierSarsaAgent-Java-R1337.tar.gz
#Clean up
rm FourierSarsaAgent-Java-R1337.tar.gz
After this step is completed, you will have several new files:
Compiling This Project
You must have Apache Ant installed to build this project using these instructions.
You don't have to compile this project, because the JAR file has been compiled and placed into the products directory already. However, if you want to make changes and recompile, type:
cd FourierSarsaAgent-Java-R1334
ant clean
#this will update ../products/fourierBasisAgentLib.jar
ant build
Running This Project
You can run this project by typing:
java -jar products/fourierBasisAgentLib.jar
#or from within the project's directory
cd FourierSarsaAgent-Java-R1334
ant run
You can also use it with RL-Viz by putting the JAR file products/fourierBasisAgentLib.jar in the appropriate directory. The RL-Viz library JAR must sit at the following location relative to wherever you put fourierBasisAgentLib.jar: ../system/common/libs/rl-viz/RLVizLib.jar
Please note that if learning diverges this agent will print an error message ("Function approximation divergence (SarsaLambdaFA)") and exit. It may not always be obvious that this has happened, so I recommend running all of the services (RL-Glue, agent, environment, and experiment) in separate terminals so that you can clearly see their output.
Please send all questions to either the current maintainer (below) or to the RL-Library mailing list.