The Fitted R-MAXQ algorithm is the product of Nicholas Jong's PhD thesis, "Structured Exploration for Reinforcement Learning". It combines the model-based exploration of R-MAX, function approximation based on averagers, and the MAXQ hierarchical decomposition. This combination permits aggressive but principled exploration in continuous state spaces, tempered by any available domain knowledge in the form of state and temporal abstractions.
For full details, see the dissertation.
- Language: C++
- License: Apache 2.0
Download and Installation Instructions
Using This Download
    # First, download the file. Depending on your platform, you might have to do this manually with a web browser.
    # If you are on Linux, you can use wget, which will download fitted-R-MAXQ-R1336.tar.gz for you
    wget http://rl-library.googlecode.com/files/fitted-R-MAXQ-R1336.tar.gz

    # Copy the download to your local rl-library folder
    cp fitted-R-MAXQ-R1336.tar.gz rl-library/
    cd rl-library

    # Unpack the project
    tar -zxf fitted-R-MAXQ-R1336.tar.gz

    # Clean up
    rm fitted-R-MAXQ-R1336.tar.gz
After this step is completed, the new fitted-R-MAXQ-R1336 directory will contain several files and directories:
- Agent, Environment, and Experiment: the C++ source code.
- boost: a subset of the Boost C++ Libraries, used to develop Fitted R-MAXQ.
- Makefile: facilitates compilation of the source code.
- html: source-code documentation generated by Doxygen, configured by Doxyfile.
Compiling This Project
RL-Glue and the C/C++ Codec must be installed to build Fitted R-MAXQ. If you installed RL-Glue to a custom location, you must specify the path to the header files and shared libraries in the Makefile.
    cd fitted-R-MAXQ-R1336
    make
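If RL-Glue was installed under a custom prefix, it can help to confirm where its headers actually live before editing the Makefile. The following is a minimal sketch and not part of the distribution: the function name `find_rlglue` and the `RLGLUE_PREFIX` variable are made up for illustration, and `/usr/local` is assumed to be the default prefix of a source install.

```shell
#!/bin/sh
# Sketch: report whether the RL-Glue headers appear under a given prefix.
# find_rlglue and RLGLUE_PREFIX are hypothetical names used only here.
find_rlglue() {
    prefix="$1"
    # RL_glue.h is the header that experiment programs include
    # as <rlglue/RL_glue.h>.
    if [ -f "$prefix/include/rlglue/RL_glue.h" ]; then
        echo "found: $prefix"
    else
        echo "missing: $prefix"
    fi
}

# Check a custom prefix if one is set, otherwise the assumed default.
find_rlglue "${RLGLUE_PREFIX:-/usr/local}"
```

A prefix reported as found is the one whose include and lib subdirectories the Makefile needs to reference.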
By default, the Makefile builds every available agent, environment, and experiment. The environments, given the ".env" suffix, include Taxi, Mountain Car, PuddleWorld, and FlagPuddleWorld, an environment that combines hierarchical structure and continuous state. The agents, given the ".agent" suffix, include assorted instances of Fitted R-MAXQ: some use environment-specific task hierarchies, while others, such as Fitted R-MAX, are domain-independent. The experiments, given the ".exp" suffix, provide simple examples for testing. The SingleRun experiment runs 1000 episodes and outputs the cumulative return for each one.
To recompile Fitted R-MAXQ with debugging symbols or to disable optimizations, comment or uncomment the appropriate lines near the top of the Makefile.
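Which lines to toggle depends on the Makefile shipped with this release. As a rough aid, the sketch below lists the lines that mention a debugging flag (`-g`) or an optimization flag (`-O`, `-O0` through `-O3`, `-Os`), assuming the Makefile uses the usual GCC-style spellings. The helper name `show_flag_lines` is hypothetical.

```shell
#!/bin/sh
# Sketch: print numbered Makefile lines containing -g or an -O flag.
# Assumes GCC-style flags; show_flag_lines is a made-up helper name.
show_flag_lines() {
    makefile="${1:-Makefile}"
    # Match the flag only as a separate word, so names that merely
    # contain "-g" or "-O" inside a longer token are not reported.
    grep -n -E '(^|[[:space:]=])-(g|O[0-3s]?)([[:space:]]|$)' "$makefile"
}

# Usage (from the project directory):
#   show_flag_lines Makefile
```

Commented-out lines are reported too, which makes it easy to see which alternative is currently active.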
Running This Project
After building the project, you may launch any of the resulting binaries to connect to RL-Glue as normal. For convenience, the Makefile includes a target that builds and runs the first agent, environment, and experiment listed in the Makefile.
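Launching the pieces by hand can be sketched as follows. This is an illustration under stated assumptions, not the Makefile's own run target: `run_experiment` is a hypothetical helper, `rl_glue` is assumed to be the RL-Glue core executable on your PATH, and the binary names in the usage comment are placeholders for whatever files your build produced.

```shell
#!/bin/sh
# Sketch: start the RL-Glue core plus one agent, environment, and experiment.
# run_experiment is a made-up helper; pass the binaries your build produced.
run_experiment() {
    agent="$1"; env="$2"; exp="$3"

    # Refuse to start anything unless all three binaries exist.
    for bin in "$agent" "$env" "$exp"; do
        if [ ! -x "$bin" ]; then
            echo "not built: $bin" >&2
            return 1
        fi
    done

    if ! command -v rl_glue >/dev/null 2>&1; then
        echo "rl_glue not found on PATH" >&2
        return 1
    fi

    rl_glue &      # core process that the three components connect to
    "$agent" &     # agent in the background
    "$env" &       # environment in the background
    "$exp"         # experiment in the foreground; its output goes to the terminal
    wait           # let the background processes finish
}

# Usage (placeholder file names):
#   run_experiment ./SomeAgent.agent ./SomeEnvironment.env ./SingleRun.exp
```

Checking for the binaries first avoids leaving a stray `rl_glue` process waiting for components that were never built.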
Please send all questions to either the current maintainer (below) or to the RL-Library mailing list.