README extended to include new pubeval
parent
d52e4a597c
commit
6709a4bb1c
21
README.org
@@ -4,12 +4,20 @@ Quack-TD is a backgammon playing algorithm based upon neural networks trained
 through TD(\lambda)-learning. The algorithm is implemented using Python 3 and
 TensorFlow.

+
+* Setup
+** Pubeval
+To use Pubeval for evaluation, the Python module =pubeval= must first be
+installed. The necessary source files should be distributed alongside the main
+application and located in the =pubeval= directory. The installation can be done
+by entering the directory and running =python3 setup.py install= or =pip install
+.=.
 * Usage

 The main executable is =main.py=. Various command-line options and switches can be used to
 execute different stages and modify the behaviour of the program. All
 command-line options and switches are listed by running =main.py= with the argument
-=--help=. The three central switches are listed below:
+=--help=. The central mode-switches are listed below:

 - =--train=: Trains the neural network for a number of episodes (full games
 of backgammon) set by =--episodes= (defaults to 1,000). Summary results of the
@@ -22,14 +30,17 @@ command-line options and switches are listed by running =main.py= with the argum
 - =--play=: Allows the user to interactively play a game of backgammon against
 the algorithm.

+- =--list-models=: Lists the models stored in the =models= folder.
+
 ** Evaluation methods

 Currently, the following evaluation methods are implemented:

-- =pubeval=: Evaluates against the =pubeval= backgammon benchmark developed by
-Gerald Tesauro. The source code is included in the =pubeval= directory and
-needs to be compiled before use. The binary should be placed at
-=pubeval/pubeval=.
+- =pubeval=: Evaluates against a Python extension based on the =pubeval=
+backgammon benchmark developed by Gerald Tesauro. The source code is included
+in the =pubeval= directory and needs to be installed before use. This can be
+done by running =python3 setup.py install= or =pip install .= from the source
+directory.
 - =random=: Evaluates by playing against a player that makes random moves drawn
 from the set of legal moves. Should be used with high episode counts to lower
 variance. *TODO*: Does not currently work.
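
A note on the advice to use high episode counts with =random=: as a rough sketch (not part of Quack-TD), if each evaluation game is treated as an independent win/loss trial, the standard error of the measured win rate shrinks with the square root of the episode count. The function name =win_rate_standard_error= and the 50% win rate used below are illustrative assumptions, and the model ignores gammons and any other scoring detail.

```python
import math


def win_rate_standard_error(win_rate: float, episodes: int) -> float:
    """Standard error of a win rate estimated from independent win/loss games.

    Assumption: every episode is an independent Bernoulli trial with the
    same win probability, so the estimate follows a binomial model.
    """
    if episodes <= 0:
        raise ValueError("episodes must be a positive integer")
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must lie in [0, 1]")
    return math.sqrt(win_rate * (1.0 - win_rate) / episodes)


if __name__ == "__main__":
    # Hypothetical episode counts; 0.5 is the worst case for the variance.
    for episodes in (100, 1_000, 10_000):
        se = win_rate_standard_error(0.5, episodes)
        # 1.96 standard errors gives an approximate 95% interval.
        print(f"{episodes:>6} episodes: win rate +/- {1.96 * se:.3f}")
```

Under these assumptions, going from 100 to 10,000 episodes narrows the interval by a factor of ten, which is why small evaluation runs against a random player can be misleading.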