README extended to include new pubeval
parent d52e4a597c
commit 6709a4bb1c
21
README.org
@@ -4,12 +4,20 @@ Quack-TD is a backgammon playing algorithm based upon neural networks trained
 through TD(\lambda)-learning. The algorithm is implemented using Python 3 and
 Tensorflow.
 
+
+* Setup
+** Pubeval
+To use Pubeval for evaluation the Python module =pubeval= must first be
+installed. The necessary source files should be distributed alongside the main
+application and located in the =pubeval= directory. The installation can be done
+by entering the directory and running =python3 setup.py install= or =pip install
+.=.
 * Usage
 
 The main executable is =main.py=. Various command-line options and switches can be used to
 execute different stages and modify the behaviour of the program. All
 command-line options and switches are listed by running =main.py= with the argument
-=--help=. The three central switches are listed below:
+=--help=. The central mode-switches are listed below:
 
 - =--train=: Trains the neural network for a set amount of episodes (full games
 of backgammon) set by =--episodes= (defaults to 1,000). Summary results of the
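Review note on the Setup section: before running a pubeval evaluation it may help to confirm that the =pubeval= module is actually importable. The following is a minimal sketch and not part of the repository. The helper name =is_module_installed= is hypothetical, and the only thing taken from the README is that the module is installed under the name =pubeval=.

```python
import importlib.util


def is_module_installed(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    # find_spec returns None when no importer can locate the module.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "pubeval" is the module name given in the README's Setup section.
    if is_module_installed("pubeval"):
        print("pubeval is installed")
    else:
        print("pubeval is missing; run `pip install .` inside the pubeval directory")
```

The check only looks the module up and does not import it, so it should be safe to run even if the extension was built incorrectly.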
@@ -22,14 +30,17 @@ command-line options and switches are listed by running =main.py= with the argum
 - =--play=: Allows the user to interactively play a game of backgammon against
 the algorithm.
 
+- =--list-models=: Lists the models stored in the =models= folder.
+
 ** Evaluation methods
 
 Currently, the following evaluation methods are implemented:
 
-- =pubeval=: Evaluates against the =pubeval= backgammon benchmark developed by
-Gerald Tesauro. The source code is included in the =pubeval= directory and
-needs to be compiled before use. The binary should be placed at
-=pubeval/pubeval=.
+- =pubeval=: Evaluates against a Python extension based on the =pubeval=
+backgammon benchmark developed by Gerald Tesauro. The source code is included
+in the =pubeval= directory and needs to be installed before use. This can be
+done by running =python3 setup.py install= or =pip install .= from the source
+directory.
 - =random=: Evaluates by playing against a player that makes random moves drawn
 from the set of legal moves. Should be used with high episode counts to lower
 variance. *TODO*: Doesn't even work currently
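Review note on the Usage section: the mode switches described in the README could be declared with =argparse= roughly as sketched below. This is an illustration under stated assumptions, not the actual contents of =main.py=. The function name =build_parser= and the help strings are hypothetical, and only the flag names and the =--episodes= default of 1,000 come from the README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the switches named in the README (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--train", action="store_true",
                        help="train the neural network")
    parser.add_argument("--episodes", type=int, default=1000,
                        help="number of episodes (full games of backgammon)")
    parser.add_argument("--play", action="store_true",
                        help="play interactively against the algorithm")
    parser.add_argument("--list-models", action="store_true",
                        help="list the models stored in the models folder")
    return parser


if __name__ == "__main__":
    # Parse an example command line instead of sys.argv for demonstration.
    args = build_parser().parse_args(["--train", "--episodes", "5000"])
    print(args.train, args.episodes, args.list_models)
```

Note that =argparse= exposes =--list-models= as the attribute =list_models=, since dashes in option names are converted to underscores.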