diff --git a/README.org b/README.org
index 53f40e8..1ccb291 100644
--- a/README.org
+++ b/README.org
@@ -4,12 +4,20 @@ Quack-TD is a backgammon playing algorithm based upon neural networks trained
 through TD(\lambda)-learning. The algorithm is implemented using Python 3 and
 Tensorflow.
 
-* Usage
+* Setup
+** Pubeval
+To use Pubeval for evaluation the Python module =pubeval= must first be
+installed. The necessary source files should be distributed alongside the main
+application and located in the =pubeval= directory. The installation can be done
+by entering the directory and running =python3 setup.py install= or =pip install
+.=.
 
+* Usage
+
 The main executable is =main.py=. Various command-line options and switches can
 be used to execute different stages and modify the behaviour of the program. All
 command-line options and switches are listed by running =main.py= with the argument
-=--help=. The three central switches are listed below:
+=--help=. The central mode-switches are listed below:
 
 - =--train=: Trains the neural network for a set amount of episodes (full games
   of backgammon) set by =--episodes= (defaults to 1,000). Summary results of the
@@ -22,14 +30,17 @@ command-line options and switches are listed by running =main.py= with the argum
 - =--play=: Allows the user to interactively play a game of backgammon against
   the algorithm.
 
+- =--list-models=: Lists the models stored on in the =models= folder.
+
 ** Evaluation methods
 
 Currently, the following evaluation methods are implemented:
 
-- =pubeval=: Evaluates against the =pubeval= backgammon benchmark developed by
-  Gerald Tesauro. The source code is included in the =pubeval= directory and
-  needs to be compiled before use. The binary should be placed at
-  =pubeval/pubeval=.
+- =pubeval=: Evaluates against a Python extension based on the =pubeval=
+  backgammon benchmark developed by Gerald Tesauro. The source code is included
+  in the =pubeval= directory and needs to be installed before use. This can be
+  done by running =python3 setup.py install= or =pip install .= from the source
+  directory.
 - =random=: Evaluates by playing against a player that makes random moves drawn
   from the set of legal moves. Should be used with high episode counts to lower
   variance. *TODO*: Doesn't even work currently