add README for project

2018-03-11 12:59:57 +01:00 · 2018-03-11 12:59:57 +01:00 · ea83f43085
commit ea83f43085
parent 49966cd207
2 changed files with 85 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -165,3 +165,6 @@ venv.bak/


 # End of https://www.gitignore.io/api/emacs,python
+
+README.*
+!README.org
--- a/README.org
+++ b/README.org
@@ -0,0 +1,82 @@
+* Quack-TD
+
+Quack-TD is a backgammon-playing algorithm based on neural networks trained
+through TD(\lambda)-learning.
+
+** Usage
+
+The main executable is =main.py=. Various command-line options and switches can be used to
+execute different stages and modify the behaviour of the program. All
+command-line options and switches are listed by running =main.py= with the argument
+=--help=. The three central switches are listed below:
+
+- =--train=: Trains the neural network for the number of episodes (full games
+  of backgammon) set by =--episodes= (defaults to 1,000).
+
+- =--eval=: Evaluates the neural network using the methods specified by
+  =--eval-methods= for the number of episodes set by =--episodes= (defaults to
+  1,000).
+
+- =--play=: Allows the user to interactively play a game of backgammon against
+  the algorithm.
+
+** Model storage format
+
+Models are stored in the directory =models=. If no model is specified with the
+=--model= option, the model is stored in the =models/default=
+directory. Otherwise, the model is stored in =models/$MODEL=.
+
+*** Files
+
+Along with the TensorFlow checkpoint files in the directory, the following files
+are stored:
+
+- =model.episodes=: The number of training episodes performed with the
+  model.
+- =logs/eval.log=: Log of all completed evaluations performed on the model. The
+  format of this file is specified in [[Log format]].
+- =logs/train.log=: Log of all completed training sessions performed on the
+  model. If a training session is aborted before the pre-specified episode
+  target is reached, nothing will be written to this file, although
+  =model.episodes= will be updated every time the model is saved to disk. The
+  format of this file is specified in [[Log format]].
+
+*** Log format
+
+The evaluation and training log files (=logs/eval.log= and =logs/train.log=
+respectively) are CSV-formatted files with the structure described below. Both
+files have semicolon-separated columns (=;=) and newline-separated rows (=\n=).
+
+**** Evaluation log (=eval.log=)
+
+Columns are written in the following order:
+
+- =time=: Unix time (Epoch time) timestamp in local time (TODO: should be UTC
+  instead?) describing when the evaluation was finished.
+- =method=: Short string describing the method used for evaluation.
+- =trained_eps=: Number of episodes trained with the model before evaluation.
+- =count=: Number of episodes used for evaluation.
+- =sum=: Sum of outcomes of the games played during evaluation. Outcomes are
+  integers in the range of -2 to 2. A sum of 0 indicates that the evaluated
+  algorithm scored neutrally. (TODO: Is this true?)
+- =mean=: Mean of outcomes of the games played during evaluation. Outcomes are
+  integers in the range of -2 to 2. A mean of 0 indicates that the evaluated
+  algorithm scored neutrally. (TODO: Is this true?)
+
+TODO: Add example of log row
+
+**** Training log (=train.log=)
+
+Columns are written in the following order:
+
+- =time=: Unix time (Epoch time) timestamp in local time (TODO: should be UTC
+  instead?) describing when the training session was finished.
+- =trained_eps=: Number of episodes trained with the model /after/ the training
+  session.
+- =count=: Number of episodes used for training.
+- =sum=: Sum of outcomes of the games played during training. Outcomes are
+  integers in the range of -2 to 2. A sum of 0 indicates that the trained
+  algorithm scored neutrally. (TODO: Is this true?)
+- =mean=: Mean of outcomes of the games played during training. Outcomes are
+  integers in the range of -2 to 2. A mean of 0 indicates that the trained
+  algorithm scored neutrally. (TODO: Is this true?)
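
The log format described in the README above could be read back with something like the following Python sketch. It is not part of this commit. It assumes the files have no header row, that `time`, `sum` and `mean` parse as floats, and that `trained_eps` and `count` parse as integers; the README does not state any of this. The function name `read_log` and the sample row are hypothetical and only illustrate the column order.

```python
import csv
import io

# Column order as listed in the README for each log file.
EVAL_COLUMNS = ["time", "method", "trained_eps", "count", "sum", "mean"]
TRAIN_COLUMNS = ["time", "trained_eps", "count", "sum", "mean"]


def read_log(handle, columns):
    """Yield one dict per row of a semicolon-separated log file.

    Assumption: no header row; numeric types are guessed, not documented.
    """
    reader = csv.reader(handle, delimiter=";")
    for row in reader:
        if not row:
            continue  # skip blank lines, e.g. a trailing newline
        if len(row) != len(columns):
            raise ValueError(
                "expected %d columns, got %d" % (len(columns), len(row))
            )
        record = dict(zip(columns, row))
        for name in ("trained_eps", "count"):
            record[name] = int(record[name])
        for name in ("time", "sum", "mean"):
            record[name] = float(record[name])
        yield record


# Hypothetical eval.log row, invented for illustration only.
sample = "1520769597;example-method;1000;500;25;0.05\n"
rows = list(read_log(io.StringIO(sample), EVAL_COLUMNS))
print(rows[0]["method"], rows[0]["mean"])
```

For a real file, pass an open file object instead of the `io.StringIO` wrapper, for example `open("models/default/logs/train.log", newline="")` together with `TRAIN_COLUMNS`.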