backgammon

Author	SHA1	Message	Date
Christoffer Müller Madsen	1aa9cf705f	quack without leaks	2018-05-11 21:24:10 +02:00
Alexander Munch-Hansen	93224864a4	More comments; backprop has been somewhat tested in eager_main.py and normal_main.py.	2018-05-11 13:35:01 +02:00
Alexander Munch-Hansen	504308a9af	Yet another input argument, "--ply", 0 for no look-ahead, 1 for a single look-ahead.	2018-05-10 23:22:41 +02:00
Alexander Munch-Hansen	3b57c10b5a	Saves calling tf.reduce_mean on all values once.	2018-05-10 22:57:27 +02:00
Alexander Munch-Hansen	6131d5b5f4	Added comments for Christoffer!	2018-05-10 19:25:28 +02:00
Alexander Munch-Hansen	1aedc23de1	1-ply now works again.	2018-05-10 19:13:18 +02:00
Alexander Munch-Hansen	2d84cd5a0b	1-ply now works again.	2018-05-10 19:06:53 +02:00
Alexander Munch-Hansen	396d5b036d	All values for boards and all rolls can now be calculated	2018-05-10 18:41:21 +02:00
Alexander Munch-Hansen	4efb229d34	Added a lot of comments	2018-05-10 15:28:33 +02:00
Alexander Munch-Hansen	f2a67ca92e	All board reps should now work as input.	2018-05-10 10:49:25 +02:00
Alexander Munch-Hansen	9cfdd7e2b2	Added a verbosity flag, --verbose, which allows for printing of variables and such.	2018-05-10 10:39:22 +02:00
Alexander Munch-Hansen	6429e0732c	We should now be able to both train and eval as per usual. I've added a file "global_step", which works as the new global_step counter, so we can use it for exp_decay.	2018-05-09 23:15:35 +02:00
Alexander Munch-Hansen	cb7e7b519c	Getting closer to functionality. We're capable of evaluating moves, and a rework of global_step has begun: we now use episode_count to calculate exp_decay, which has been implemented as a function.	2018-05-09 22:22:12 +02:00
Alexander Munch-Hansen	9a2d87516e	Ongoing rewrite of network to use an eager model. We're now capable of evaluating a list of states with network.py. We can also save and restore models.	2018-05-09 00:33:05 +02:00
Alexander Munch-Hansen	ac6660e05b	Added board-rep as cli argument, to state which input-board-rep to use. Also fixed weird nesting of difference_in_values.	2018-05-06 20:52:35 +02:00
Alexander Munch-Hansen	1f8485f54e	No longer use n_ply, shit's too slow man. Added extra logging, now logs the average difference in values between trainings. Also fixed bug with the length of quack-norm. Also added cli argument use-baseline; if set, the baseline-model will be used.	2018-05-06 20:41:07 +02:00
Alexander Munch-Hansen	1db469709a	make_move now calls n_ply to search deeper and potentially give better moves. It's hella fucking slow.	2018-05-02 01:06:23 +02:00
Alexander Munch-Hansen	695a3d43db	Fixed n_ply and actually added a comma in main.py. clap Christoffer	2018-05-01 20:39:29 +02:00
Christoffer Müller Madsen	c530aa688d	flipidip	2018-05-01 13:48:42 +02:00
Alexander Munch-Hansen	3f6849048e	added network_test and some comments	2018-04-29 12:14:14 +02:00
Christoffer Müller Madsen	afa6504b05	ply again again	2018-04-26 16:49:49 +02:00
Christoffer Müller Madsen	9428a00c11	add "--force-creation" flag to force model creation	2018-04-26 11:43:19 +02:00
Pownie	48a5f6cbb6	Moved "do_ply" out of "calculate_2_ply", in an effort to be able to eventually do further plies, however some rewriting of the current "do_ply" will be needed, as described in a comment.	2018-04-26 09:42:03 +02:00
Pownie	8899c5c2d9	Fixed potential bug in regards to scores in 2-ply calculation.	2018-04-25 00:51:04 +02:00
Pownie	0509a51fd3	Added baseline model for testing	2018-04-24 22:30:58 +02:00
Pownie	349ad718f1	Moved gen_21_rolls into the 2-ply method, so it can be correctly used like the good helper method that it is	2018-04-23 00:45:31 +02:00
Pownie	e5cc54d3e0	Added a normalised version of quack	2018-04-23 00:35:25 +02:00
Pownie	160f5bd737	added some comments and removed some old code	2018-04-22 19:13:46 +02:00
Pownie	77d82f6883	Added code for 2-ply look-ahead	2018-04-22 15:07:19 +02:00
Christoffer Müller Madsen	1062b72bda	fix typo	2018-04-19 16:04:49 +02:00
Alexander Munch-Hansen	66589dfde3	fixed global step, now using exp decay	2018-04-19 16:01:19 +02:00
Alexander Munch-Hansen	cba0f67ae2	fixed the bug	2018-04-19 15:22:00 +02:00
Pownie	611f6cdba0	Changed alpha to learning_rate	2018-04-15 23:53:35 +02:00
Pownie	7d29fc02f2	Added global step + exponential decay	2018-04-14 23:11:20 +02:00
Christoffer Müller Madsen	17f5b62e9b	proper Tesauro board representation	2018-03-28 14:36:52 +02:00
Christoffer Müller Madsen	fda2c6e08d	parametric board representation in network	2018-03-28 12:00:47 +02:00
Christoffer Müller Madsen	abce56dd40	fix typo	2018-03-27 23:13:59 +00:00
alex	95b12a6c35	Added another board_rep	2018-03-28 00:33:39 +02:00
Christoffer Müller Madsen	2654006222	fix wrongful mergings	2018-03-27 13:02:36 +02:00
Christoffer Müller Madsen	c248ca0452	Merge branch 'fuck_git' into 'rework-1'. Conflicts: network.py	2018-03-27 10:15:51 +00:00
alex	f43108c239	Training using slightly revamped version of our own board rep. Not sure if it works yet.	2018-03-27 04:06:08 +02:00
alex	006f791727	Functioning network using board representation shamelessly ripped from Tesauro	2018-03-27 02:26:15 +02:00
Christoffer Müller Madsen	4c43bf19a3	Add evaluation variance benchmark. To do a benchmark for `pubeval`, run `python3 main.py --bench-eval-scores --eval-methods pubeval`. Logs will be placed in directory `bench`. Use `plot_bench(data_path)` in `plot.py` for plotting.	2018-03-26 16:45:26 +02:00
Christoffer Müller Madsen	1f1e806306	fix errant whitespace	2018-03-26 15:55:48 +02:00
Christoffer Müller Madsen	98c9af72e7	rework network	2018-03-22 15:30:47 +01:00
Alexander Munch-Hansen	b7e6dd10af	move evaluation code into network.py	2018-03-20 13:17:38 +01:00
Alexander Munch-Hansen	99783ee4f8	clean up and move things to network.py	2018-03-20 13:03:21 +01:00
Christoffer Müller Madsen	2fc7a2a09c	fixed dumb bugs; still messy	2018-03-14 20:42:09 +01:00
Christoffer Müller Madsen	55898d0e66	renaming parameters	2018-03-12 00:11:55 +01:00
Christoffer Müller Madsen	9bc1a8ba9f	save and restore number of trained episodes	2018-03-10 00:22:20 +01:00

61 Commits