Bench: 2109681
`a7c8f545.nn` is a `(768->512)x2->1` network using SCReLU activation. It has been trained from scratch, with a completely separate lineage to the master network.

Before I begin, I'd like to thank the contributors who have at some point in the past been responsible for training the best Halogen network to date.
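As a sketch of what this architecture notation means, here is a minimal NumPy forward pass of a `(768->512)x2->1` perspective network with SCReLU. This is an illustration of the general NNUE-style layout, not Halogen's actual code; all names are hypothetical.

```python
import numpy as np

def screlu(x):
    # SCReLU: squared clipped ReLU, i.e. clamp(x, 0, 1) ** 2
    return np.clip(x, 0.0, 1.0) ** 2

def forward(stm_features, nstm_features, W, b, out_w, out_b):
    # (768->512)x2->1: the same 768->512 layer is applied to the board
    # features from each side's perspective, the two 512-wide activated
    # accumulators are concatenated, and one linear output neuron
    # produces the evaluation.
    stm_acc = screlu(W @ stm_features + b)
    nstm_acc = screlu(W @ nstm_features + b)
    hidden = np.concatenate([stm_acc, nstm_acc])  # 1024 values
    return float(out_w @ hidden + out_b)
```

The squaring in SCReLU keeps the activation smooth near zero while the clamp bounds it, which plays well with quantized inference.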
Training a network from zero for Halogen, purely through self-play reinforcement plus supervised learning, has always been a long-term goal of mine. The combined effort to achieve this began in March 2024 and represents the single greatest development effort in Halogen's history.
Training began with a novel implementation of TDLeaf(λ) reinforcement learning[^1]. The exact lineage of best networks was:

- `768-512x2-1_g917495.nn`
- `768-512x2-1_g917495.nn` trained for 8 hours to produce `768-512x2-1_e2_g1768709.nn`
- `768-512x2-1_e2_g1768709.nn` trained for 13 hours to produce `768-512x2-1_e3_g2065630.nn`

Temporal Coherence[^2] was then used to further the training process:
- `768-512x2-1_e3_g2065630.nn` trained for 16 hours to produce `768-512x2-1_e8_g1726579.nn`
- `768-512x2-1_e8_g1726579.nn` trained for 16 hours to produce `768-512x2-1_r12_g2297813.nn`

By playing out Syzygy endgames and allowing the network to learn from those positions, endgame play was improved:
- `768-512x2-1_r12_g2297813.nn` trained for 16 hours to produce `768-512x2-1_r15_g2720489.nn`

By adding 5% DFRC, the FRC performance was greatly improved:
- `768-512x2-1_r15_g2720489.nn` trained for 16 hours to produce `768-512x2-1_r17_g2894950.nn`

By filtering out openings that are wildly unbalanced (+/- 500cp), the final TDLeaf(λ) network was trained:

- `768-512x2-1_r17_g2894950.nn` trained for 16 hours to produce `768-512x2-1_r18_g2771316.nn`

The final TDLeaf(λ) network scored `Elo | -182.26 +- 8.45 (95%)` compared to master. At this point I switched to supervised learning, using the bullet trainer[^3].
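For reference, the TDLeaf(λ) update driving the reinforcement phase above can be sketched as follows. This is a generic illustration of the update rule from the original paper, applied to principal-variation leaf evaluations; it is not Halogen's implementation, and all names are hypothetical.

```python
import numpy as np

def tdleaf_update(leaf_values, leaf_grads, alpha=0.01, lam=0.7):
    # TDLeaf(lambda): for each position t in a self-play game, evaluate
    # the *leaf* of the principal variation (not the root), form the
    # temporal differences d_j = V(leaf_{j+1}) - V(leaf_j), and weight
    # each future difference by lambda^(j - t).
    # leaf_values: V(leaf_t) for t = 0..N      -> shape (N+1,)
    # leaf_grads:  dV/dw at each PV leaf       -> shape (N+1, n_weights)
    d = np.diff(leaf_values)  # d_j for j = 0..N-1
    n = len(d)
    delta_w = np.zeros(leaf_grads.shape[1])
    for t in range(n):
        discounted = sum(lam ** (j - t) * d[j] for j in range(t, n))
        delta_w += leaf_grads[t] * discounted
    return alpha * delta_w
```

With `lam=0` this reduces to one-step TD on leaf values; with `lam=1` every later evaluation swing (and ultimately the game result) is credited back to each position equally.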
`bullet_r10_768-512x2-1-epoch100.bin`: Bullet parameters

`bullet_r17_768-512x2-1_e50.nn`: Bullet parameters
[^1]: TDLeaf(λ): Combining Temporal Difference Learning with Game-Tree Search.
[^2]: Temporal Coherence and Prediction Decay in TD Learning.
[^3]: https://github.com/jw1912/bullet