base WDL model on material count and normalize evals dynamically #5121
Closed
robertnurnberg wants to merge 2 commits into official-stockfish:master
The output from the fitting script is
Disservin reviewed Mar 17, 2024
The lower limit of a material count of …
And here the distribution of the WDL raw data:
Looking at this plot, I guess we could use …
Based on these graphs, I would say 10 is a good choice. Extending too far towards small material counts impacts the quality of the fit for the more relevant material counts.
linrock pushed a commit to linrock/Stockfish that referenced this pull request Mar 27, 2024
This PR proposes to change the parameter dependence of Stockfish's internal WDL model from full move counter to material count. In addition it ensures that an evaluation of 100 centipawns always corresponds to a 50% win probability at fishtest LTC, whereas for master this holds only at move number 32. See also official-stockfish#4920 and the discussion therein.

The new model was fitted based on about 340M positions extracted from 5.6M fishtest LTC games from the last three weeks, involving SF versions from e67cc97 (SF 16.1) to current master.

The involved commands for [WDL_model](https://github.com/official-stockfish/WDL_model) are:

```
./updateWDL.sh --firstrev e67cc97
python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability
```

The anchor `58` for the material count value was chosen to be as close as possible to the observed average material count of fishtest LTC games at move 32 (`43`), while not changing the value of `NormalizeToPawnValue` compared to the move-based WDL model by more than 1.

The patch only affects the displayed cp and wdl values.

closes official-stockfish#5121

No functional change
This PR proposes to change the parameter dependence of Stockfish's internal WDL model from full move counter to material count. In addition it ensures that an evaluation of 100 centipawns always corresponds to a 50% win probability at fishtest LTC, whereas for master this holds only at move number 32. See also #4920 and the discussion therein.
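To illustrate the idea in the paragraph above, here is a minimal Python sketch of a logistic win-rate model whose parameters depend on the material count. It assumes the usual form `1 / (1 + exp((a - v) / b))` with `a` and `b` given by cubic polynomials in the rescaled material count. The coefficients `AS` and `BS` are made-up placeholders, not the fitted values from this PR, and the helper names, the upper clamp of 78 and the rounding are likewise assumptions for illustration.

```python
import math

# Placeholder coefficients for illustration only (NOT the fitted values).
# Ordered from the cubic term down to the constant term.
AS = (-150.0, 400.0, -350.0, 430.0)
BS = (-40.0, 110.0, -100.0, 110.0)


def poly(coeffs, m):
    """Evaluate a cubic polynomial with Horner's scheme."""
    c3, c2, c1, c0 = coeffs
    return ((c3 * m + c2) * m + c1) * m + c0


def win_rate_params(material, anchor=58, material_min=10, material_max=78):
    """Return (a, b) for a given material count.

    The count is clamped to [material_min, material_max] and rescaled so
    that the anchor maps to m = 1. The bounds and anchor mirror the values
    discussed in this PR; treating 78 as the upper bound is an assumption.
    """
    m = min(max(material, material_min), material_max) / anchor
    return poly(AS, m), poly(BS, m)


def win_probability(v, material):
    """Modelled win probability for internal eval v at this material count."""
    a, b = win_rate_params(material)
    return 1.0 / (1.0 + math.exp((a - v) / b))


def to_cp(v, material):
    """Normalize internal eval v so that 100 cp means a 50% win probability."""
    a, _ = win_rate_params(material)
    return round(100 * v / a)
```

In this sketch an internal eval equal to `a` gives a win probability of exactly 50% and is displayed as 100 cp, whatever the material count. A fixed normalization constant, by contrast, can only match `a` at one point of the game.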
The new model was fitted based on about 340M positions extracted from 5.6M fishtest LTC games from the last three weeks, involving SF versions from e67cc97 (SF 16.1) to current master.
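The involved commands for WDL_model are `./updateWDL.sh --firstrev e67cc97` followed by `python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability`.
The anchor `58` for the material count value was chosen to be as close as possible to the observed average material count of fishtest LTC games at move 32 (`43`), while not changing the value of `NormalizeToPawnValue` compared to the move-based WDL model by more than 1.
The patch only affects the displayed cp and wdl values.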
No functional change.
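For readers who want to see how a material count and the anchor `58` might fit together, the following Python sketch computes a material count from a FEN string and rescales it. The piece weights (pawn 1, knight 3, bishop 3, rook 5, queen 9) are an assumption not stated in this PR text. The function names and the upper bound of 78 (the count of the starting position under these weights) are hypothetical; only the lower bound of 10 and the anchor come from the discussion above.

```python
# Assumed piece weights; kings and non-piece characters count as zero.
PIECE_WEIGHTS = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_count(fen):
    """Sum the weighted material of both sides from the FEN board field."""
    board = fen.split()[0]
    return sum(PIECE_WEIGHTS.get(ch.lower(), 0) for ch in board)


def rescaled_material(fen, anchor=58, material_min=10, material_max=78):
    """Clamp the material count and divide by the anchor.

    A position with exactly `anchor` points of material maps to 1.0.
    """
    clamped = min(max(material_count(fen), material_min), material_max)
    return clamped / anchor


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
BARE_KINGS = "8/8/4k3/8/8/3K4/8/8 w - - 0 1"

print(material_count(START))          # 78
print(material_count(BARE_KINGS))     # 0
print(rescaled_material(BARE_KINGS))  # clamped to 10, then 10 / 58
```

Clamping at the lower bound means that very sparse endgames all share the model parameters of the 10-point case, which matches the concern raised in the conversation about fit quality at small material counts.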