-
Notifications
You must be signed in to change notification settings - Fork 4.6k
Theme: optimize seed adjustment #73004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going on with the line additions in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we've expanded the tests, but it still feels a bit excessive. Are we drawing much value from the snapshots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that there is a lot of useful information in these tests, it's just that snapshots are not a good format for them, as they are too big and the useful information is hard to see. We should perform a set of checks on the output instead of just a snapshot diff.
Some valuable things you can see in the snapshot diff:
- some colors now have
warning: trueinstead ofwarning: false. I fixed some as they were caused by a bug in the algorithm, but there are a few remaining. I need to check if it's just a rounding error (overflowing by less thanCONTRAST_EPSILONshould be fine) or another bug. There should beexpectcalls that verifywarning === falseexplicitly. - after algorithm change, colors often change. There is a harmless kind of change just by one point, like from
#a6a6a6to#a7a7a7. The algorithms are inherently approximate, they iterate towards the precise value and stop when reaching certain precision. On the other hand, I'd like to be able to flag big changes, these are unwanted. Not sure yet how to do that. - instead of snapshotting the precise colors, we should check if the generated ramps really satisfy the contrast requirements. We have a
checkAccessibleCombinationsfunction for that, but it's now used only in Storybook. - many colors have changed from something almost white or black, like
#fefefeor#010000, to pure#fffand#000. That's a sign that the patch improved something 🙂 These colors are typically the extreme ones, they have the biggest contrast against the seed color from the entire ramp.fgSurface4is one of them. And it's right that we managed to push them all the way to black/white. When the algorithm used to stop just one point short, it means that it did some rounding or whatever incorrectly.
So, I agree that the current snapshots are excessive, and would like to find a way to convert the above points to expect statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- some colors now have
warning: trueinstead ofwarning: false. I fixed some as they were caused by a bug in the algorithm, but there are a few remaining. I need to check if it's just a rounding error (overflowing by less thanCONTRAST_EPSILONshould be fine) or another bug. There should beexpectcalls that verifywarning === falseexplicitly.
This makes sense to me that we should be asserting directly. I'd personally worry that people wouldn't be paying attention to this in snapshot changes.
- after algorithm change, colors often change. There is a harmless kind of change just by one point, like from
#a6a6a6to#a7a7a7. The algorithms are inherently approximate, they iterate towards the precise value and stop when reaching certain precision. On the other hand, I'd like to be able to flag big changes, these are unwanted. Not sure yet how to do that.
These changes should also manifest through updates in the prebuilt artifacts, right? I think we should also have a test that makes sure that those are in sync.
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
0a042d6 to
e39c78d
Compare
|
Flaky tests detected in e136698. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/19130645907
|
| } as const; | ||
|
|
||
| // Lightness precision we aim for. Approximately 1/256, resolution of an 8-bit number. | ||
| export const LIGHTNESS_EPSILON = 4e-3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming this constant because it's always used to check contrast values. Contrast between two colors, or a contrast "deficit" of a color pair.
| { | ||
| lightnessConstraint, | ||
| taperChromaOptions, | ||
| strict = true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the strict param because it was never used. If true, it caused an exception to be thrown if a matching color can't be found. But we don't need that, we return a lot of information about why a match wasn't found: achieved, reached, deficit, ...
| // Set the boundary based on the direction. | ||
| const mostContrastingL = direction === 'lighter' ? 1 : 0; | ||
| const mostContrastingColor = direction === 'lighter' ? WHITE : BLACK; | ||
| const highestContrast = getContrast( reference, mostContrastingColor ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this block few lines up because we use the highestContrast value to calculate the deficit return field.
| // If seed has low contrast vs reference, adjusting seed has high impact | ||
| // If seed has high contrast vs reference, adjusting seed has low impact | ||
| const impactWeight = 1 / getContrast( seed, referenceColor ); | ||
| const weightedDeficit = deficitVsTarget * impactWeight; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code, weighing the contrast by dividing by the seed-vs-reference contrast, seems wrong to me.
First, seed adjustment helps all constraints equally. If the seed luminance (Y) is halved, the contrast between reference and target approximately doubles. This is universal, no matter where the intermediate "reference" value is.
The right way to measure the deficit in "same units" is to do the opposite: multiply the deficit with seed-vs-reference contrast instead of dividing it. Then the result is always "contrast between seed and target", independent from the intermediate steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deficit calculation has been completely moved to findColorMeetingRequirements.
| let betterSeedL = UNSATISFIED_DIRECTION === 'lighter' ? 0 : 1; | ||
| let betterSeedL = MAX_DEFICIT_DIRECTION === 'lighter' ? 0 : 1; | ||
| let betterDeficit = -MAX_DEFICIT; | ||
| let betterReplaced = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same "Illinois bisection" as used in findColorMeetingRequirements. If we continue to use it (I'm not sure about it) I'll abstract it into a common function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about it
What are your reservations about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today I implemented another optimization that, when adjusting the seed, recomputes only the few "critical" colors instead of the entire ramp. Yesterday, when it was still only in my head, I wasn't yet sure if there will be really two bisection loops that are similar to each other. It turns out there are, and I no longer have the reservations I had.
Before merging this PR, I'll extract the "solve by bisection" function to an util and also address the other minor feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is now a solveWithBisect helper that's used at two places.
| betterSeedL = get( newSeed, [ OKLCH, 'l' ] ); | ||
| // Only update toReturn when the ramp satisfies all constraints. | ||
| toReturn.ramp = iterationResults.rampResults; | ||
| } else if ( UNSATISFIED_DIRECTION !== mainDir ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mreishus recently pointed out that this comparison might be wrong. It's true, we want to compare:
iterationResults.MAX_DEFICIT_DIRECTION === MAX_DEFICIT_DIRECTION
The MAX_DEFICIT_DIRECTION (formerly UNSATISFIED_DIRECTION) value specifies in which direction did the first iteration fail to satisfy contrast constraint. And we want to lighten/darken the seed a little bit. But when doing the bisection, it can happen that whe move the seed too much in the opposite direction, and now it can't find contrastful colors in the opposite direction.
In that case, we want to ignore the deficit value because it's not related to our calculation. It's better hardcoded to -MAX_DEFICIT.
|
I pushed one more optimization that has a nice impact. When adjusting the seed, don't recalculate the entire ramp on each iteration, but calculate only the most offending color (typically The improvement is from the original 3700ms to just 1900ms, that's 50% 🚀 |
aduth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate the explanations, as a lot of this is hard to wrap my head around 😅 From my cursory understanding, this looks good to me, and I'm also judging by lack of any changes in the precompiled build artifacts that this is an effective refactoring.
| } ) | ||
| ).toMatchSnapshot(); | ||
| } ); | ||
| }, 10000 ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to do something like it.each to avoid increasing the timeout because we're testing so much in the single test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's consider it.each in a follow-up. At this moment I don't want to do it because:
- We would lose the benchmark use case of the test
- I still hope that we can get way below the 1900ms time and remove the need for extra timeout completely. We're already quite well below the default 5s limit.
| if ( ! contrast.ignoreWhenAdjustingSeed ) { | ||
| if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we combine these into a single if ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combined into a single if.
| let MAX_DEFICIT = -Infinity; | ||
| let MAX_DEFICIT_DIRECTION: RampDirection = 'lighter'; | ||
| let MAX_DEFICIT_STEP = 'none'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but maybe while we're already here: These are not constants (both in the fact that they're function specific and we reassign their value), so I don't think they should be capitalized like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, changed the names to normal maxDeficit-like ones.
| }; | ||
| } | ||
|
|
||
| // Return minimal set of steps that are needed to calculate `stepName` from the seed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: For this to be associated in autocomplete, it should be TSDoc-formatted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a full tsdoc block for the function.
| addDeps( stepConfig.contrast.reference ); | ||
| if ( stepConfig.sameAsIfPossible ) { | ||
| addDeps( stepConfig.sameAsIfPossible ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible this could get into a loop? Should we also check in the condition above whether result already contains step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do a simple loop because we are recursing twice, once for contrast.reference and once for sameAsIfPossible. The loop would need to keep some backtracking stack. In practice the recursion is never very deep, at most 3 levels.
Should we also check in the condition above whether
resultalready containsstep?
result is a set, so adding step twice is already harmless. But I added additional check to prevent repeated recursion into an already handled step value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resultis a set, so addingsteptwice is already harmless. But I added additional check to prevent repeated recursion into an already handledstepvalue.
The recursion was what I was worried about, i should have said "infinite loop" to be clearer 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I thought you're asking about converting the function from recursion to a for loop 🙂 Loop is now prevented by checking if a given step is already in result. And our ramp configs don't have circular refs anyway 🙂
d2bb404 to
1559ddc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible this broke some tests on trunk? Funny enough, they pass locally, they only fail in CI.
Yes, it's very strange, I also don't see any failures locally. It's like if different machines did some floating-point calculations differently. |
|
Disabling the tests in #73122. |
|
This is the diff in the snapshot: It shows that the actual generated colors are still the same. It's the seed color that's wrong. The original seed color for this failing ramp is The "computed" seed color is the same color after some transformations, namely converting first to OKLCH and only then to RGB. After the extra OKLCH transformation the RGB coordinates are very slightly different: This 1e-14 difference is enough to change the hex string from Note that the CI run was doing it right: it produced the same string for both cases. It's our local macs that do the "wrong" calculation. I remember that @aduth and @mirka also ran into some floating-point issues when originally creating the |
|
To make the snapshots stable, we'd need to stop using the hex strings and use the For the RGB hex string, the 0.3 value is multiplied 255 and the result is 76.5, exactly between 76 and 77. The 76.5 value is rounded up, but the 76.49999 one is rounded down. |
|
A couple thoughts at a glance:
Changing to another format could make sense if it's an objectively better format to represent the colors accurately. |
|
The For the hex format, the precision is determined by the format itself. It's 1/256 and can't be anything else. And the way the rounding is performed is very unstable. 76.5 is rounded up to 77 (4d), but subtracting just an infinitesimal number leads to 76.49999 and gets rounded down to 76 (4c). For floating point rounding, it can't happen that a tiny change at 1e-15 level produces a jump from 76 to 75. For our purposes, where we generate colors algorithmically, the hex format with its coarse rounding is not very good. One hex step corresponds to quite a big contrast difference (1.008 to 1.016) and we need to take into account that this rounding is happening. When aiming for the standard-required 4.5 contrast, we need to be aware that it can be rounded to 4.48 and adjust the calculation accordingly. Specifying the colors as floats is also more "natural" for the OS and the hardware. The calculations are done in floating point. Fancy HDR displays have 10 bits per channel, while hex RGB has just 8 bits. |
It's not a machine problem, but a Node version problem. Likely a V8 bug that manifests in Node 24 (according to spec, it's supposed to adhere to IEEE 754 and not differ across implementations). Probably reportable if we isolate a repro. |
In this case, it's something else than Node.js version. Here is a minimal script that demonstrates the issue. const { parse, serialize, to, OKLCH, sRGB, ColorSpace } = require( 'colorjs.io/fn' );
ColorSpace.register( OKLCH );
ColorSpace.register( sRGB );
const rgb = parse( 'rgb(70% 30% 30%)' );
const ok = to( rgb, OKLCH );
const backRgb = to( ok, sRGB );
console.log( 'rgb', rgb.coords );
console.log( 'backRgb', backRgb.coords );
console.log( 'formatted', serialize( rgb, { format: 'hex' } ), 'vs', serialize( backRgb, { format: 'hex' } ) );After converting a color from RGB to OKLCH and back, the coordinates are different. And it has a big impact on the hex formatting. Because I tested this on Node.js 20, 22 and 24 and the output is always the same: Yet the CI failures imply that sometimes we get a different result, namely instead of the |
No, it's exactly that problem. But the output is the same across Node.js versions. |
This PR optimizes the algorithm that adjusts the theme seed color if contrast requirements can't be met. This algorithm makes the seed color slightly lighter so that it contrasts sufficiently well with black, or slightly darker for contrast with white. And finds the minimal adjustment needed.
The results can be measured with the "background ramp" unit test:
I expanded the test to include a full matrix of colors. Before this PR, the timing is 3700ms, after this PR it's 3200ms, a 14% improvement.
The PR includes a lot of bugfixes and tweaks that I will comment on individually. I also have ideas to make it even faster, but I wanted to share what I already have.