Back off between attempts to start the tests by zadjii-msft · Pull Request #15106 · microsoft/terminal

zadjii-msft · 2023-04-04T16:17:42Z

This is a test PR for kicking the CI.

Looking through this test, I seriously don't understand how this doesn't work. I mean, I don't really get how it does work, but at this point in the tests, we've actually established that both Nihilist.exe and openconsole are running. From my read, there's no reason these should be failing at this point.

We previously added a "retry 5 times" bit to this test, in #8534. That did work back then. So uh, just do that... again?

zadjii-msft · 2023-04-04T17:33:19Z

As a note: Run 2 succeeded, with a couple of tests succeeding on try 2&3. It ultimately failed because it couldn't publish the results, because there was already a successful run, whatever.

Run	1 Retry	2 Retries
1	0	0
2	1	1
3	1	0
4	2	1
5	0	0
6	2	0
7❌	0	0
?	0	0
?	0	0
?	0	0
?	0	0
?	0	0
?	0	0
?	0	0
?	0	0
?	0	0

DHowett · 2023-04-04T18:15:54Z

I can't understand how "run 4" succeeded in "first try" 2 times? It's a single run, it only ran once?

zadjii-msft · 2023-04-04T18:27:21Z

Shh Dustin ignore all that. I'm basically just ctrl+f'ing the log for Succeeded on try #<N> and then logging how many times a test took a retry to get initialized. So far, no test has taken till retry 4. We didn't have the logging before though, so I can't easily tell before this change how deep in the retries tests would usually take.
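
The tallying described above can be sketched in a few lines. This is a minimal illustration, not something from the PR: the function name `CountSuccessesByTry` is hypothetical, and the only thing taken from the thread is the `Succeeded on try #<N>` marker text.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: counts how often each "Succeeded on try #<N>" marker
// appears in a CI log, keyed by N. Lines without the marker are ignored.
std::map<int, int> CountSuccessesByTry(const std::string& log)
{
    static const std::regex marker{ R"(Succeeded on try #(\d+))" };
    std::map<int, int> counts;
    for (std::sregex_iterator it{ log.begin(), log.end(), marker }, end; it != end; ++it)
    {
        // Capture group 1 holds the try number.
        ++counts[std::stoi((*it)[1].str())];
    }
    return counts;
}
```

Feeding it a log with two `#2` markers and one `#3` marker gives a map of `{2: 2, 3: 1}`, which corresponds to one row of the table above.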

zadjii-msft · 2023-04-05T14:48:32Z

Hmm. After re-doing the tests 6 times, Run 7 actually straight up failed too. I always assumed that just adding a longer delay wouldn't really work. We need to figure out why it's not actually attaching.

that being said, it sure is weird that a retry does work sometimes... like, that's weird

zadjii-msft · 2023-04-06T15:19:10Z

well that's ironic

lhecker · 2023-04-12T12:48:58Z

src/host/ft_host/InitTests.cpp

    VERIFY_WIN32_BOOL_SUCCEEDED_RETURN(FreeConsole());

    // Wait a moment for the driver to be ready after freeing to attach.
+    Sleep(100);


Is this really needed?

lhecker · 2023-04-12T13:05:47Z

src/host/ft_host/InitTests.cpp

-            Sleep(1000);
+
+            // Sleep with a backoff, to give us longer to try next time.
+            Sleep(1000 * (1 + tries));


As far as I can tell (and really, really don't quote me on this), Sleep and all other "waitable" functions are based on KiCheckWaitNext, which uses QueryInterruptTimePrecise for calls like Sleep and QueryUnbiasedInterruptTime for APIs like WaitForSingleObject.

The difference between the Query*Time APIs that have unbiased in their name is:

The unbiased interrupt-time count does not include time the system spends in sleep or hibernation.

In other words, as far as I can tell, Sleep sleeps in "real time" and not in "run time", so if our CIs, which probably run on some very cheap spot instances, get suspended quite often, Sleep will effectively be skipped every time, without the OS and the apps on it really progressing at all.

I would thus try and fix it by calling this instead:

WaitForSingleObject(GetCurrentThread(), 1000);

Using exponential backoff with 5-10 retries would probably be a good idea anyway though. A good initial delay would probably be something like 10ms, so that it doesn't throttle the tests too much on our fast developer machines. (Could you test on your own PC how many ms are enough so that it never hits a retry when running locally?) Something like this:

    // If every attempt fails, this waits for about 41s in total
    // (delays double from 10ms up to 20480ms).
    for (DWORD delay = 10; delay < 30000; delay *= 2)
    {
        // ...
        if (succeeded)
        {
            return true;
        }
        WaitForSingleObject(GetCurrentThread(), delay);
    }
    VERIFY_FAILED(); // what's the right call again?
    return false;

BTW should we really reopen the CRT handles inside the loop? Seems like we should only do that once before the loop...
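
For reference, the worst-case wait of a doubling loop like the one suggested above can be worked out with a small sketch. The names `BackoffSchedule` and `TotalWait` are made up for illustration and are not part of the PR.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: lists the delays (in ms) produced by a loop of the form
//   for (delay = initial; delay < limit; delay *= 2)
// An initial delay of 0 would never grow, so it yields an empty schedule.
std::vector<std::uint64_t> BackoffSchedule(std::uint64_t initial, std::uint64_t limit)
{
    std::vector<std::uint64_t> delays;
    if (initial == 0)
    {
        return delays;
    }
    for (std::uint64_t delay = initial; delay < limit; delay *= 2)
    {
        delays.push_back(delay);
    }
    return delays;
}

// Sum of all delays, i.e. the time spent waiting if every attempt fails.
std::uint64_t TotalWait(const std::vector<std::uint64_t>& delays)
{
    return std::accumulate(delays.begin(), delays.end(), std::uint64_t{ 0 });
}
```

With an initial delay of 10 and a limit of 30000, the schedule has 12 entries, the longest is 20480 ms, and the total is 40950 ms.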


zadjii-msft · 2023-04-12T17:36:40Z

Run	1 Retry	2 Retries	3 Retries	>=4 Retries
1✅	2	0	0	0
2✅	2	1	0	1
3✅	2	1	2	0
4✅	1	3	2	1
5❔	0	0	0	0
6❔	0	0	0	0
7❔	0	0	0	0
8❔	0	0	0	0
9❔	0	0	0	0
10❔	0	0	0	0

zadjii-msft · 2023-04-13T16:23:11Z

0a39269 had 9/10 runs pass. That's annoying, but maybe better than what we're getting now.

lhecker · 2023-04-13T18:16:53Z

src/host/ft_host/InitTests.cpp

    };

-    VERIFY_IS_LESS_THAN(tries, 100, L"Make sure we set up the new console in time");
+    VERIFY_IS_LESS_THAN(delay, 30000u, L"Make sure we set up the new console in time");


Another option would be to just return true above instead of breaking. That way you can blindly VERIFY_FAIL() here.
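
A rough sketch of that shape, kept portable so it can be compiled anywhere: the attempt and the wait are passed in as callbacks, and `RetryWithBackoff` is a hypothetical name. On Windows the wait callback could wrap `WaitForSingleObject(GetCurrentThread(), ms)` as suggested earlier in the thread; that wiring is an assumption and is not shown here.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper illustrating "return from inside the loop".
// attempt: returns true once setup has succeeded.
// wait:    blocks for the given number of milliseconds.
bool RetryWithBackoff(const std::function<bool()>& attempt,
                      const std::function<void(std::uint32_t)>& wait,
                      std::uint32_t initialDelayMs = 10,
                      std::uint32_t limitMs = 30000)
{
    // The delay != 0 check avoids an endless loop if the initial delay is 0.
    for (std::uint32_t delay = initialDelayMs; delay != 0 && delay < limitMs; delay *= 2)
    {
        if (attempt())
        {
            return true; // success leaves the function directly, no flag needed
        }
        wait(delay);
    }
    // Only reachable when every attempt failed, so the caller (or a test
    // macro placed here) can fail unconditionally.
    return false;
}
```

Because success never falls out of the loop, nothing after the loop has to inspect `delay` or a retry counter to decide whether the setup worked.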

DHowett

jeez. if it works.

just try backing off and doing it slower?

28255ea

no one understands these tests

0a44aa6

lhecker reviewed Apr 12, 2023


zadjii-msft added 2 commits April 12, 2023 11:58

Merge branch 'main' into dev/migrie/b/these-mother-uckers-ucking-with…

a0f346b

…-my-tests

you're telling me that Sleep() doesn't sleep

0a39269

okay nonsensical checks are nonsensical

837156f

lhecker reviewed Apr 13, 2023


lhecker approved these changes Apr 13, 2023


DHowett approved these changes Apr 13, 2023


DHowett merged commit 72d0566 into main Apr 13, 2023

DHowett deleted the dev/migrie/b/these-mother-uckers-ucking-with-my-tests branch April 13, 2023 18:38

zadjii-msft mentioned this pull request Jan 3, 2022

We need to do something about these feature tests #11289

Open

DHowett merged 5 commits into main from dev/migrie/b/these-mother-uckers-ucking-with-my-tests
