@thaJeztah
Member

(splitting this from #39846)

Trying to see if this helps with the cleanup step exiting in CI, but Jenkins continuing to wait for the script to end afterwards.

@thaJeztah
Member Author

@StefanScherer ptal if this makes sense 🤗 (it was just a hunch, so no idea if this improves the situation)

@ddebroy
Contributor

ddebroy commented Sep 5, 2019

cc @vikramhh and @ameyag

@thaJeztah thaJeztah force-pushed the hack_windows_explicit_exit branch from f67ea7c to f584974 Compare September 6, 2019 01:02
@vikramhh

vikramhh commented Sep 6, 2019

@thaJeztah - Verified that the script will now exit with $tmpLastExitCode consistently, irrespective of whether the script is run interactive or noninteractive.

Earlier it would exit with the value 0 [interactive] or 1 [noninteractive].

Does this fit with your rationale for making this change?

@thaJeztah
Member Author

@vikramhh that's a good question; the problem I was trying to solve was two-fold:

  1. we've had situations where tests failed, but Jenkins still reported them as "green" (possibly because of a wrong exit code: if the Finally block itself was successful, the "successful" exit code of the steps in the Finally block would overwrite the $LastExitCode (fail / non-zero) of the tests)
  2. we've had situations where the tests finished (either successfully or failing), but Jenkins somehow did not detect that they had completed; the step in CI kept showing as "running", and after 2 hours (the Jenkins timeout), the job was terminated and marked as "failed".

For the second point above, I was wondering if adding an explicit exit would help (really naive thought)
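A minimal sketch of the pattern under discussion, not the actual script: record the result before the Finally block runs and return that saved value, so a successful cleanup cannot mask a failed test run. The function name `Invoke-TestsWithCleanup` and the `cmd.exe` calls are hypothetical stand-ins for the real test and cleanup steps.

```powershell
# Sketch only: Invoke-TestsWithCleanup and the cmd.exe calls are illustrative
# stand-ins, not code from the real CI script.
function Invoke-TestsWithCleanup {
    $tmpLastExitCode = 0
    try {
        # Stand-in for the test run: a native command that fails with exit code 3.
        & cmd.exe /c "exit 3"
        if ($LASTEXITCODE -ne 0) {
            throw "tests failed with exit code $LASTEXITCODE"
        }
    }
    catch {
        Write-Host -ForegroundColor Red "ERROR: $_"
        # Remember the failure before any cleanup runs.
        $tmpLastExitCode = 1
    }
    finally {
        # Stand-in for cleanup: a successful native command sets $LASTEXITCODE back to 0.
        & cmd.exe /c "exit 0"
    }
    # Return the saved value, not whatever the cleanup left in $LASTEXITCODE.
    return $tmpLastExitCode
}
```

Ending the script with `exit (Invoke-TestsWithCleanup)` would then make the exit code explicit. Whether that also affects the hanging Jenkins step is exactly the open question here.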

> Earlier it would exit with the value 0 [interactive] or 1 [noninteractive].

Question: what exactly do you mean by interactive/noninteractive? Do you know which of those Jenkins uses? (I guess Jenkins has a terminal attached, so it may be interactive?)

@vikramhh

vikramhh commented Sep 6, 2019

It would be good to find out how Jenkins is executing the script. I do not know whether it logs that anywhere, but if not, something like the following at the beginning of the script will tell us:

$allArgs = [Environment]::GetCommandLineArgs()
Write-Host -ForegroundColor Red $allArgs

Even if it has a terminal attached, given that there is no one to babysit it, it should probably be run noninteractively. One possible reason for (2) could be tests being run interactively and waiting for some user input. I'm not saying that is why (2) happens, but it would be good to rule that out as a possible reason.

Trying to see if this helps with the cleanup step exiting in CI, but
Jenkins continuing to wait for the script to end afterwards.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
…ins runs this script

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the hack_windows_explicit_exit branch from f584974 to 7eb522a Compare September 6, 2019 21:33
@thaJeztah
Member Author

One way to find out then; I pushed a commit that adds those lines to the script

@thaJeztah
Member Author

OK, this is what it prints;

23:37:13  DEBUG: print all environment variables to check how Jenkins runs this script
23:37:13  C:\windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive -ExecutionPolicy Bypass -Command & 'd:\gopath\src\github.com\docker\docker@tmp\durable-5bbf2229\powershellScript.ps1'; exit $LASTEXITCODE;
23:37:13  ----------------------------------------------------------------------------

Looks like it's NonInteractive, so not sure if this change actually makes a difference then?

@vikramhh

vikramhh commented Sep 9, 2019

Does Jenkins base its red/green decision on %ERRORLEVEL%? Assuming it does, is it possible to replace the -Command in the invocation with -File? Based on some quick testing, I see that -Command will always lead to #1 above in case of failures: for some reason, %ERRORLEVEL% is 0 after invoking the script, even if the script returns a non-zero value.
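One way to check this locally is to run the same failing script through each invocation style and compare the exit codes the caller sees. This is a rough sketch, assuming Windows PowerShell is on the PATH; `fail.ps1` is a hypothetical helper script created only for this comparison, and the variable names are illustrative.

```powershell
# Hypothetical helper script that does nothing but exit with a non-zero code.
Set-Content -Path .\fail.ps1 -Value 'exit 3'

# 1. Run the script with -File.
powershell.exe -NoProfile -NonInteractive -ExecutionPolicy Bypass -File .\fail.ps1
$fromFile = $LASTEXITCODE

# 2. Run the script with -Command and no explicit exit.
powershell.exe -NoProfile -NonInteractive -ExecutionPolicy Bypass -Command '& .\fail.ps1'
$fromCommand = $LASTEXITCODE

# 3. Run the script with -Command followed by "exit $LASTEXITCODE",
#    the same shape as the Jenkins command line printed earlier in this thread.
powershell.exe -NoProfile -NonInteractive -ExecutionPolicy Bypass -Command '& .\fail.ps1; exit $LASTEXITCODE'
$fromCommandWithExit = $LASTEXITCODE

# Print what the calling process observed for each variant.
Write-Host "-File: $fromFile, -Command: $fromCommand, -Command + exit: $fromCommandWithExit"
```

Running the same three variants from cmd.exe and echoing %ERRORLEVEL% after each would show whether the caller's view differs from what PowerShell reports.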

I was able to verify that your explanation in #1 does not hold. If a failure leads to the catch block being executed, a value of 1 is returned on line 963 even if the finally block is successful.

@StefanScherer
Contributor

Making the exit code explicit looks good to me.

For the hanging jobs, we could list the process tree at the end of the script to see if there are any hanging child processes 🤔
I did a quick-and-dirty test with the Sysinternals tools:

curl.exe -o pslist.exe https://live.sysinternals.com/pslist.exe
reg.exe ADD "HKCU\Software\Sysinternals\PsList" /v EulaAccepted /t REG_DWORD /d 1 /f
.\pslist.exe -t
$PID

Contributor

@kolyshkin kolyshkin left a comment

LGTM

Member

@cpuguy83 cpuguy83 left a comment

LGTM

@thaJeztah thaJeztah added this to the 20.03.0 milestone Apr 2, 2020