Conversation

@thaJeztah
Member

@thaJeztah thaJeztah commented Sep 2, 2019

This adds a stage to test against the current SAC (Semi Annual Channel),
which allows us to catch possible regressions on upcoming LTS versions.

addresses ENGCORE-832

@thaJeztah
Member Author

thaJeztah commented Sep 2, 2019

This might show the problem that's reported in docker/for-win#3884, which was indicated to be a regression in the platform; docker/for-win#3884 (comment)

The problem is now understood. It looks like it will require a fix for a regression in docker, and a fix in Windows itself. For now, the only workaround is as others have put here to take out the storage-opt in daemon.json. It also affects docker run where --storage-opt size=30GB for example (any number, 30 is just arbitrary for demonstration). It also means that it is not possible to build layers larger than 20GB in Windows 1903 builds.

But it only shows up on Docker Desktop (because Docker Enterprise is only supported on Windows Server LTS versions), and thus doesn't show up in the current CI in this repository (which didn't test against 1903).

Possibly skipping this step for Windows 1903 could work around the platform regression;

// For WCOW, the default of 20GB hard-coded in the platform
// is too small for builder scenarios where many users are
// using RUN statements to install large amounts of data.
// Use 127GB as that's the default size of a VHD in Hyper-V.
if isWCOW {
	hc.StorageOpt = make(map[string]string)
	hc.StorageOpt["size"] = "127GB"
}
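
As a rough illustration of what skipping this step on Windows 1903 could look like, here is a minimal sketch. The helper name `buildStorageOpt` and its `build` parameter are hypothetical and not taken from the Docker code base; 18362 is the build number shown in the `docker info` output in this thread, and 17763 (RS5) is only used as a comparison value. How the build number is obtained is left out.

```go
package main

import "fmt"

// windows1903Build is the OS build number of Windows 1903
// (see "Kernel Version: 10.0 18362" in the docker info output).
const windows1903Build = 18362

// buildStorageOpt is a hypothetical helper: it returns the storage options
// to use for a build container, or nil when the 127GB default is skipped.
func buildStorageOpt(isWCOW bool, build int) map[string]string {
	if !isWCOW || build == windows1903Build {
		// Assumption: skipping on exactly 1903 is enough to avoid
		// the platform regression discussed above.
		return nil
	}
	return map[string]string{"size": "127GB"}
}

func main() {
	fmt.Println(buildStorageOpt(true, 17763)) // RS5: default is applied
	fmt.Println(buildStorageOpt(true, 18362)) // 1903: default is skipped
}
```

Whether an exact match on the build number is the right condition (rather than a range) would depend on which Windows releases carry the regression.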

@thaJeztah
Member Author

The good news; Windows 1903 machines are working, and CI starts;

Client:
 Debug Mode: false
 Plugins:
  cluster: Manage Docker clusters (Docker Inc., v1.0.1)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.2-tp2
 Storage Driver: windowsfilter (windows) lcow (linux)
  Windows: 
  LCOW: 
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics l2bridge l2tunnel nat null overlay transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
 Swarm: inactive
 Default Isolation: process
 Kernel Version: 10.0 18362 (18362.1.amd64fre.19h1_release.190318-1202)
 Operating System: Windows Server Datacenter Version 1903 (OS Build 18362.295)
 OSType: windows
 Architecture: x86_64
 CPUs: 4
 Total Memory: 32GiB
 Name: azwin-1-519d20
 ID: 7JIE:3EHH:REBU:5GIT:Q2PF:UZSX:743O:DUMD:6HTG:G3W6:L3CK:2YM2
 Docker Root Dir: D:\docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: true
 Insecure Registries:
  10.0.0.4:5000
  127.0.0.0/8
 Registry Mirrors:
  http://10.0.0.4:5000/
 Live Restore Enabled: false

The "bad" news; the 1903 machines have Windows Defender enabled, which is currently preventing CI from running: https://ci.docker.com/public/blue/rest/organizations/jenkins/pipelines/moby/branches/PR-39846/runs/1/nodes/240/log/?start=0

(until #39804 is merged, or the configuration of those machines is changed)


ERROR: Failed 'ERROR: Windows Defender real time protection must be disabled for integration tests' at 09/02/2019 10:44:13
 At D:\gopath\src\github.com\docker\docker\hack\ci\windows.ps1:269 char:22
 + ... defender) { Throw "ERROR: Windows Defender real time protection must  ...
 +                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@thaJeztah
Member Author

@StefanScherer ^^ looks like we may want to update the configuration for those machines (or explicitly disable it through our Jenkinsfile?)

Contributor

@kolyshkin kolyshkin left a comment

SGTM

Member Author

@StefanScherer this was just an attempt to see if this would help with the RS1 builds sometimes not exiting; not sure if this would help or not (happy to drop this commit if it doesn't make sense)

Member Author

Moved this to a separate PR #39854

@thaJeztah
Member Author

Failure on RS5 is a flaky test: #36801
https://ci.docker.com/public/blue/rest/organizations/jenkins/pipelines/moby/branches/PR-39846/runs/5/nodes/41/log/?start=0

[2019-09-03T08:58:49.666Z] --- FAIL: TestJSONFileLoggerWithOpts (0.07s)
[2019-09-03T08:58:49.666Z]     jsonfilelog_test.go:187: open C:\Users\ContainerAdministrator\AppData\Local\Temp\docker-logger-344571653\container.log.1: The process cannot access the file because it is being used by another process.

@thaJeztah
Member Author

We might want to consider running this one in Hyper-V mode (which I think is where the linked issue occurs)

@thaJeztah
Member Author

Pushed a commit to run Windows 1903 with hyper-v isolation

@thaJeztah
Member Author

couple of failures now that I enabled hyper-v; https://ci.docker.com/public/job/moby/job/PR-39846/7/execution/node/197/log/?consoleFull

22:46:41  FAIL: docker_api_containers_test.go:2006: DockerSuite.TestContainersAPICreateMountsCreate
22:46:41  
22:46:41  case 0 - config: {volume  c:\foo false  <nil> <nil> <nil>}
22:46:41  case 1 - config: {volume  c:\foo\ false  <nil> <nil> <nil>}
22:46:41  case 2 - config: {volume test1 c:\foo false  <nil> <nil> <nil>}
22:46:41  docker_api_containers_test.go:2180:
22:46:41      poll.WaitOn(c, containerExit(apiclient, container.ID), poll.WithDelay(time.Second))
22:46:41  d:/gopath/src/github.com/docker/docker/vendor/gotest.tools/poll/poll.go:128:
22:46:41      t.Fatalf("timeout hit after %s: %s", config.Timeout, lastMessage)
22:46:41  ... Error: timeout hit after 10s: container f84c19efa123e04f676f8c3f1424c2eb700a8680f38d5529c925b3c895bd248c is running, waiting for exit

Looks like this test should be either skipped on hyper-v, or modified;

22:49:09  FAIL: docker_cli_create_test.go:302: DockerSuite.TestCreateWithWorkdir
22:49:09  
22:49:09  assertion failed: 
22:49:09  Command:  d:\CI-7\CI-f3768a669\binary\docker.exe cp foo:c:\home\foo\bar c:\tmp
22:49:09  ExitCode: 1
22:49:09  Error:    exit status 1
22:49:09  Stdout:   
22:49:09  Stderr:   Error response from daemon: filesystem operations against a running Hyper-V container are not supported
22:49:09  
22:49:09  
22:49:09  Failures:
22:49:09  ExitCode was 1 expected 0
22:49:09  Expected no error

@thaJeztah thaJeztah force-pushed the jenkinsfile_add_windows_1903 branch 3 times, most recently from 69ee038 to aa04d99 Compare September 4, 2019 15:04
@thaJeztah
Member Author

With --storage-opt size=xxx, building the BusyBox image now fails, which may be the issue reported in docker/for-win#3884 (comment)

https://ci.docker.com/public/job/moby/job/PR-39846/11/execution/node/240/log/

17:35:13  Sending build context to Docker daemon   2.56kB
17:35:13  
17:35:13  Step 1/6 : FROM microsoft/windowsservercore
17:35:13   ---> 55b4d18a7eea
17:35:13  Step 2/6 : RUN mkdir C:\tmp && mkdir C:\bin
17:35:13   ---> Running in 86b2979eda7a
17:35:31  powershell.exe : container 86b2979eda7a683b90bd96599ae8e1d64c71a06aab1690c2634cd863fbad2c1c encountered an error during Start: failure in a Windows system call: The virtual machine or 
17:35:31  container exited unexpectedly. (0xc0370106)
17:35:31  At D:\gopath\src\github.com\docker\docker@tmp\durable-9e3ff7cb\powershellWrapper.ps1:3 char:1
17:35:31  + & powershell -NoProfile -NonInteractive -ExecutionPolicy Bypass -Comm ...
17:35:31  + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17:35:31      + CategoryInfo          : NotSpecified: (container 86b29...y. (0xc0370106):String) [], RemoteException
17:35:31      + FullyQualifiedErrorId : NativeCommandError
17:35:31   
17:35:31  
17:35:31  
17:35:31  ERROR: Failed 'ERROR: Failed to build busybox image' at 09/04/2019 15:35:28
17:35:31  At D:\gopath\src\github.com\docker\docker\hack\ci\windows.ps1:786 char:21
17:35:31  +                     Throw "ERROR: Failed to build busybox image"
17:35:31  +                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17:35:31 

@thaJeztah thaJeztah force-pushed the jenkinsfile_add_windows_1903 branch from aa04d99 to f3d6142 Compare September 5, 2019 14:33
@thaJeztah
Member Author

Added a commit to only set the size for build-containers to 127GB if no size was configured on the daemon.
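
The idea of that commit can be sketched roughly as follows. This is not the actual change: `hasSizeOpt` and `buildContainerStorageOpt` are hypothetical names, and the sketch assumes the daemon's storage options are available as `key=value` strings (for example `size=30GB`).

```go
package main

import (
	"fmt"
	"strings"
)

// defaultBuildSize is the default from the existing code comment
// (the default size of a VHD in Hyper-V).
const defaultBuildSize = "127GB"

// hasSizeOpt reports whether the daemon storage options already contain
// a "size" entry. Assumption: options are "key=value" strings.
func hasSizeOpt(daemonStorageOpts []string) bool {
	for _, opt := range daemonStorageOpts {
		kv := strings.SplitN(opt, "=", 2)
		if len(kv) == 2 && strings.EqualFold(strings.TrimSpace(kv[0]), "size") {
			return true
		}
	}
	return false
}

// buildContainerStorageOpt returns the storage options for a build
// container: the 127GB default is only applied for WCOW when the daemon
// has no size configured; otherwise nil is returned so the daemon
// configuration is left in effect.
func buildContainerStorageOpt(isWCOW bool, daemonStorageOpts []string) map[string]string {
	if !isWCOW || hasSizeOpt(daemonStorageOpts) {
		return nil
	}
	return map[string]string{"size": defaultBuildSize}
}

func main() {
	fmt.Println(buildContainerStorageOpt(true, nil))
	fmt.Println(buildContainerStorageOpt(true, []string{"size=30GB"}))
}
```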

@thaJeztah
Member Author

Aaaaand... unit tests failed on #39856


[2019-09-05T15:03:42.298Z] --- FAIL: TestJSONFileLoggerWithOpts (0.01s)
[2019-09-05T15:03:42.298Z]     jsonfilelog_test.go:187: open C:\Users\ContainerAdministrator\AppData\Local\Temp\docker-logger-171753449\container.log.1: The process cannot access the file because it is being used by another process.
[2019-09-05T15:03:42.298Z] FAIL
[2019-09-05T15:03:42.298Z] coverage: 62.1% of statements
[2019-09-05T15:03:42.298Z] FAIL	github.com/docker/docker/daemon/logger/jsonfilelog	0.410s

@ddebroy
Contributor

ddebroy commented Sep 5, 2019

cc @vikramhh and @ameyag

@thaJeztah
Member Author

Ok, that last commit didn't help it seems;

[2019-09-05T15:32:58.969Z] Sending build context to Docker daemon   2.56kB
[2019-09-05T15:32:58.969Z] 
[2019-09-05T15:32:58.969Z] Step 1/6 : FROM microsoft/windowsservercore
[2019-09-05T15:32:58.969Z]  ---> 55b4d18a7eea
[2019-09-05T15:32:58.969Z] Step 2/6 : RUN mkdir C:\tmp && mkdir C:\bin
[2019-09-05T15:32:59.430Z]  ---> Running in cfddf17ac863
[2019-09-05T15:33:14.283Z] powershell.exe : container cfddf17ac863206c89e1d3c19b427401884533c7642a2d50d1db5cea2a1b8647 encountered an error during Start: failure in a Windows system call: The virtual machine or 
[2019-09-05T15:33:14.283Z] container exited unexpectedly. (0xc0370106)
[2019-09-05T15:33:14.283Z] At D:\gopath\src\github.com\docker\docker@tmp\durable-dec09710\powershellWrapper.ps1:3 char:1
[2019-09-05T15:33:14.283Z] + & powershell -NoProfile -NonInteractive -ExecutionPolicy Bypass -Comm ...
[2019-09-05T15:33:14.283Z] + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2019-09-05T15:33:14.283Z]     + CategoryInfo          : NotSpecified: (container cfddf...y. (0xc0370106):String) [], RemoteException
[2019-09-05T15:33:14.283Z]     + FullyQualifiedErrorId : NativeCommandError
[2019-09-05T15:33:14.283Z]  

Can try disabling that line altogether (perhaps at that point the config isn't set)

@thaJeztah thaJeztah force-pushed the jenkinsfile_add_windows_1903 branch from d105437 to 8dbfb92 Compare March 25, 2021 08:24
@thaJeztah
Member Author

EDIT: I just noticed that 2022 is actually running in process isolation mode now and RS5 in Hyper-V isolation (which is why it fails), so it would probably be better to skip that part from this PR too.

Fixed; moved it to the win-2022 part, and (to test) reverted the last (skip tests) commit; I think it's good to have at least one of the Windows stages run with Hyper-V; that said, we discussed removing RS1 from the list at some point; perhaps we can add a "windows latest (Hyper-V)" and "windows latest (Process)".
It's a pity that Jenkins declarative pipeline doesn't (I think still?) allow further nesting, otherwise we could "share" the build step, which takes quite a while on Windows, and after that branch to process/hyper-v for tests.

@thaJeztah thaJeztah force-pushed the jenkinsfile_add_windows_1903 branch from 8dbfb92 to d8f9340 Compare March 25, 2021 16:04
@thaJeztah
Member Author

Hmmm.... looks like running it on Hyper-V fails; getting some hcsshim failure

hcsshim::CreateComputeSystem a9d9068c72032118b74834d50c4205ac6c23bd506caee08b0d2d01356413cc28: The request is not supported.
[2021-03-25T16:26:00.394Z] INFO: Building busybox
[2021-03-25T16:26:00.394Z] Sending build context to Docker daemon   5.12kB
[2021-03-25T16:26:00.394Z] 
[2021-03-25T16:26:00.394Z] Step 1/13 : ARG WINDOWS_BASE_IMAGE=mcr.microsoft.com/windows/servercore
[2021-03-25T16:26:00.394Z] Step 2/13 : ARG WINDOWS_BASE_IMAGE_TAG=ltsc2019
[2021-03-25T16:26:00.394Z] Step 3/13 : ARG BUSYBOX_VERSION=FRP-3329-gcf0fa4d13
[2021-03-25T16:26:00.394Z] Step 4/13 : ARG BUSYBOX_SHA256SUM=bfaeb88638e580fc522a68e69072e305308f9747563e51fa085eec60ca39a5ae
[2021-03-25T16:26:00.394Z] Step 5/13 : FROM ${WINDOWS_BASE_IMAGE}:${WINDOWS_BASE_IMAGE_TAG}
[2021-03-25T16:26:00.394Z]  ---> 39d157a84080
[2021-03-25T16:26:00.394Z] Step 6/13 : RUN mkdir C:\tmp && mkdir C:\bin
[2021-03-25T16:26:00.394Z]  ---> Running in a9d9068c7203
[2021-03-25T16:26:00.867Z] powershell.exe : hcsshim::CreateComputeSystem a9d9068c72032118b74834d50c4205ac6c23bd506caee08b0d2d01356413cc28: The request is not supported.
[2021-03-25T16:26:00.867Z] At D:\gopath\src\github.com\docker\docker@tmp\durable-ac69d9f3\powershellWrapper.ps1:3 char:1
[2021-03-25T16:26:00.867Z] + & powershell -NoProfile -NonInteractive -ExecutionPolicy Bypass -Comm ...
[2021-03-25T16:26:00.867Z] + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2021-03-25T16:26:00.867Z]     + CategoryInfo          : NotSpecified: (hcsshim::Create... not supported.:String) [], RemoteException
[2021-03-25T16:26:00.867Z]     + FullyQualifiedErrorId : NativeCommandError
[2021-03-25T16:26:00.867Z]  
[2021-03-25T16:26:00.867Z] (extra info: {"SystemType":"Container","Name":"a9d9068c72032118b74834d50c4205ac6c23bd506caee08b0d2d01356413cc28","Owner":"docker","IgnoreFlushesDuringBoot":true,"LayerFolderPath":"D:\\CI\\PR-3
[2021-03-25T16:26:00.867Z] 9846\\63\\daemon\\windowsfilter\\a9d9068c72032118b74834d50c4205ac6c23bd506caee08b0d2d01356413cc28","Layers":[{"ID":"11864a36-109a-522a-be2d-24a868e1cb10","Path":"D:\\CI\\PR-39846\\63\\daemon\\
[2021-03-25T16:26:00.867Z] windowsfilter\\1f640e261b425edcdf3c71f650e2540dbc080cd9c23efc5fed2f42e69553198b"}],"HostName":"a9d9068c7203","HvPartition":true,"EndpointList":["38084759-8286-449b-a6ef-f93f346a6407"],"HvRunti
[2021-03-25T16:26:00.867Z] me":{"ImagePath":"D:\\CI\\PR-39846\\63\\daemon\\windowsfilter\\1f640e261b425edcdf3c71f650e2540dbc080cd9c23efc5fed2f42e69553198b\\UtilityVM"},"AllowUnqualifiedDNSQuery":true})
[2021-03-25T16:26:00.867Z] 
[2021-03-25T16:26:00.867Z] 
[2021-03-25T16:26:00.867Z] ERROR: Failed 'ERROR: Failed to build busybox image' at 03/25/2021 16:26:00
[2021-03-25T16:26:00.867Z] At D:\gopath\src\github.com\docker\docker\hack\ci\windows.ps1:824 char:17
[2021-03-25T16:26:00.867Z] +                 Throw "ERROR: Failed to build busybox image"
[2021-03-25T16:26:00.867Z] +                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@olljanat
Contributor

@thaJeztah as those tests already passed without Hyper-V, maybe it would be better to disable it again, squash the commits, and merge this one? I was hoping to be able to use 2022 on #41479 (testing my proposal #41455 (comment)), but it looks like Jenkinsfile changes still do not take effect before merging to master.

I think it's good to have at least one of the Windows stages run with Hyper-V; that said, we discussed removing RS1 from the list at some point; perhaps we can add a "windows latest (Hyper-V)" and "windows latest (Process)".

Honestly, I do not care about Hyper-V mode, as we do not use it anywhere; but sure, since that feature exists, it should probably be part of the tests.

This adds a stage to test against the current SAC (Semi Annual Channel),
which allows us to catch possible regressions on upcoming LTS versions.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
Images for Windows 2022 (SAC) are not yet available, so using insider builds
in the meantime; mcr.microsoft.com/windows/servercore/insider:10.0.20295.1

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the jenkinsfile_add_windows_1903 branch from d8f9340 to b5f0096 Compare April 8, 2021 19:30
@thaJeztah
Member Author

Removed the commits that (I think) were all Hyper-V related; these commits were removed:

  • switched it to use hyper-v
  • set storage-opt: size
  • disabled defender
  • skipped tests

I kept those in a separate branch, so if some need to be added back I can add them; let's see what CI says

@thaJeztah
Member Author

Pushed the other branch as draft #42277

@olljanat
Contributor

olljanat commented Apr 8, 2021

@thaJeztah looks like you skipped one change too many. We need 48d0152 here.

@thaJeztah
Member Author

@thaJeztah looks like you skipped one change too many. We need 48d0152 here.

Ah! That was the one I wasn't sure about (didn't recall if it was version only, or hyper-v specific); will fix

These tests fail, possibly due to changes in the kernel. Temporarily skipping
these tests, so that we at least have some coverage on these windows versions
in this repo, and we can look into this specific issue separately:

    === FAIL: github.com/docker/docker/pkg/archive TestChangesDirsEmpty (0.21s)
        changes_test.go:261: Reported changes for identical dirs: [{\dirSymlink C}]

    === FAIL: github.com/docker/docker/pkg/archive TestChangesDirsMutated (0.14s)
        changes_test.go:391: unexpected change "C \\dirSymlink" "\\dirnew"

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the jenkinsfile_add_windows_1903 branch from c547624 to 8f4b3b0 Compare April 8, 2021 23:07
Contributor

@olljanat olljanat left a comment

LGTM

Contributor

@StefanScherer StefanScherer left a comment

Still LGTM !

@thaJeztah
Member Author

Thanks! I'll go ahead and merge 🤗
