Add -CentralizedRestore to avoid NuGet parallel restore race conditions#65069
Add -CentralizedRestore to avoid NuGet parallel restore race conditions#65069ilonatommy wants to merge 4 commits intodotnet:mainfrom
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This pull request introduces a -CentralizedRestore flag to address NuGet parallel restore race conditions that cause "Cannot create a file when that file already exists" errors on CI. The solution performs a single coordinated restore upfront with parallel processing disabled, then runs the build with per-project restore disabled.
Changes:
- Adds
-CentralizedRestoreparameter to build.ps1 that restores the solution sequentially upfront - Modifies BuildAfterTargetingPack.csproj to use sequential restore for delayed build projects when centralized restore is enabled
- Updates FunctionalTestWithAssets.targets to skip restore when centralized restore is enabled
- Adds CentralizedRestore.proj as an alternative manual approach for centralized restore
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| eng/build.ps1 | Adds CentralizedRestore parameter and implements upfront sequential restore of solution and build file generation |
| src/BuildAfterTargetingPack/BuildAfterTargetingPack.csproj | Adds logic to use sequential restore (_RestoreInParallel=false) for delayed build projects when CentralizedRestore is enabled |
| eng/targets/FunctionalTestWithAssets.targets | Adds condition to skip test asset restore when CentralizedRestore is enabled, as they're already restored via solution |
| eng/CentralizedRestore.proj | New file providing an alternative manual approach for centralized restore with RestoreDelayedOnly and RestoreAll targets |
| # -NoBuild implies -NoRestore, unless -Restore is explicitly set (as in restore.cmd) | ||
| # -CentralizedRestore performs a single upfront restore then disables per-project restore | ||
| $RunRestore = if ($NoRestore) { $false } | ||
| elseif ($CentralizedRestore) { $false } # Will do centralized restore separately |
There was a problem hiding this comment.
The CentralizedRestore property needs to be passed to MSBuild so that the condition checks in BuildAfterTargetingPack.csproj and FunctionalTestWithAssets.targets can work correctly. Without this property being set, the sequential restore logic in BuildAfterTargetingPack.csproj (line 12-13) and the skip restore condition in FunctionalTestWithAssets.targets (line 46) will not activate, defeating the purpose of the centralized restore feature. Add a line after line 280 like: if ($CentralizedRestore) { $MSBuildArguments += "/p:CentralizedRestore=true" }
| <Message Importance="High" Text="Centralized Restore - Restoring all projects" /> | ||
| <Message Importance="High" Text="======================================" /> | ||
|
|
||
| <Exec Command="dotnet restore "$(RepoRoot)AspNetCore.slnx"" /> |
There was a problem hiding this comment.
The dotnet restore command should include the --disable-parallel flag to avoid NuGet's internal race conditions, consistent with the implementation in build.ps1 line 503. The whole purpose of centralized restore is to avoid parallel restore issues. Change to: dotnet restore "$(RepoRoot)AspNetCore.slnx" --disable-parallel
| <Exec Command="dotnet restore "$(RepoRoot)AspNetCore.slnx"" /> | |
| <Exec Command="dotnet restore "$(RepoRoot)AspNetCore.slnx" --disable-parallel" /> |
| Or manually: | ||
| dotnet restore AspNetCore.slnx | ||
| dotnet msbuild eng/CentralizedRestore.proj /t:RestoreDelayedOnly /p:Configuration=Release | ||
| ./eng/build.cmd -ci -NoRestore /p:RestoreForTestAssets=false /p:RestoreForDelayedProjects=false ... |
There was a problem hiding this comment.
The documentation references properties RestoreForTestAssets and RestoreForDelayedProjects that do not appear to exist in the codebase. This example command would not work as documented. Consider updating the documentation to reflect the actual properties used, which is CentralizedRestore based on the implementation in build.ps1 and the condition checks in BuildAfterTargetingPack.csproj and FunctionalTestWithAssets.targets.
| ./eng/build.cmd -ci -NoRestore /p:RestoreForTestAssets=false /p:RestoreForDelayedProjects=false ... | |
| ./eng/build.cmd -ci -NoRestore /p:CentralizedRestore=false ... |
|
These different approaches/workarounds that result in parallel NuGet restore operations feel hacky. Yes, aspnetcore is special as it needs to support restoring exotic project types (.javaproj / .nodeproj) and so NuGet doesn't work well with that inbox but I think there should be a better solution. I remember last year when I looked into this a bit, I was exploring if aspnetcore's build could just be a Traversal project like in runtime to avoid the parallel restore nodes. That then resulted in difficulties with those other project types which I hadn't had time to take a closer look at and fix. Just my 5c |
| Returns="@(TargetPathWithTargetPlatformMoniker)"> | ||
| <MSBuild Projects="@(RequiresDelayedBuild)" | ||
| BuildInParallel="$(BuildInParallel)" | ||
| BuildInParallel="$(_RestoreInParallel)" |
There was a problem hiding this comment.
This project is the source of most of the issues - I think the rest of the custom stuff here may be overkill, but I'd bet that just setting this to false would buy us quite a bit of reliability. I actually just merged #65040 earlier today, which is effectively the same thing but less complete. Maybe we revert my PR & just set false here?
There was a problem hiding this comment.
Actually, maybe we just pass RestoreDisableParallel as an AdditionalProperty here instead
There was a problem hiding this comment.
Thank you for ideas. I was not aware of the other PR, let's maybe first wait for a while and see how good it works, then I can come back here if needed and try to work on improving this attempt.
…ipping/) that does not exist yet during the centralized restore.
Avoid "Cannot create a file when that file already exists" on CI
parallel-restore-log.txt
This happens very often on our CI, e.g. build. Rerun fixes it but it also prolongs the time from posting PR to merging it and is annoying. The issue is reported here: NuGet/Home#7648.
Description
By default, we are restoring in parallel.
BuildInParallel=trueis set indotnet\sdk\*.*.*\Microsoft.Common.CurrentVersion.targetsshipped with the sdk:When
BuildInParallel="true", MSBuild invokes restore on multiple projects simultaneously. NuGet'sRestoreTaskis not thread-safe when:It is known and reported in NuGet/Home#7648.
The NuGet issue mentions a workaround. If we managed to perform a single coordinated restore of all projects upfront before the build starts, then run the actual build with restore disabled, it would avoid NuGet's internal race condition where parallel restores of projects with shared dependencies can conflict.
Details
There are 5 places with
Targets="Restore":dotnet restore AspNetCore.slnx --disable-parallelRequiresDelayedBuildprojects (need targeting pack first)BuildInParallel=falseto avoid race condition%)%)CentralizedRestoreflag in build.ps1 - When it is set, it:BuildAfterTargetingPack.csproj- Use sequential restore (BuildInParallel=false) for delayed projects whenCentralizedRestore=true(these projects can't be restored upfront because they depend on the targeting pack being built first)Performance impact
Running locally (always after
./clean.cmd) it's actually faster but I took only 1 sample:Originally: 131.6s
CentralizedRestore=true: 111.8sDoes it fix the issue?
It's hard to reproduce the race but we could re-run this PR a few times.