Skip to content

Comments

dotnetup retries to download manifest content - flaky test fixes#53102

Open
nagilson wants to merge 7 commits intodotnet:release/dnupfrom
nagilson:nagilson-dnup-release-network-retry
Open

dotnetup retries to download manifest content - flaky test fixes#53102
nagilson wants to merge 7 commits intodotnet:release/dnupfrom
nagilson:nagilson-dnup-release-network-retry

Conversation

@nagilson
Copy link
Member

also trying to run CI to see if network issues are reoccuring

also trying to run CI to see if network issues are reoccuring
dotnetrelease library doesn't allow custom http clients - does it not respect proxy?
DOTNET_TESTHOOK_DEFAULT_INSTALL_PATH is no longer used, we pass the manifest path into the app args.

Also, this sets the cwd for all of the process, not the sub dotnetup process. This was a poor  choice and prone to cause race conditions on CWD setting / unsetting for concurrent test
@nagilson nagilson changed the title Add instructions for dotnetup issue creation dotnetup retries to download manifest content - flaky test fixe s Feb 20, 2026
@nagilson nagilson changed the title dotnetup retries to download manifest content - flaky test fixe s dotnetup retries to download manifest content - flaky test fixes Feb 20, 2026
@nagilson nagilson marked this pull request as ready for review February 20, 2026 19:53
Copilot AI review requested due to automatic review settings February 20, 2026 19:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request addresses flaky test issues in dotnetup by adding retry logic for network operations and improving test isolation. The changes aim to make CI builds more resilient to transient network failures when downloading .NET release manifests.

Changes:

  • Added retry logic with linear backoff to ReleaseManifest.GetReleasesIndex() to handle transient network failures when fetching the releases index
  • Simplified TestEnvironment class by removing process-wide state mutations (environment variables and current directory changes), relying instead on explicit path parameters for proper test isolation
  • Configured CI pipeline with permissive network isolation policy to allow external network access needed for downloading release manifests
  • Added documentation guideline about using the existing dotnetup label for issues

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
src/Installer/Microsoft.Dotnet.Installation/Internal/ReleaseManifest.cs Adds retry logic with linear backoff for ProductCollection.GetAsync() calls to handle transient HttpRequestException failures
test/dotnetup.Tests/Utilities/TestEnvironment.cs Removes process-wide state mutations (CWD and environment variables) to improve test isolation and parallelization safety
.vsts-dnup-ci.yml Adds permissive network isolation policy to allow external network access and removes trailing whitespace
.github/copilot-instructions.md Adds guideline to use existing dotnetup label rather than creating new label variants

Comment on lines 66 to 74
catch (HttpRequestException ex) when (attempt < MaxRetryCount)
{
lastException = ex;
Thread.Sleep(RetryDelayMilliseconds * attempt); // Linear backoff
}
}

throw new HttpRequestException(
$"Failed to fetch the releases index after {MaxRetryCount} attempts.", lastException);
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry logic has a flaw: when the final attempt (attempt 3) throws an HttpRequestException, the catch clause won't match because the condition is attempt < MaxRetryCount (3 < 3 is false). This means the exception will propagate without being wrapped in the informative error message with retry count.

To fix this, either:

  1. Change the catch condition to not include the attempt check, and instead catch the exception on all attempts, storing it in lastException each time, then check if we should retry after the catch block
  2. Add a separate catch block without the when clause outside the loop to handle the final attempt's exception

Option 1 is cleaner and matches the pattern used in other retry implementations in the codebase.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant