Skip to content

Fix(discovery/aws): Fixed ECS SD region loading to check AWS config before IMDS.#17684

Closed
akshatsinha0 wants to merge 1 commit intoprometheus:mainfrom
akshatsinha0:fix-ecs-region-loading
Closed

Fix(discovery/aws): Fixed ECS SD region loading to check AWS config before IMDS.#17684
akshatsinha0 wants to merge 1 commit intoprometheus:mainfrom
akshatsinha0:fix-ecs-region-loading

Conversation

@akshatsinha0
Copy link
Copy Markdown
Contributor

@akshatsinha0 akshatsinha0 commented Dec 14, 2025

Which issue(s) does the PR fix:

NONE

Description:

This PR fixes the ECS service discovery region loading logic to match the behavior of EC2 and Lightsail discovery.

Problem:
The ECS discovery was added after the region loading bugfix (PR #17376) was applied to EC2 and Lightsail. ECS has incomplete region detection logic...
It skips checking cfg.Region (which loads from AWS_REGION env var and ~/.aws/config)
used context.Background() (like other AWS discoveries)

Does this PR introduce a user-facing change?

Does this PR introduce a user-facing change?

[BUGFIX] ECS service discovery: Fix region loading to check AWS config before IMDS.

@akshatsinha0
Copy link
Copy Markdown
Contributor Author

@krajorama Please have a look here.

@akshatsinha0 akshatsinha0 force-pushed the fix-ecs-region-loading branch from 0a38efe to e887e6e Compare December 14, 2025 16:10
@bboreham
Copy link
Copy Markdown
Member

Please rebase on main after #17695 to fix the Fuzzing CI failure.

Also you seem to have some format issue.

@akshatsinha0
Copy link
Copy Markdown
Contributor Author

Yes i will

@akshatsinha0 akshatsinha0 force-pushed the fix-ecs-region-loading branch from e887e6e to 41d4ffd Compare December 16, 2025 12:03
@akshatsinha0 akshatsinha0 requested a review from a team as a code owner December 16, 2025 12:03
@krajorama krajorama requested a review from sysadmind December 17, 2025 13:37

if c.Region == "" {
cfg, err := awsConfig.LoadDefaultConfig(context.TODO())
cfg, err := awsConfig.LoadDefaultConfig(context.Background())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot comment on whether this is correct or not, will ask @sysadmind and @matt-gp about that, but given that now we'll have this code 3 times copied, let's refactor into a function, something like:

func ensureRegion(region string) (string, error) {
	if region != "" {
		return region, nil
	}

	cfg, err := awsConfig.LoadDefaultConfig(context.Background())
	if err != nil {
		return "", err
	}

	if cfg.Region != "" {
		// Use the region from the AWS config. It will load environment variables and shared config files.
		return cfg.Region, nil
	}

	// Try to get the region from the instance metadata service (IMDS).
	imdsClient := imds.NewFromConfig(cfg)
	imdsRegion, err := imdsClient.GetRegion(context.Background(), &imds.GetRegionInput{})
	if err != nil {
		return "", err
	}
	return imdsRegion.Region, nil
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think abstracting to a function is worthwhile. Its the same code in 3 places.

This PR is okay as it sits though from a correctness perspective. LGTM

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh adding it to it's own function would be good, we can use it in other AWS service discovery modules that we want to add, eg MSK, Elasticache etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should I take the initiative of adding it?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh go for it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to be added? I can pick it up if needs be.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please discover.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matt-gp @krajorama @beorn

I notice that the region loading logic in UnmarshalYAML is actually doing two distinct things:

1). Region resolution(AWS config -> IMDS fallback)
2). Config validation(ensuring region is not empty)

The current pattern mixes these concerns. when refactoring, we might wanna consider

// region.go -can be created

// attempting to load the AWS region from multiple sources
// in order of precedence: explicit config -> AWS config/env vars ->IMDS.
// Returns empty string (not error) if region cannot be determined, allowing
// the caller to decide whether to fail or use a default.
func loadRegionWithFallback(ctx context.Context, explicitRegion string) (string, error) {
    if explicitRegion != "" {
        return explicitRegion, nil
    }
    
    cfg, err := awsConfig.LoadDefaultConfig(ctx)
    if err != nil {
        return "", fmt.Errorf("failed to load AWS config: %w", err)
    }
    
    if cfg.Region != "" {
        return cfg.Region, nil
    }
    // Fallback (may fail in non-AWS environments)
    imdsClient := imds.NewFromConfig(cfg)
    region, err := imdsClient.GetRegion(ctx, &imds.GetRegionInput{})
    if err != nil {
        return "", fmt.Errorf("failed to get region from IMDS: %w", err)
    }
    
    return region.Region, nil
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I've raised #18019

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Jan 27, 2026

Hello from the bug scrub!

Under active review.

@matt-gp
Copy link
Copy Markdown
Collaborator

matt-gp commented Feb 10, 2026

#18019 has now been merged so we can probably close this

@SuperQ SuperQ closed this Feb 10, 2026
@akshatsinha0
Copy link
Copy Markdown
Contributor Author

Yah. Wow. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants