feat: added extended metadata support for popular cloud providers #686

Merged
ehsandeep merged 13 commits into dev from additional-metadata-cloudlist-sources
Aug 17, 2025

@Ice3man543 Ice3man543 commented Jul 8, 2025

Additional metadata from providers

Currently covered

  • cloudflare
  • digitalocean
  • aws
  • gcp
  • azure

Example objects

Digitalocean instance

{
  "public": true,
  "provider": "digitalocean",
  "service": "instance",
  "public_ipv4": "127.0.0.1",
  "metadata": {
    "vcpus": "2",
    "size_slug": "s-2vcpu-2gb-90gb-intel",
    "region_slug": "blr1",
    "image_id": "168972420",
    "image_name": "24.10 x64",
    "created_at": "2025-06-03T07:51:07Z",
    "tags": "data-monitoring-api",
    "name": "ubuntu-s-2vcpu-2gb-90gb-intel-blr1-01",
    "status": "active",
    "memory": "2048",
    "region_name": "Bangalore 1",
    "size_memory": "2048",
    "size_disk": "90",
    "droplet_id": "1234567890",
    "vpc_uuid": "c8391804-23e1-4a44-ad36-942ce0385447",
    "locked": "false",
    "disk": "90",
    "image_slug": "ubuntu-24-10-x64",
    "size_vcpus": "2",
    "features": "droplet_agent,private_networking"
  }
}

Summary by CodeRabbit

  • New Features

    • Added an --extended-metadata option to enable richer metadata in discovery across providers.
    • Added AWS API Gateway v2 and GCP Cloud Functions v1 support.
  • Enhancements

    • Many providers (AWS, GCP, Azure, DigitalOcean, Cloudflare, Kubernetes, etc.) optionally include comprehensive metadata (IDs, timestamps, tags, configs, related resources).
    • Improved multi-account S3 handling and Kubernetes ingress metadata; broader resource coverage (EKS, ECS, ELB/ALB, Route53, Cloud Run, GKE, etc.).
  • Documentation

    • Updated GCP Asset API docs with extended metadata guidance and required permissions.
  • Chores

    • Dependency version bumps and Go toolchain update.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 8, 2025

Walkthrough

Add an optional Extended Metadata feature: CLI flag and options propagate through runner and providers; schema.Resource gains Metadata; many providers collect and attach detailed metadata when enabled; GCP org asset parsing and bulk metadata enrichment added; dependency versions bumped.

Changes

| Cohort | File(s) | Change Summary |
| --- | --- | --- |
| Runner & CLI | internal/runner/options.go, internal/runner/runner.go | Add ExtendedMetadata option and --extended-metadata flag; inject extended_metadata into provider config during enumeration. |
| Schema | pkg/schema/schema.go | Add Resource.Metadata field; update append signatures; add helpers AddMetadata, AddMetadataList, AddMetadataInt and pointer helpers. |
| GCP (org asset & services) | pkg/providers/gcp/assets_api.go, pkg/providers/gcp/gcp.go, pkg/providers/gcp/bucket.go, pkg/providers/gcp/cloud-run.go, pkg/providers/gcp/function.go, pkg/providers/gcp/vms.go, pkg/providers/gcp/dns.go, pkg/providers/gcp/gke.go | Add org asset parsing (parseAssetToResource) and bulk metadata enrichment (enrichAssetsWithMetadata) for compute, storage, run, functions, DNS; add extendedMetadata flag and dual Cloud Functions v1/v2 support; per-service metadata collectors and plumbing. |
| AWS (many providers) | pkg/providers/aws/aws.go, pkg/providers/aws/alb.go, pkg/providers/aws/elb.go, pkg/providers/aws/cloudfront.go, pkg/providers/aws/ecs.go, pkg/providers/aws/eks.go, pkg/providers/aws/instances.go, pkg/providers/aws/lambda-api-gateway.go, pkg/providers/aws/lightsail.go, pkg/providers/aws/route53.go, pkg/providers/aws/s3.go | Add ExtendedMetadata option; API Gateway v2 client; numerous metadata collectors for ALB/ELB, CloudFront, ECS (including Fargate), EKS, EC2, Lambda/API Gateway, Lightsail, Route53, S3; add ARN/tag parsing and helper utilities; S3 client wrapping for role-aware clients. |
| Azure | pkg/providers/azure/azure.go, pkg/providers/azure/publicips.go, pkg/providers/azure/trafficmanager.go, pkg/providers/azure/vm.go | Propagate extendedMetadata; add metadata extraction for Public IPs, VMs, Traffic Manager profiles; tag helpers and metadata builders. |
| DigitalOcean | pkg/providers/digitalocean/digitalocean.go, pkg/providers/digitalocean/apps.go, pkg/providers/digitalocean/instances.go | Propagate extendedMetadata; add metadata collection for Apps and Droplets. |
| Cloudflare | pkg/providers/cloudflare/cloudflare.go, pkg/providers/cloudflare/dns.go | Propagate extendedMetadata; add DNS record and zone metadata extraction and attachment. |
| Kubernetes | pkg/providers/k8s/kubernetes.go, pkg/providers/k8s/ingress.go | Propagate extendedMetadata; constructor updated; collect and attach ingress metadata; minor logic corrections. |
| Other providers / small changes | pkg/providers/dnssimple/dns.go | Minor refactor: switch for record type handling (A/AAAA); behavior unchanged. |
| GCP docs | docs/GCP_ASSET_API.md | Document Extended Metadata feature, permissions, example config extended_metadata: true. |
| Go module | go.mod | Bump Go toolchain to 1.24.2 and upgrade many dependencies (Google APIs, x/*, OpenTelemetry, etc.). |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Runner
    participant ProviderFactory
    participant ResourceProvider
    participant Schema

    User->>CLI: run with --extended-metadata
    CLI->>Runner: parse Options{ExtendedMetadata:true}
    Runner->>ProviderFactory: create provider with metadata option (extended_metadata=true)
    ProviderFactory->>ResourceProvider: instantiate with extendedMetadata flag
    ResourceProvider->>Schema: collect resources (if enabled fetch extended metadata via service APIs)
    Schema->>ResourceProvider: return resources with Metadata maps

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • GCP Asset API for Org #688 — Overlapping work on GCP Asset API organization-level parsing and bulk metadata enrichment (parseAssetToResource, enrichAssetsWithMetadata).

Poem

🐰
I hopped through code and left a trail,
Of metadata tiny and metadata frail.
Flags turned true, APIs I combed,
From buckets to nodes, the details homed.
A carrot-coded map for every resource tale.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dbdbf4 and 893710a.

📒 Files selected for processing (1)
  • pkg/providers/aws/eks.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/providers/aws/eks.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Analyze (go)
  • GitHub Check: release-test
  • GitHub Check: Lint Test
  • GitHub Check: Test Builds (1.22.x, windows-latest)
  • GitHub Check: Test Builds (1.22.x, ubuntu-latest)
  • GitHub Check: Test Builds (1.22.x, macOS-latest)

@Ice3man543 Ice3man543 marked this pull request as draft July 8, 2025 06:52

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🔭 Outside diff range comments (1)
pkg/providers/cloudflare/cloudflare.go (1)

45-77: Fix missing extendedMetadata field when using API token authentication.

The extendedMetadata field is not set when a Provider is created with API token authentication (line 52), although it is set correctly for API key authentication (line 76). As a result, extended metadata is silently disabled for users who authenticate with API tokens.

Apply this diff to fix the issue:

-		return &Provider{id: id, client: api, services: services}, nil
+		// Parse extended metadata option
+		extendedMetadata := false
+		if extMetadata, ok := options.GetMetadata("extended_metadata"); ok {
+			extendedMetadata = extMetadata == "true"
+		}
+		
+		return &Provider{id: id, client: api, services: services, extendedMetadata: extendedMetadata}, nil

Alternatively, move the extended metadata parsing logic before the API token check to avoid duplication:

 	}
 
+	// Parse extended metadata option
+	extendedMetadata := false
+	if extMetadata, ok := options.GetMetadata("extended_metadata"); ok {
+		extendedMetadata = extMetadata == "true"
+	}
+
 	apiToken, ok := options.GetMetadata(apiToken)
 	if ok {
 		// Construct a new API object with scoped api token
 		api, err := cloudflare.NewWithAPIToken(apiToken)
 		if err != nil {
 			return nil, err
 		}
-		return &Provider{id: id, client: api, services: services}, nil
+		return &Provider{id: id, client: api, services: services, extendedMetadata: extendedMetadata}, nil
 	}
 
 	accessKey, ok := options.GetMetadata(apiAccessKey)
...
-	// Parse extended metadata option
-	extendedMetadata := false
-	if extMetadata, ok := options.GetMetadata("extended_metadata"); ok {
-		extendedMetadata = extMetadata == "true"
-	}
-
 	return &Provider{id: id, client: api, services: services, extendedMetadata: extendedMetadata}, nil
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9ee6544 and 7af2cd2.

📒 Files selected for processing (8)
  • internal/runner/options.go (2 hunks)
  • internal/runner/runner.go (1 hunks)
  • pkg/providers/cloudflare/cloudflare.go (3 hunks)
  • pkg/providers/cloudflare/dns.go (5 hunks)
  • pkg/providers/digitalocean/apps.go (3 hunks)
  • pkg/providers/digitalocean/digitalocean.go (4 hunks)
  • pkg/providers/digitalocean/instances.go (4 hunks)
  • pkg/schema/schema.go (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
pkg/providers/cloudflare/cloudflare.go (2)
pkg/schema/schema.go (2)
  • ServiceMap (255-255)
  • Provider (19-29)
pkg/providers/digitalocean/digitalocean.go (1)
  • Provider (14-19)
pkg/providers/cloudflare/dns.go (2)
pkg/schema/schema.go (2)
  • Resource (143-164)
  • Provider (19-29)
pkg/providers/cloudflare/cloudflare.go (1)
  • Provider (14-19)
pkg/schema/schema.go (1)
pkg/schema/validate/validate.go (4)
  • ResourceType (68-68)
  • DNSName (72-72)
  • PublicIPv4 (73-73)
  • PublicIPv6 (74-74)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test Builds (1.22.x, macOS-latest)
  • GitHub Check: release-test
  • GitHub Check: Test Builds (1.22.x, windows-latest)
  • GitHub Check: Lint Test
  • GitHub Check: Test Builds (1.22.x, ubuntu-latest)
  • GitHub Check: Analyze (go)
🔇 Additional comments (20)
internal/runner/options.go (1)

35-35: LGTM! Extended metadata option properly added.

The implementation follows the existing pattern for adding CLI options and integrates well with the current structure.

Also applies to: 86-86

internal/runner/runner.go (1)

69-72: LGTM! Extended metadata propagation implemented correctly.

The conditional block properly propagates the extended metadata setting to providers through the configuration item map.

pkg/providers/cloudflare/dns.go (2)

44-49: LGTM! Metadata extraction properly integrated into DNS resource creation.

The conditional metadata extraction is cleanly implemented and the metadata is correctly attached to both DNS name and IP resources.

Also applies to: 56-56, 68-68


83-148: Well-structured metadata extraction implementation.

The getDNSRecordMetadata method comprehensively extracts all relevant metadata from both DNS records and zones. Good defensive programming with proper nil checks for optional fields.

pkg/providers/digitalocean/digitalocean.go (2)

47-59: LGTM! Extended metadata option correctly parsed and set.

The implementation properly parses the extended metadata option and includes it in the Provider struct initialization.


85-89: Extended metadata properly propagated to sub-providers.

The extendedMetadata flag is correctly passed to both instanceProvider and appsProvider, ensuring consistent behavior across all DigitalOcean resources.

Also applies to: 98-102

pkg/providers/digitalocean/instances.go (5)

5-7: LGTM: Appropriate imports added.

The added imports (fmt, strings, time) are necessary for the metadata extraction functionality.


17-17: LGTM: Extended metadata field added correctly.

The extendedMetadata boolean field provides proper control over metadata collection.


40-44: LGTM: Conditional metadata extraction implemented properly.

The conditional logic ensures metadata is only collected when requested, which is efficient and follows the design pattern.


52-52: LGTM: Metadata consistently attached to resources.

Metadata is properly attached to both private and public IP resources, ensuring consistency.

Also applies to: 62-62


78-165: LGTM: Comprehensive metadata extraction with proper error handling.

The getDropletMetadata method provides excellent coverage of droplet attributes with:

  • Proper nil checks for nested objects
  • Robust time parsing with fallback
  • Appropriate type conversions
  • Good use of helper functions from the schema package
pkg/providers/digitalocean/apps.go (5)

5-6: LGTM: Appropriate imports added.

The added imports (fmt, time) are necessary for the metadata extraction functionality.


16-16: LGTM: Extended metadata field added correctly.

The extendedMetadata boolean field provides proper control over metadata collection, consistent with the instances provider.


37-41: LGTM: Conditional metadata extraction implemented properly.

The conditional logic ensures metadata is only collected when requested, following the same efficient pattern as the instances provider.


49-49: LGTM: Metadata properly attached to resource.

Metadata is correctly attached to the app resource.


65-102: LGTM: Comprehensive app metadata extraction with proper handling.

The getAppMetadata method provides excellent coverage of app attributes with:

  • Proper nil checks for nested objects (e.g., app.ActiveDeployment)
  • Robust time handling with zero checks
  • Appropriate type conversions
  • Good use of helper functions from the schema package
pkg/schema/schema.go (4)

67-67: LGTM: Method signature updated correctly.

The appendResourceWithTypeAndMeta method signature now properly accepts metadata parameter and assigns it to the resource.

Also applies to: 72-72


98-98: LGTM: All method calls updated consistently.

All calls to appendResourceWithTypeAndMeta have been updated to pass the metadata parameter, ensuring consistency across resource types.

Also applies to: 104-104, 110-110, 116-116, 122-122


162-163: LGTM: Metadata field properly added to Resource struct.

The Metadata field is correctly typed as map[string]string with appropriate JSON tag including omitempty.
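
To illustrate what the omitempty tag means for consumers of the JSON output, here is a minimal sketch. The Resource type below is a trimmed, hypothetical stand-in rather than the struct in pkg/schema/schema.go; the JSON keys are taken from the example object in the PR description, and the tags on the fields other than Metadata are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Resource is a trimmed, hypothetical stand-in for schema.Resource.
// Only the Metadata field's type and omitempty tag come from the review.
type Resource struct {
	Public     bool              `json:"public"`
	Provider   string            `json:"provider"`
	Service    string            `json:"service,omitempty"`
	PublicIPv4 string            `json:"public_ipv4,omitempty"`
	Metadata   map[string]string `json:"metadata,omitempty"`
}

// encode marshals a resource to a JSON string, returning "" on error.
func encode(r Resource) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Without extended metadata the "metadata" key is absent from the output.
	plain := Resource{Public: true, Provider: "digitalocean", Service: "instance", PublicIPv4: "127.0.0.1"}
	fmt.Println(encode(plain))

	// With extended metadata the map is emitted as a nested object.
	rich := plain
	rich.Metadata = map[string]string{"region_slug": "blr1", "status": "active"}
	fmt.Println(encode(rich))
}
```

Because a nil or empty map is dropped entirely, output produced without --extended-metadata should keep the same shape it had before the Metadata field existed, assuming the real struct uses the same tag.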


329-346: LGTM: Well-designed helper functions for metadata handling.

The helper functions provide a clean API for adding different types of metadata:

  • AddMetadata: Handles string pointers with nil/empty checks
  • AddMetadataList: Joins string slices with proper handling
  • AddMetadataInt: Converts integers with zero-value checks

The implementation is robust and handles edge cases appropriately.
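
The behavior described for the three helpers can be sketched as follows. This is an illustration, not the code from pkg/schema/schema.go: only the AddMetadata call shape (map, key, string pointer) appears elsewhere in this review, so the signatures of AddMetadataList and AddMetadataInt and the choice of "," as the separator are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// AddMetadata stores *value under key unless the pointer is nil or the
// string it points to is empty.
func AddMetadata(metadata map[string]string, key string, value *string) {
	if value != nil && *value != "" {
		metadata[key] = *value
	}
}

// AddMetadataList joins values with commas; nil and empty slices are skipped.
// (Assumed signature and separator.)
func AddMetadataList(metadata map[string]string, key string, values []string) {
	if len(values) > 0 {
		metadata[key] = strings.Join(values, ",")
	}
}

// AddMetadataInt stores value as a decimal string; zero is treated as unset.
// (Assumed signature.)
func AddMetadataInt(metadata map[string]string, key string, value int) {
	if value != 0 {
		metadata[key] = strconv.Itoa(value)
	}
}

func main() {
	name := "ubuntu-s-2vcpu-2gb-90gb-intel-blr1-01"
	metadata := make(map[string]string)

	AddMetadata(metadata, "name", &name)
	AddMetadata(metadata, "image_slug", nil) // skipped: nil pointer
	AddMetadataList(metadata, "features", []string{"droplet_agent", "private_networking"})
	AddMetadataInt(metadata, "vcpus", 2)
	AddMetadataInt(metadata, "disk", 0) // skipped: zero value

	fmt.Println(len(metadata), metadata["features"], metadata["vcpus"])
}
```

One consequence of the zero-value check is that a legitimate value of 0 cannot be distinguished from a missing one; that trade-off seems acceptable for descriptive metadata but is worth keeping in mind.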

@Ice3man543 Ice3man543 linked an issue Jul 8, 2025 that may be closed by this pull request
3 tasks
@Ice3man543 Ice3man543 marked this pull request as ready for review July 10, 2025 13:57
@Ice3man543 Ice3man543 self-assigned this Jul 10, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (12)
pkg/providers/gcp/vms.go (1)

31-31: Consider renaming the zone variable for clarity.

The variable zone actually contains the full zone path (e.g., "zones/us-central1-a"), not just the zone name. Consider renaming it to zonePath or zoneKey to avoid confusion.

-for zone, instancesScopedList := range ial.Items {
+for zonePath, instancesScopedList := range ial.Items {
pkg/providers/azure/vm.go (1)

136-149: Consider simplifying metadata copying for DNS resources.

The current implementation creates a new map and copies all entries. Since metadata maps are not modified after creation, you could potentially share the same map reference.

-if metadata != nil {
-	dnsResource.Metadata = make(map[string]string)
-	for k, v := range metadata {
-		dnsResource.Metadata[k] = v
-	}
-}
+dnsResource.Metadata = metadata

If you need to ensure immutability, consider extracting this to a helper function that could be reused across the codebase.
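
The trade-off between sharing and copying can be sketched with the standard library's maps.Clone (Go 1.21 and later). The resource type and the cloneMetadata helper name are hypothetical; the point is only to show what aliasing a map means if any caller later mutates it.

```go
package main

import (
	"fmt"
	"maps"
)

// resource is a hypothetical stand-in for schema.Resource.
type resource struct {
	DNSName  string
	Metadata map[string]string
}

// cloneMetadata returns an independent shallow copy of m.
// maps.Clone preserves nil, so no explicit nil check is needed.
func cloneMetadata(m map[string]string) map[string]string {
	return maps.Clone(m)
}

func main() {
	metadata := map[string]string{"vm_name": "web-01"}

	// Sharing the reference: a write through one resource is visible in the other.
	ipResource := resource{Metadata: metadata}
	dnsShared := resource{DNSName: "web.example.com", Metadata: metadata}
	dnsShared.Metadata["note"] = "shared"
	fmt.Println(ipResource.Metadata["note"]) // the IP resource sees the write

	// Cloning: writes stay local to the copy.
	dnsCopy := resource{DNSName: "web.example.com", Metadata: cloneMetadata(metadata)}
	dnsCopy.Metadata["only_dns"] = "yes"
	_, leaked := metadata["only_dns"]
	fmt.Println(leaked)
}
```

Sharing is cheaper and is safe as long as the maps really are read-only after construction; the one-line clone keeps the defensive behavior of the existing loop without the boilerplate.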

pkg/providers/azure/publicips.go (1)

61-75: Consider simplifying metadata copying for DNS resources.

Similar to the vm.go file, the metadata copying could be simplified.

-if metadata != nil {
-	dnsResource.Metadata = make(map[string]string)
-	for k, v := range metadata {
-		dnsResource.Metadata[k] = v
-	}
-}
+dnsResource.Metadata = metadata
pkg/providers/azure/trafficmanager.go (2)

73-74: Use consistent metadata assignment pattern.

For consistency with the rest of the codebase, consider using the schema.AddMetadata helper function for these fields:

-metadata["subscription_id"] = tmp.SubscriptionID
-metadata["owner_id"] = tmp.SubscriptionID
+schema.AddMetadata(metadata, "subscription_id", &tmp.SubscriptionID)
+schema.AddMetadata(metadata, "owner_id", &tmp.SubscriptionID)

78-83: Consider logging Azure resource ID parsing errors.

While continuing execution on parse errors is appropriate, consider logging the error for debugging purposes to help identify malformed resource IDs.

pkg/providers/k8s/ingress.go (1)

145-155: Consider simplifying the deduplication logic.

The current implementation works correctly, but the result slice can be preallocated and the set type simplified:

-// Deduplicate backend services
-uniqueBackends := make(map[string]struct{})
-for _, backend := range backendServices {
-    uniqueBackends[backend] = struct{}{}
-}
-var uniqueBackendsList []string
-for backend := range uniqueBackends {
-    uniqueBackendsList = append(uniqueBackendsList, backend)
-}
-metadata["backend_services"] = strings.Join(uniqueBackendsList, ",")
+// Deduplicate backend services
+uniqueBackends := make(map[string]bool)
+for _, backend := range backendServices {
+    uniqueBackends[backend] = true
+}
+uniqueBackendsList := make([]string, 0, len(uniqueBackends))
+for backend := range uniqueBackends {
+    uniqueBackendsList = append(uniqueBackendsList, backend)
+}
+metadata["backend_services"] = strings.Join(uniqueBackendsList, ",")
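
Both versions above build the list by ranging over a map, and Go does not specify map iteration order, so the joined backend_services string can differ between runs. A possible variant that produces stable output is sketched below; the joinUnique name and the "service:port" sample values are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// joinUnique removes duplicates from values and joins the remainder in
// sorted order, so the result is the same on every run.
func joinUnique(values []string) string {
	seen := make(map[string]struct{}, len(values))
	unique := make([]string, 0, len(values))
	for _, v := range values {
		if _, ok := seen[v]; ok {
			continue // already recorded
		}
		seen[v] = struct{}{}
		unique = append(unique, v)
	}
	sort.Strings(unique)
	return strings.Join(unique, ",")
}

func main() {
	backendServices := []string{"web:80", "api:8080", "web:80"}
	fmt.Println(joinUnique(backendServices))
}
```

Stable ordering matters mainly when the output is diffed or hashed between scans; if that is not a concern, either of the reviewed versions is sufficient.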
pkg/providers/aws/alb.go (1)

189-286: Consider extracting owner ID parsing into a helper function.

The owner ID extraction from ARN logic (lines 206-211) is duplicated in getTargetInstanceMetadata (lines 318-323). Consider creating a helper function to avoid duplication:

+func extractOwnerIDFromARN(arn string) string {
+    arnParts := strings.Split(arn, ":")
+    if len(arnParts) >= 5 && arnParts[4] != "" {
+        return arnParts[4]
+    }
+    return ""
+}

 // In getLoadBalancerMetadata:
-// Extract owner ID from ARN (format: arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/name/id)
-arnParts := strings.Split(arn, ":")
-if len(arnParts) >= 5 && arnParts[4] != "" {
-    metadata["owner_id"] = arnParts[4]
-}
+// Extract owner ID from ARN (format: arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/name/id)
+if ownerID := extractOwnerIDFromARN(arn); ownerID != "" {
+    metadata["owner_id"] = ownerID
+}
pkg/providers/gcp/gcp.go (1)

65-67: Consider more robust boolean parsing.

The current implementation only checks for the exact string "true". Consider supporting more boolean representations:

-if extendedMetadata, ok := options.GetMetadata("extended_metadata"); ok {
-    provider.extendedMetadata = extendedMetadata == "true"
-}
+if extendedMetadata, ok := options.GetMetadata("extended_metadata"); ok {
+    provider.extendedMetadata = strings.EqualFold(extendedMetadata, "true") || 
+                                extendedMetadata == "1" || 
+                                strings.EqualFold(extendedMetadata, "yes")
+}
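
An alternative to listing accepted spellings by hand is to lean on strconv.ParseBool, which accepts 1, t, T, TRUE, true, True and the corresponding false forms. The sketch below is one way to do that; the parseExtendedMetadata name is hypothetical, and treating "yes" as true is carried over from the suggestion above rather than from ParseBool.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseExtendedMetadata interprets an option value as a boolean flag.
// Unrecognized or empty values are treated as false.
func parseExtendedMetadata(value string) bool {
	v := strings.TrimSpace(value)
	if strings.EqualFold(v, "yes") {
		return true // not understood by strconv.ParseBool
	}
	enabled, err := strconv.ParseBool(v)
	return err == nil && enabled
}

func main() {
	for _, v := range []string{"true", "True", "1", "yes", "false", "", "enabled"} {
		fmt.Printf("%q -> %v\n", v, parseExtendedMetadata(v))
	}
}
```

Note that ParseBool is stricter than strings.EqualFold about mixed case: a value such as "tRuE" is rejected and would be treated as false here.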
pkg/providers/aws/ecs.go (2)

254-258: Consider using a more robust ARN parsing approach.

The current implementation uses a simple string split by "/" which may not handle all ARN formats correctly. ARNs have a well-defined structure that could be parsed more reliably.

Consider using the AWS SDK's ARN parsing utilities or a more robust parsing approach:

-taskID := aws.StringValue(task.TaskArn)
-if task.TaskArn != nil {
-    if parts := splitARN(*task.TaskArn); len(parts) > 0 {
-        taskID = parts[len(parts)-1]
-    }
-}
+taskID := aws.StringValue(task.TaskArn)
+if task.TaskArn != nil {
+    // Extract task ID from ARN format: arn:aws:ecs:region:account-id:task/cluster-name/task-id
+    arnParts := strings.Split(*task.TaskArn, ":")
+    if len(arnParts) >= 6 {
+        resourceParts := strings.Split(arnParts[5], "/")
+        if len(resourceParts) >= 3 {
+            taskID = resourceParts[2]
+        }
+    }
+}
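
One caveat with indexing resourceParts[2] is that it assumes the task/cluster-name/task-id layout; ECS task ARNs in the older format omit the cluster name (task/task-id), and for those the suggested code would leave the full ARN in place. A standard-library-only sketch that handles both layouts by taking the last path segment is shown below. The taskIDFromARN name is hypothetical, and the sample ARNs are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// taskIDFromARN returns the last path segment of an ARN's resource part.
// If the input does not look like an ARN, it is returned unchanged.
func taskIDFromARN(taskARN string) string {
	// arn:partition:service:region:account-id:resource -> six fields.
	// SplitN keeps any further colons inside the resource field.
	parts := strings.SplitN(taskARN, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" {
		return taskARN
	}
	resource := parts[5]
	if i := strings.LastIndex(resource, "/"); i >= 0 && i < len(resource)-1 {
		return resource[i+1:]
	}
	return taskARN
}

func main() {
	fmt.Println(taskIDFromARN("arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef"))
	fmt.Println(taskIDFromARN("arn:aws:ecs:us-east-1:123456789012:task/0123456789abcdef"))
}
```

If taking on the dependency is acceptable, the AWS SDK's arn package performs the field split and validation; the path-segment handling for the resource part would still be needed on top of it.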

443-463: Consolidate duplicate tag building functions.

The buildECSTagString and buildEC2TagString functions have identical implementations. This violates the DRY principle.

Create a single generic function to handle both tag types:

-func buildECSTagString(tags []*ecs.Tag) string {
-    var tagPairs []string
-    for _, tag := range tags {
-        if tag.Key != nil && tag.Value != nil {
-            tagPairs = append(tagPairs, fmt.Sprintf("%s=%s",
-                aws.StringValue(tag.Key), aws.StringValue(tag.Value)))
-        }
-    }
-    return strings.Join(tagPairs, ",")
-}
-
-func buildEC2TagString(tags []*ec2.Tag) string {
-    var tagPairs []string
-    for _, tag := range tags {
-        if tag.Key != nil && tag.Value != nil {
-            tagPairs = append(tagPairs, fmt.Sprintf("%s=%s",
-                aws.StringValue(tag.Key), aws.StringValue(tag.Value)))
-        }
-    }
-    return strings.Join(tagPairs, ",")
-}
+// buildTagString creates a comma-separated string of key=value pairs from tags
+func buildTagString(tags interface{}) string {
+    var tagPairs []string
+    
+    switch t := tags.(type) {
+    case []*ecs.Tag:
+        for _, tag := range t {
+            if tag.Key != nil && tag.Value != nil {
+                tagPairs = append(tagPairs, fmt.Sprintf("%s=%s",
+                    aws.StringValue(tag.Key), aws.StringValue(tag.Value)))
+            }
+        }
+    case []*ec2.Tag:
+        for _, tag := range t {
+            if tag.Key != nil && tag.Value != nil {
+                tagPairs = append(tagPairs, fmt.Sprintf("%s=%s",
+                    aws.StringValue(tag.Key), aws.StringValue(tag.Value)))
+            }
+        }
+    }
+    
+    return strings.Join(tagPairs, ",")
+}

Then update the callers:

-if tagString := buildEC2TagString(instance.Tags); tagString != "" {
+if tagString := buildTagString(instance.Tags); tagString != "" {
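
The type-switch version still repeats the loop body once per tag type. Since the module targets a Go version with generics, another option is a single generic function that takes an accessor for the key and value. The sketch below uses local ecsTag and ec2Tag types as hypothetical stand-ins for the SDK's ecs.Tag and ec2.Tag so that it runs without the AWS SDK; the accessor-based signature is an assumption, not something proposed in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// ecsTag and ec2Tag are hypothetical stand-ins for ecs.Tag and ec2.Tag,
// which both expose Key and Value as *string.
type ecsTag struct{ Key, Value *string }
type ec2Tag struct{ Key, Value *string }

// buildTagString formats tags as comma-separated key=value pairs.
// kv extracts the key and value pointers from one tag; tags with a nil
// key or nil value are skipped.
func buildTagString[T any](tags []T, kv func(T) (key, value *string)) string {
	pairs := make([]string, 0, len(tags))
	for _, tag := range tags {
		k, v := kv(tag)
		if k != nil && v != nil {
			pairs = append(pairs, *k+"="+*v)
		}
	}
	return strings.Join(pairs, ",")
}

// ptr returns a pointer to s, mirroring how the SDK represents strings.
func ptr(s string) *string { return &s }

func main() {
	ecsTags := []*ecsTag{{Key: ptr("env"), Value: ptr("prod")}, {Key: ptr("orphan")}}
	ec2Tags := []*ec2Tag{{Key: ptr("team"), Value: ptr("infra")}}

	fmt.Println(buildTagString(ecsTags, func(t *ecsTag) (*string, *string) { return t.Key, t.Value }))
	fmt.Println(buildTagString(ec2Tags, func(t *ec2Tag) (*string, *string) { return t.Key, t.Value }))
}
```

The accessor closes the gap left by Go generics not allowing field access on a type parameter, and adding a third tag type needs only a new one-line accessor. The accessors shown here would panic on a nil element in the slice, so a nil check belongs there if the SDK can return one.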
pkg/providers/aws/lambda-api-gateway.go (2)

283-289: Consider a more robust approach for slice alignment.

The current approach of padding slices with nil values works but could be fragile if the initialization logic changes.

Consider using a struct to group related clients:

type clientSet struct {
    apiGateway   *apigateway.APIGateway
    apiGatewayV2 *apigatewayv2.ApiGatewayV2
    lambda       *lambda.Lambda
}

This would eliminate the need for slice alignment and make the code more maintainable.


436-444: Consider consolidating tag string building functions across the codebase.

This function is similar to the tag building functions in ecs.go. Consider creating a shared utility function to avoid duplication.

Consider creating a shared package for common AWS utilities like tag string building, ARN parsing, etc. This would improve code reuse and maintainability across all AWS providers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7af2cd2 and ed37f28.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (24)
  • go.mod (10 hunks)
  • pkg/providers/aws/alb.go (4 hunks)
  • pkg/providers/aws/aws.go (8 hunks)
  • pkg/providers/aws/cloudfront.go (4 hunks)
  • pkg/providers/aws/ecs.go (3 hunks)
  • pkg/providers/aws/eks.go (5 hunks)
  • pkg/providers/aws/elb.go (4 hunks)
  • pkg/providers/aws/instances.go (5 hunks)
  • pkg/providers/aws/lambda-api-gateway.go (8 hunks)
  • pkg/providers/aws/lightsail.go (4 hunks)
  • pkg/providers/aws/route53.go (4 hunks)
  • pkg/providers/aws/s3.go (3 hunks)
  • pkg/providers/azure/azure.go (5 hunks)
  • pkg/providers/azure/publicips.go (4 hunks)
  • pkg/providers/azure/trafficmanager.go (3 hunks)
  • pkg/providers/azure/vm.go (7 hunks)
  • pkg/providers/gcp/bucket.go (3 hunks)
  • pkg/providers/gcp/cloud-run.go (0 hunks)
  • pkg/providers/gcp/function.go (4 hunks)
  • pkg/providers/gcp/gcp.go (8 hunks)
  • pkg/providers/gcp/gke.go (2 hunks)
  • pkg/providers/gcp/vms.go (4 hunks)
  • pkg/providers/k8s/ingress.go (2 hunks)
  • pkg/providers/k8s/kubernetes.go (3 hunks)
💤 Files with no reviewable changes (1)
  • pkg/providers/gcp/cloud-run.go
🧰 Additional context used
🧬 Code Graph Analysis (10)
pkg/providers/gcp/gke.go (1)
pkg/providers/k8s/ingress.go (1)
  • NewK8sIngressProvider (20-26)
pkg/providers/k8s/kubernetes.go (2)
pkg/schema/schema.go (2)
  • ServiceMap (255-255)
  • Provider (19-29)
pkg/providers/k8s/ingress.go (1)
  • K8sIngressProvider (14-18)
pkg/providers/aws/instances.go (2)
pkg/schema/schema.go (3)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
pkg/providers/aws/aws.go (1)
  • Provider (105-121)
pkg/providers/azure/azure.go (1)
pkg/schema/schema.go (1)
  • ServiceMap (255-255)
pkg/providers/aws/cloudfront.go (2)
pkg/schema/schema.go (4)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
  • AddMetadataInt (342-346)
pkg/providers/aws/aws.go (1)
  • Provider (105-121)
pkg/providers/aws/lightsail.go (2)
pkg/schema/schema.go (4)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
  • AddMetadataInt (342-346)
pkg/providers/aws/aws.go (1)
  • Provider (105-121)
pkg/providers/aws/ecs.go (2)
pkg/schema/schema.go (4)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
  • AddMetadataInt (342-346)
pkg/providers/aws/aws.go (1)
  • Provider (105-121)
pkg/providers/azure/publicips.go (2)
pkg/schema/schema.go (3)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
pkg/providers/azure/azure.go (1)
  • Provider (32-38)
pkg/providers/aws/alb.go (2)
pkg/schema/schema.go (5)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
  • AddMetadataList (336-340)
  • AddMetadataInt (342-346)
pkg/providers/aws/aws.go (1)
  • Provider (105-121)
pkg/providers/aws/lambda-api-gateway.go (2)
pkg/providers/aws/aws.go (3)
  • ProviderOptions (31-43)
  • Provider (105-121)
  • New (124-221)
pkg/schema/schema.go (3)
  • Resource (143-164)
  • Provider (19-29)
  • AddMetadata (330-334)
🔇 Additional comments (45)
pkg/providers/gcp/gke.go (2)

21-24: LGTM: Clean addition of extended metadata support.

The extendedMetadata field addition to the gkeProvider struct follows Go naming conventions and aligns with the broader pattern of adding extended metadata support across providers.


58-58: LGTM: Proper propagation of extended metadata flag.

The extendedMetadata flag is correctly passed to the K8sIngressProvider constructor, ensuring consistent metadata handling throughout the provider stack.

pkg/providers/k8s/kubernetes.go (3)

21-24: LGTM: Clean addition of extended metadata support.

The extendedMetadata field addition to the Provider struct is well-structured and follows Go naming conventions.


84-88: LGTM: Proper option parsing and initialization.

The extended metadata option parsing correctly handles string-to-boolean conversion and provides a clean initialization pattern. The variable naming providerExtendedMetadata is descriptive and the boolean conversion logic is straightforward.


124-124: LGTM: Consistent propagation of extended metadata flag.

The extendedMetadata flag is properly passed to the K8sIngressProvider constructor, ensuring consistent metadata handling across the Kubernetes provider stack.

pkg/providers/azure/azure.go (3)

33-37: LGTM: Clean addition of extended metadata support.

The extendedMetadata field addition to the Provider struct is well-structured and follows the consistent pattern established across other providers.


99-101: LGTM: Proper option parsing and initialization.

The extended metadata option parsing correctly handles string-to-boolean conversion with appropriate error handling and follows the established pattern.


165-165: LGTM: Consistent propagation across all Azure resource providers.

The extendedMetadata flag is properly propagated to all internal Azure providers (vmProvider, publicIPProvider, and trafficManagerProvider), ensuring consistent metadata handling across all Azure resource types.

Also applies to: 175-175, 185-185

go.mod (3)

3-5: LGTM: Go version and toolchain updates.

The Go version update to 1.23.0 with toolchain 1.24.2 is appropriate for leveraging newer language features and performance improvements.


227-231: LGTM: OpenTelemetry instrumentation additions.

The addition of OpenTelemetry packages (go.opentelemetry.io/*) provides valuable observability capabilities and aligns with modern cloud-native practices.


36-37: Dependency upgrade compatibility verified

Scanned all usages of golang.org/x/oauth2 and google.golang.org/api across the codebase—no deprecated or breaking APIs detected. Existing patterns like oauth2.StaticTokenSource, oauth2.Transport, and all Google service client imports remain compatible. Proceed with these dependency updates.

pkg/providers/aws/cloudfront.go (4)

6-8: LGTM: Required import additions.

The strings and time package imports are necessary for the metadata extraction functionality and string manipulation operations.


62-75: LGTM: Clean conditional metadata extraction.

The conditional metadata extraction logic is well-structured, with proper null checking and clean integration into the existing resource creation flow.


86-155: LGTM: Comprehensive metadata extraction implementation.

The getDistributionMetadata method provides thorough metadata collection including:

  • Distribution identifiers and status
  • ARN parsing for owner ID extraction
  • Timestamp formatting with RFC3339 standard
  • Boolean value conversions
  • Array handling for aliases and origins
  • Tag fetching with proper error handling

The implementation follows AWS SDK best practices with proper null checks and string value extraction.
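The ARN parsing mentioned in the list can be illustrated with a short sketch. It relies only on the general ARN layout (`arn:partition:service:region:account-id:resource`); the function name `ownerIDFromARN` and the sample ARN are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// ownerIDFromARN returns the account-id field of an ARN, or "" when the
// input does not have the expected six colon-separated fields.
func ownerIDFromARN(arn string) string {
	// SplitN keeps any colons inside the resource part in the last field.
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) < 6 || parts[0] != "arn" {
		return ""
	}
	return parts[4]
}

func main() {
	// Illustrative ARN; the region field is empty for global services.
	fmt.Println(ownerIDFromARN("arn:aws:cloudfront::123456789012:distribution/EXAMPLE"))
}
```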


182-191: LGTM: Well-structured tag formatting helper.

The buildCloudFrontTagString helper function properly handles tag formatting with null checks and creates a clean comma-separated key=value format.
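A sketch of the comma-separated key=value formatting with nil checks. The `tag` type below is a stand-in for the SDK tag type, and `buildTagString` is a hypothetical name; the real helper's body is not shown in this review.

```go
package main

import (
	"fmt"
	"strings"
)

// tag is a stand-in for an SDK tag type whose key and value are optional.
type tag struct {
	Key   *string
	Value *string
}

// buildTagString joins tags as "key=value" pairs separated by commas,
// skipping nil entries and entries with a nil key or value.
func buildTagString(tags []*tag) string {
	pairs := make([]string, 0, len(tags))
	for _, t := range tags {
		if t == nil || t.Key == nil || t.Value == nil {
			continue
		}
		pairs = append(pairs, *t.Key+"="+*t.Value)
	}
	return strings.Join(pairs, ",")
}

func main() {
	k, v := "env", "prod"
	fmt.Println(buildTagString([]*tag{{Key: &k, Value: &v}}))
}
```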

pkg/providers/aws/instances.go (3)

72-77: LGTM! Clean metadata extraction implementation.

The conditional metadata extraction follows a clear pattern and properly integrates with the existing resource creation logic.

Also applies to: 85-85, 95-95


107-171: Well-structured metadata extraction with comprehensive coverage.

The method properly handles nil checks and provides extensive metadata coverage including instance details, networking, security, and monitoring information.


205-214: Clean tag formatting implementation.

The helper properly handles nil values and formats tags consistently.

pkg/providers/aws/eks.go (3)

78-82: Consistent metadata extraction pattern across cluster, node, and pod resources.

The implementation properly handles metadata extraction at different resource levels with appropriate inheritance of cluster metadata to pod resources.

Also applies to: 92-96, 119-126


112-112: Verify the impact of changing resource ID from node name to provider ID.

The resource ID assignment has been changed from using the node name to using the provider options ID. This could be a breaking change for users who rely on the ID field for resource identification.

Please confirm this change is intentional and document any migration requirements for existing users.

Also applies to: 129-129


208-208: No missing function – buildAwsMapTagString is defined in the same package

The function buildAwsMapTagString is implemented in pkg/providers/aws/lambda-api-gateway.go and available to eks.go. The code will compile as-is.

Likely an incorrect or invalid review comment.

pkg/providers/aws/lightsail.go (2)

97-198: Comprehensive Lightsail metadata extraction with excellent error handling.

The implementation provides extensive metadata coverage including hardware specifications, networking details, and proper ARN parsing to extract owner ID. The nil checks are thorough throughout.


165-182: Excellent port range formatting logic.

The implementation elegantly handles both single ports and port ranges, providing clear and consistent formatting.
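One way to format single ports and port ranges is sketched below. The output shape ("80" for a single port, "8000-8100" for a range) and the name `formatPortRange` are assumptions; the review does not quote the actual format.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatPortRange renders a single port as "80" and a range as "8000-8100".
func formatPortRange(from, to int) string {
	if from == to {
		return strconv.Itoa(from)
	}
	return strconv.Itoa(from) + "-" + strconv.Itoa(to)
}

func main() {
	fmt.Println(formatPortRange(80, 80))
	fmt.Println(formatPortRange(8000, 8100))
}
```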

pkg/providers/gcp/bucket.go (1)

80-168: Comprehensive bucket metadata extraction with thorough coverage.

The implementation provides extensive metadata including lifecycle rules, encryption settings, CORS configuration, and access controls. The label formatting and conditional checks are well-implemented.

pkg/providers/azure/vm.go (1)

94-112: Good fixes for error handling and resource group parsing.

The changes improve the code quality by:

  1. Adding proper warning logs instead of silent failures
  2. Correctly using the resource group from the parsed resource ID instead of the iteration variable

These fixes ensure proper resource association and better debugging capabilities.

pkg/providers/aws/aws.go (1)

12-12: Well-integrated API Gateway V2 support.

The implementation properly adds API Gateway V2 support by:

  • Including the necessary SDK import
  • Adding "apigatewayv2" to the supported services list
  • Initializing both API Gateway clients when the service is enabled
  • Updating the provider initialization logic to handle both clients
  • Adding proper verification for the new client

The changes maintain backward compatibility while extending functionality.

Also applies to: 29-29, 114-114, 206-206, 294-296, 373-378

pkg/providers/aws/route53.go (1)

79-84: Efficient metadata implementation for Route53 records.

Good design choices:

  1. Zone metadata is computed once and reused for all records in that zone
  2. Removal of record type filtering allows comprehensive DNS record enumeration
  3. Proper metadata inheritance from zone to individual records

The implementation efficiently avoids redundant API calls.
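The compute-once, reuse-per-record idea might look like the sketch below. The key names (`zone_id`, `record_name`, `record_type`) and the function `recordMetadata` are hypothetical. Copying the zone map for each record is an assumption about how inheritance could be done without records sharing one mutable map.

```go
package main

import "fmt"

// recordMetadata copies the zone-level metadata (computed once per zone)
// and adds record-specific keys, leaving the zone map untouched.
func recordMetadata(zoneMeta map[string]string, recordName, recordType string) map[string]string {
	meta := make(map[string]string, len(zoneMeta)+2)
	for k, v := range zoneMeta {
		meta[k] = v
	}
	meta["record_name"] = recordName
	meta["record_type"] = recordType
	return meta
}

func main() {
	zoneMeta := map[string]string{"zone_id": "Z123EXAMPLE"} // fetched once per zone
	for _, name := range []string{"a.example.com.", "b.example.com."} {
		fmt.Println(recordMetadata(zoneMeta, name, "A"))
	}
}
```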

Also applies to: 99-122

pkg/providers/azure/trafficmanager.go (3)

14-20: LGTM!

The struct fields are well-organized and the addition of the extendedMetadata field aligns with the PR's objective to support extended metadata collection across providers.


36-49: LGTM!

The conditional metadata extraction is properly implemented, following the pattern established across other providers in this PR.


196-196: No action needed: buildAzureTagString is defined in pkg/providers/azure/vm.go within the same azure package.

The function is available and does not require an import or additional definition.

Likely an incorrect or invalid review comment.

pkg/providers/k8s/ingress.go (2)

14-26: LGTM!

The struct and constructor changes properly implement the extended metadata support pattern.


36-71: LGTM!

The GetResource method properly implements conditional metadata extraction and correctly validates IP and hostname fields before creating resources.

pkg/providers/aws/alb.go (2)

67-81: LGTM!

The conditional metadata extraction is properly implemented for both load balancers and target instances.

Also applies to: 115-128


335-344: LGTM!

The tag string building function is well-implemented and handles nil checks appropriately.

pkg/providers/gcp/gcp.go (3)

10-11: LGTM!

The dual-version Cloud Functions support is well-implemented with proper import aliasing. The struct additions follow the established pattern for extended metadata support.

Also applies to: 20-33


175-204: LGTM!

The extended metadata flag is properly propagated to all resource providers, and the Cloud Functions provider correctly receives both v1 and v2 clients.


249-253: LGTM!

The verification logic properly includes the v1 Cloud Functions client.

pkg/providers/aws/s3.go (3)

30-77: LGTM!

The refactoring to use wrappedS3Client properly supports multi-account access, and the conditional metadata extraction follows the established pattern.


79-145: Well-implemented metadata extraction!

The method comprehensively extracts bucket metadata with proper error handling and edge case management. The public access determination logic correctly checks all four public access block settings.
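A sketch of a check over the four S3 public access block settings. The struct is a stand-in for the SDK configuration type, and treating a bucket as fully blocked only when all four flags are set is an assumption about the intended logic, not a quote from the PR.

```go
package main

import "fmt"

// publicAccessBlock is a stand-in for the SDK's public access block
// configuration, where each setting is an optional bool.
type publicAccessBlock struct {
	BlockPublicAcls       *bool
	IgnorePublicAcls      *bool
	BlockPublicPolicy     *bool
	RestrictPublicBuckets *bool
}

// fullyBlocked reports whether all four settings are present and true.
// A missing configuration or any unset flag counts as not fully blocked.
func fullyBlocked(cfg *publicAccessBlock) bool {
	if cfg == nil {
		return false
	}
	flags := []*bool{cfg.BlockPublicAcls, cfg.IgnorePublicAcls, cfg.BlockPublicPolicy, cfg.RestrictPublicBuckets}
	for _, flag := range flags {
		if flag == nil || !*flag {
			return false
		}
	}
	return true
}

func main() {
	yes := true
	cfg := &publicAccessBlock{&yes, &yes, &yes, &yes}
	fmt.Println(fullyBlocked(cfg), fullyBlocked(nil))
}
```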


171-203: LGTM!

The getS3Client method properly handles region-specific and role-based client creation, and buildS3TagString correctly formats the tags.

pkg/providers/aws/ecs.go (1)

104-171: LGTM! Well-structured handling of Fargate vs EC2 tasks.

The differentiation between Fargate and EC2 tasks based on ContainerInstanceArn is correct, and the error handling ensures resilience by continuing to process other tasks even if individual operations fail.
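The distinction can be sketched as follows. The `task` type is a stand-in for the SDK task type, and `isFargateTask` is a hypothetical name; the only fact taken from the review is that a task without a container instance ARN is handled as Fargate.

```go
package main

import "fmt"

// task is a stand-in for an ECS task; only the field relevant here is kept.
type task struct {
	ContainerInstanceArn *string
}

// isFargateTask treats a task with no container instance ARN as Fargate,
// because such a task is not placed on a registered EC2 container instance.
func isFargateTask(t *task) bool {
	if t == nil {
		return false
	}
	return t.ContainerInstanceArn == nil || *t.ContainerInstanceArn == ""
}

func main() {
	arn := "arn:aws:ecs:us-east-1:123456789012:container-instance/example"
	fmt.Println(isFargateTask(&task{}), isFargateTask(&task{ContainerInstanceArn: &arn}))
}
```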

pkg/providers/gcp/function.go (2)

26-82: LGTM! Clear separation of v1 and v2 function handling.

The implementation correctly handles both Cloud Functions API versions with appropriate URL extraction and validation for each version.


124-143: Incorrect suggestion: the prefix check is intentional

The strings.HasPrefix(functionName, "projects/") guard ensures that functionName is already the full resource name required by the Cloud Functions V1 API (projects/{project}/locations/{location}/functions/{function}) before calling GetIamPolicy. If a shorter or different name format needs to be supported, a separate normalization step would be required to construct the full resource name. As written, this validation is correct and no changes are needed.

Likely an incorrect or invalid review comment.

pkg/providers/aws/lambda-api-gateway.go (2)

32-70: LGTM! Well-structured resource collection.

The refactored approach cleanly separates API Gateway v1, v1 with Lambda integrations, and v2 resources, with appropriate error handling that ensures partial failures don't prevent other resources from being collected.


134-137: Good improvement in error handling.

Changing from returning an error to breaking allows the function to continue processing other APIs even if resources for one API cannot be retrieved. This improves resilience.

@Ice3man543 Ice3man543 requested a review from Copilot July 14, 2025 09:53

Copilot AI left a comment

Pull Request Overview

This PR adds an “extended metadata” flag across all providers and enriches each resource with additional provider-specific metadata when the flag is enabled.

  • Added Metadata field to the core Resource struct and helper functions (AddMetadata, AddMetadataList, AddMetadataInt) in pkg/schema/schema.go.
  • Introduced extendedMetadata option parsing in the CLI and provider constructors and wired it through each provider’s GetResource methods.
  • Implemented getXMetadata helper functions in each provider to populate Metadata maps.
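A rough sketch of what such helpers could look like. The signatures follow the calls quoted elsewhere in this thread (`AddMetadata` with a `*string`, `AddMetadataInt` with an `int`); the bodies are assumptions, and this version uses strconv.Itoa where the reviewed code reportedly uses fmt.Sprintf.

```go
package main

import (
	"fmt"
	"strconv"
)

// AddMetadata stores *value under key, skipping nil or empty values so
// callers can pass optional SDK fields directly. (Assumed behavior.)
func AddMetadata(metadata map[string]string, key string, value *string) {
	if value != nil && *value != "" {
		metadata[key] = *value
	}
}

// AddMetadataInt stores value as a decimal string. (Assumed behavior.)
func AddMetadataInt(metadata map[string]string, key string, value int) {
	metadata[key] = strconv.Itoa(value)
}

func main() {
	metadata := make(map[string]string)
	status := "active"
	AddMetadata(metadata, "status", &status)
	AddMetadata(metadata, "image_slug", nil) // skipped
	AddMetadataInt(metadata, "vcpus", 2)
	fmt.Println(metadata)
}
```

Keeping every value as a string matches the example objects in the PR description, where numeric fields such as "vcpus" appear quoted.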

Reviewed Changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated no comments.

Show a summary per file
File/Directory Description
pkg/schema/schema.go Added Metadata field, helper functions, and core wiring
pkg/providers/k8s Propagate extendedMetadata flag and ingress metadata
pkg/providers/gcp VM, GKE, Cloud Functions, Cloud Run, Storage metadata
pkg/providers/digitalocean Droplet and App metadata support
pkg/providers/cloudflare DNS record metadata extraction
pkg/providers/azure VM, Traffic Manager, Public IP metadata
pkg/providers/aws S3, Route53, Lightsail, ELB, Lambda/API Gateway, EC2, ECS, EKS, CloudFront metadata
internal/runner Added CLI --extended-metadata option and pass-through
Comments suppressed due to low confidence (4)

pkg/providers/aws/lambda-api-gateway.go:22

  • [nitpick] The type name lambdaAndapiGatewayProvider has an inconsistent casing for 'API'. Consider renaming it to lambdaAndApiGatewayProvider for clarity.
type lambdaAndapiGatewayProvider struct {

pkg/providers/digitalocean/digitalocean.go:85

  • [nitpick] The local variable instanceprovider mixes casing. Consider renaming it to instanceProvider or instProvider to follow camelCase conventions.
		instanceprovider := &instanceProvider{

pkg/schema/schema.go:329

  • [nitpick] Public helper functions (AddMetadata, AddMetadataList, AddMetadataInt) lack Godoc comments. Consider adding brief documentation above each to explain expected behavior.
// Helper functions for metadata handling

pkg/schema/schema.go:342

  • The function AddMetadataInt uses fmt.Sprintf but fmt is not imported in this file. Please add import "fmt" to avoid a compilation error.
func AddMetadataInt(metadata map[string]string, key string, value int) {

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/providers/gcp/vms.go (1)

170-186: Security concern properly addressed with sensitive key filtering.

The implementation correctly filters out sensitive metadata keys as previously suggested, preventing potential exposure of SSH keys, scripts, and other sensitive data.
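A sketch of key-based filtering for instance metadata. The deny list below (`ssh-keys`, `startup-script`, `user-data`) is an assumption chosen for illustration; the review does not list which keys the PR actually filters.

```go
package main

import (
	"fmt"
	"strings"
)

// sensitiveKeys is an illustrative deny list, not the list used in the PR.
var sensitiveKeys = map[string]struct{}{
	"ssh-keys":       {},
	"startup-script": {},
	"user-data":      {},
}

// filterSensitive returns a copy of items without deny-listed keys.
// Keys are compared case-insensitively.
func filterSensitive(items map[string]string) map[string]string {
	safe := make(map[string]string, len(items))
	for k, v := range items {
		if _, blocked := sensitiveKeys[strings.ToLower(k)]; blocked {
			continue
		}
		safe[k] = v
	}
	return safe
}

func main() {
	in := map[string]string{"ssh-keys": "user:ssh-rsa ...", "team": "platform"}
	fmt.Println(filterSensitive(in))
}
```

A deny list only covers known key names; an allow list would be the stricter alternative if the set of useful keys is small.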

🧹 Nitpick comments (3)
pkg/providers/k8s/ingress.go (2)

13-14: Fix comment to match the struct name.

The comment references k8sServiceProvider but the struct is named K8sIngressProvider.

-// k8sServiceProvider is a provider for k8s ingress resources
+// K8sIngressProvider is a provider for k8s ingress resources

145-155: Consider preserving backend service order during deduplication.

The current deduplication approach using a map doesn't preserve the original order of backend services, which might be useful for debugging or understanding service routing priority.

You could maintain order by checking for duplicates before appending:

 if len(backendServices) > 0 {
-    // Deduplicate backend services
-    uniqueBackends := make(map[string]struct{})
-    for _, backend := range backendServices {
-        uniqueBackends[backend] = struct{}{}
-    }
-    var uniqueBackendsList []string
-    for backend := range uniqueBackends {
-        uniqueBackendsList = append(uniqueBackendsList, backend)
-    }
-    metadata["backend_services"] = strings.Join(uniqueBackendsList, ",")
+    // Deduplicate backend services while preserving order
+    seen := make(map[string]bool)
+    var uniqueBackends []string
+    for _, backend := range backendServices {
+        if !seen[backend] {
+            seen[backend] = true
+            uniqueBackends = append(uniqueBackends, backend)
+        }
+    }
+    metadata["backend_services"] = strings.Join(uniqueBackends, ",")
 }
pkg/providers/aws/ecs.go (1)

127-130: Consider improving error handling for EC2 instance description.

The current error handling continues on EC2 instance description failures, which might silently skip valid ECS resources. Consider logging the error or providing more specific error handling.

 describeInstancesOutput, err := ec2Client.DescribeInstances(describeInstancesInput)
 if err != nil {
+    // Log the error for debugging purposes
+    // log.Printf("Failed to describe EC2 instance %s: %v", aws.StringValue(instanceID), err)
     continue
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 595db91 and 44953e7.

📒 Files selected for processing (14)
  • pkg/providers/aws/alb.go (4 hunks)
  • pkg/providers/aws/aws.go (10 hunks)
  • pkg/providers/aws/cloudfront.go (4 hunks)
  • pkg/providers/aws/ecs.go (5 hunks)
  • pkg/providers/aws/eks.go (5 hunks)
  • pkg/providers/aws/elb.go (4 hunks)
  • pkg/providers/aws/lambda-api-gateway.go (8 hunks)
  • pkg/providers/aws/lightsail.go (4 hunks)
  • pkg/providers/aws/s3.go (3 hunks)
  • pkg/providers/digitalocean/instances.go (4 hunks)
  • pkg/providers/gcp/bucket.go (3 hunks)
  • pkg/providers/gcp/vms.go (4 hunks)
  • pkg/providers/k8s/ingress.go (2 hunks)
  • pkg/schema/schema.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • pkg/providers/digitalocean/instances.go
  • pkg/providers/aws/cloudfront.go
  • pkg/providers/aws/elb.go
  • pkg/providers/aws/lightsail.go
  • pkg/schema/schema.go
  • pkg/providers/gcp/bucket.go
  • pkg/providers/aws/aws.go
  • pkg/providers/aws/alb.go
  • pkg/providers/aws/s3.go
🧰 Additional context used
🧠 Learnings (1)
pkg/providers/aws/lambda-api-gateway.go (1)
Learnt from: mkrs2404
PR: projectdiscovery/cloudlist#691
File: pkg/providers/aws/aws.go:397-418
Timestamp: 2025-07-11T18:33:12.398Z
Learning: In the AWS provider (pkg/providers/aws/aws.go), if the Verify method fails, the provider instance is discarded and not used, so state inconsistency after verification failure is not a concern.
🧬 Code Graph Analysis (2)
pkg/providers/aws/ecs.go (2)
pkg/schema/schema.go (4)
  • Resource (142-163)
  • Provider (18-28)
  • AddMetadata (329-333)
  • AddMetadataInt (358-362)
pkg/providers/aws/aws.go (1)
  • Provider (106-122)
pkg/providers/aws/lambda-api-gateway.go (2)
pkg/providers/aws/aws.go (3)
  • ProviderOptions (32-44)
  • Provider (106-122)
  • New (125-214)
pkg/schema/schema.go (5)
  • NewResources (45-50)
  • Resources (39-42)
  • Resource (142-163)
  • Provider (18-28)
  • AddMetadata (329-333)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Analyze (go)
  • GitHub Check: Test Builds (1.22.x, ubuntu-latest)
  • GitHub Check: release-test
  • GitHub Check: Test Builds (1.22.x, macOS-latest)
  • GitHub Check: Test Builds (1.22.x, windows-latest)
  • GitHub Check: Lint Test
🔇 Additional comments (25)
pkg/providers/gcp/vms.go (1)

44-56: LGTM! Extended metadata collection implemented correctly.

The conditional metadata collection follows the established pattern across providers and properly integrates with the resource schema.

pkg/providers/k8s/ingress.go (1)

52-62: Good improvement: Added validation for non-empty values.

The checks for non-empty IP and hostname prevent creating invalid resources, which improves data quality.

pkg/providers/aws/eks.go (5)

7-7: LGTM: Imports are correctly added for new functionality.

The additional imports for strings, time, and Kubernetes corev1 types are appropriate for the extended metadata functionality being added.

Also applies to: 9-9, 18-18


78-81: LGTM: Conditional cluster metadata extraction is well-implemented.

The conditional metadata extraction based on the ExtendedMetadata option follows the established pattern and correctly initializes the metadata map.


92-95: LGTM: Conditional node metadata extraction follows the same pattern.

The node metadata extraction is consistently implemented with the same conditional logic as cluster metadata.


146-213: LGTM: Comprehensive cluster metadata extraction with proper error handling.

The getClusterMetadata method thoroughly extracts cluster information including ARN parsing, VPC configuration, OIDC details, and tags. The implementation properly handles nil checks and uses the schema helper functions consistently.


215-310: LGTM: Detailed node metadata extraction with good organization.

The getNodeMetadata method comprehensively extracts node information including labels, capacity, node info, and addresses. The code is well-organized with proper nil checks and consistent use of helper functions.

pkg/providers/aws/ecs.go (7)

6-6: LGTM: Imports are correctly added for new functionality.

The additional imports for strings and time are appropriate for the extended metadata functionality.

Also applies to: 8-8


38-39: LGTM: Client handling is properly updated for dual ECS/EC2 support.

The client initialization and goroutine parameter updates correctly handle both ECS and EC2 clients needed for comprehensive resource enumeration.

Also applies to: 42-42, 49-49


105-109: LGTM: Proper Fargate task handling with appropriate error handling.

The conditional logic correctly identifies Fargate tasks (no container instance ARN) and handles them separately through the processFargateTask method with appropriate error handling.


225-303: LGTM: Comprehensive Fargate task processing with proper ENI handling.

The processFargateTask method correctly handles Fargate tasks by describing ENIs to extract IP addresses. The implementation properly handles attachment details, ENI description, and resource creation with appropriate error handling.


305-402: LGTM: Detailed ECS task metadata extraction with good organization.

The getECSTaskMetadata method comprehensively extracts task information including ARN parsing, cluster/service details, task definition, status, and container information. The implementation is well-organized with proper nil checks.


404-434: LGTM: Effective Fargate metadata extraction reusing base metadata.

The getFargateTaskMetadata method efficiently reuses the base task metadata and adds Fargate-specific ENI details. The implementation correctly handles security groups and network interface information.


436-456: LGTM: Consistent tag building helper functions.

The tag building helper functions for both ECS and EC2 tags are well-implemented with proper nil checks and consistent formatting.

pkg/providers/aws/lambda-api-gateway.go (11)

8-8: LGTM: Imports are correctly added for new functionality.

The additional imports for time and apigatewayv2 SDK are appropriate for the API Gateway v2 support and extended metadata functionality.

Also applies to: 14-14


26-26: LGTM: API Gateway v2 client properly added to provider struct.

The addition of the apiGatewayV2 client field is consistent with the existing pattern and enables v2 API support.


38-65: LGTM: Well-structured resource retrieval with proper concurrency handling.

The refactored resource retrieval method properly handles both API Gateway v1 and v2 APIs with appropriate concurrency control and resource merging.


73-101: LGTM: Clean API Gateway v1 resource listing with metadata support.

The listAPIGateways method is well-implemented with proper API URL construction and conditional metadata extraction following the established pattern.


207-242: LGTM: Consistent API Gateway v2 resource listing implementation.

The listAPIGatewayV2s method follows the same pattern as the v1 implementation with proper pagination handling and metadata extraction.


267-276: LGTM: Proper client initialization with nil handling.

The client initialization correctly handles cases where API Gateway v1 or v2 clients may not be available, with appropriate nil checks and client creation.


282-289: LGTM: Consistent slice length management for client arrays.

The logic to ensure all client slices have the same length is well-implemented and prevents index mismatches when clients are not available.


304-315: LGTM: Proper assumed role client creation with nil handling.

The assumed role client creation correctly handles cases where API Gateway clients may not be available, maintaining consistency with the base client initialization.


321-369: LGTM: Comprehensive API Gateway v1 metadata extraction.

The getAPIGatewayMetadata method thoroughly extracts API information including creation date, endpoint configuration, stages, and tags with proper error handling and nil checks.


371-413: LGTM: Detailed API Gateway v2 metadata extraction with CORS support.

The getAPIGatewayV2Metadata method comprehensively extracts v2 API information including protocol type, CORS configuration, and endpoint settings with consistent implementation patterns.


425-433: LGTM: Consistent tag building helper function.

The buildAwsMapTagString helper function is well-implemented with proper nil checks and consistent formatting, following the established pattern used across other AWS providers.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
pkg/providers/gcp/cloud-run.go (2)

89-183: Consider breaking down this long method for better maintainability.

The getServiceMetadata method is 94 lines long, which impacts readability. Consider extracting logical sections into helper methods (e.g., extractServiceLabels, extractServiceAnnotations, extractContainerInfo).


117-123: Label pairs may have non-deterministic ordering.

The map iteration order is not guaranteed in Go, which could result in inconsistent label string formatting across runs.

Apply this diff to ensure consistent ordering:

 		if len(service.Metadata.Labels) > 0 {
 			var labelPairs []string
 			for key, value := range service.Metadata.Labels {
 				labelPairs = append(labelPairs, fmt.Sprintf("%s=%s", key, value))
 			}
+			sort.Strings(labelPairs)
 			metadata["labels"] = strings.Join(labelPairs, ",")
 		}

Don't forget to add "sort" to the imports at the top of the file.
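The ordering concern can be reproduced and fixed in isolation. The sketch below is a standalone version of the suggested change; `formatLabels` is a hypothetical name, and the pairs are sorted after formatting exactly as in the diff above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLabels renders labels as "k=v" pairs joined by commas. Sorting the
// pairs makes the result independent of Go's randomized map iteration order.
func formatLabels(labels map[string]string) string {
	pairs := make([]string, 0, len(labels))
	for key, value := range labels {
		pairs = append(pairs, fmt.Sprintf("%s=%s", key, value))
	}
	sort.Strings(pairs)
	return strings.Join(pairs, ",")
}

func main() {
	fmt.Println(formatLabels(map[string]string{"team": "sec", "env": "prod"}))
}
```

Stable output matters when results from two runs are diffed or deduplicated, since an unsorted join can differ between runs for identical input.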

pkg/providers/gcp/dns.go (1)

121-127: Zone labels may have non-deterministic ordering.

Similar to the Cloud Run provider, the zone labels map iteration could result in inconsistent formatting.

Apply this diff to ensure consistent ordering:

 		if len(zone.Labels) > 0 {
 			var labelPairs []string
 			for key, value := range zone.Labels {
 				labelPairs = append(labelPairs, fmt.Sprintf("%s=%s", key, value))
 			}
+			sort.Strings(labelPairs)
 			metadata["zone_labels"] = strings.Join(labelPairs, ",")
 		}

Don't forget to add "sort" to the imports if not already present.

pkg/providers/gcp/assets_api.go (1)

314-342: Consider simplifying the complex DNS record reconstruction logic.

The DNS extended metadata extraction rebuilds a dns.ResourceRecordSet from protobuf data. This reconstruction logic is complex and could be simplified.

Consider extracting the record set reconstruction into a helper method:

func reconstructRecordSetFromProto(data *structpb.Struct) *dns.ResourceRecordSet {
	recordSet := &dns.ResourceRecordSet{
		Name: getStringField(data, "name"),
		Type: getStringField(data, "type"),
	}
	// ... rest of the reconstruction logic
	return recordSet
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 44953e7 and 06cd01c.

📒 Files selected for processing (4)
  • pkg/providers/gcp/assets_api.go (1 hunks)
  • pkg/providers/gcp/cloud-run.go (3 hunks)
  • pkg/providers/gcp/dns.go (5 hunks)
  • pkg/providers/gcp/gcp.go (14 hunks)
🧰 Additional context used
🧠 Learnings (2)
pkg/providers/gcp/assets_api.go (2)
Learnt from: dogancanbakir
PR: projectdiscovery/cloudlist#687
File: pkg/providers/gcp/bucket.go:30-30
Timestamp: 2025-07-09T17:50:43.982Z
Learning: In Google Cloud Go SDK's assetpb package, ContentType_RESOURCE includes IAM policy data when available. There is no ContentType_RESOURCE_AND_IAM_POLICY constant in the assetpb package. The correct approach is to use ContentType_RESOURCE and access asset.IamPolicy directly, as shown in the GCP provider implementations.
Learnt from: dogancanbakir
PR: projectdiscovery/cloudlist#687
File: pkg/providers/gcp/bucket.go:30-30
Timestamp: 2025-07-09T17:50:43.982Z
Learning: The Google Cloud Go SDK assetpb package does not have a ContentType_RESOURCE_AND_IAM_POLICY constant. The available constants are ContentType_RESOURCE and ContentType_IAM_POLICY as separate values. Using ContentType_RESOURCE appears to include IAM policy data in the response, allowing access to asset.IamPolicy.
pkg/providers/gcp/gcp.go (1)
Learnt from: dogancanbakir
PR: projectdiscovery/cloudlist#687
File: pkg/providers/gcp/bucket.go:30-30
Timestamp: 2025-07-09T17:50:43.982Z
Learning: In Google Cloud Go SDK's assetpb package, ContentType_RESOURCE includes IAM policy data when available. There is no ContentType_RESOURCE_AND_IAM_POLICY constant in the assetpb package. The correct approach is to use ContentType_RESOURCE and access asset.IamPolicy directly, as shown in the GCP provider implementations.
🧬 Code Graph Analysis (1)
pkg/providers/gcp/cloud-run.go (3)
pkg/schema/schema.go (3)
  • Resource (142-163)
  • Provider (18-28)
  • AddMetadata (329-333)
pkg/providers/gcp/gcp.go (1)
  • Provider (26-38)
pkg/providers/k8s/kubernetes.go (1)
  • Provider (20-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Analyze (go)
  • GitHub Check: Test Builds (1.22.x, windows-latest)
  • GitHub Check: Test Builds (1.22.x, ubuntu-latest)
  • GitHub Check: Test Builds (1.22.x, macOS-latest)
  • GitHub Check: release-test
  • GitHub Check: Lint Test
🔇 Additional comments (2)
pkg/providers/gcp/gcp.go (2)

263-275: Good implementation of dual Cloud Functions API support.

The code properly initializes both v1 and v2 Cloud Functions clients with appropriate error messages. This provides good backward compatibility while supporting newer API features.


576-585: Useful helper function for null-safe metadata extraction.

The getStringFieldPointer helper is a good addition that works well with the schema.AddMetadata pattern for safe null handling.

@Ice3man543
Member Author

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
pkg/providers/aws/eks.go (1)

112-112: Critical: Duplicate resource IDs still present (existing issue).

This is the same issue flagged in the previous review. Using ep.options.Id for all resources creates duplicate IDs, breaking resource uniqueness and causing schema deduplication issues.

The fixes remain the same:

  • Line 112: Replace ep.options.Id with node.GetName()
  • Line 129: Replace ep.options.Id with a unique identifier like pod.GetName() or a composite key
-    ID:         ep.options.Id,
+    ID:         node.GetName(),

For pods, you'll need to capture the pod name during iteration:

-    ID:          ep.options.Id,
+    ID:          fmt.Sprintf("%s/%s", node.GetName(), pod.GetName()),

Also applies to: 129-129

🧹 Nitpick comments (2)
pkg/providers/gcp/function.go (2)

22-78: Well-structured dual API version handling.

The method correctly processes both v1 and v2 functions with appropriate filtering and error handling. The conditional metadata extraction and service differentiation are well implemented.

Consider extracting the resource creation logic into helper methods to reduce duplication:

+func (d *cloudFunctionsProvider) createV2Resource(function *cloudfunctions.Function, metadata map[string]string) *schema.Resource {
+	return &schema.Resource{
+		ID:       d.id,
+		Provider: providerName,
+		DNSName:  function.Url,
+		Public:   d.isPublicFunctionV2(function.Name),
+		Service:  "cloud-function-v2",
+		Metadata: metadata,
+	}
+}
+
+func (d *cloudFunctionsProvider) createV1Resource(function *cloudfunctionsv1.CloudFunction, metadata map[string]string) *schema.Resource {
+	return &schema.Resource{
+		ID:       d.id,
+		Provider: providerName,
+		DNSName:  function.HttpsTrigger.Url,
+		Public:   d.isPublicFunctionV1(function.Name),
+		Service:  "cloud-function-v1",
+		Metadata: metadata,
+	}
+}

141-259: Comprehensive metadata extraction with proper API version handling.

Both metadata extraction methods thoroughly capture relevant function attributes with appropriate nil checking and consistent use of schema helpers. The different field structures between v1 and v2 APIs are handled correctly.

Consider optimizing label processing for functions with many labels:

	if len(function.Labels) > 0 {
-		var labelPairs []string
-		for k, v := range function.Labels {
-			labelPairs = append(labelPairs, fmt.Sprintf("%s=%s", k, v))
-		}
-		labelString := strings.Join(labelPairs, ",")
+		labelPairs := make([]string, 0, len(function.Labels))
+		for k, v := range function.Labels {
+			labelPairs = append(labelPairs, fmt.Sprintf("%s=%s", k, v))
+		}
+		labelString := strings.Join(labelPairs, ",")
		schema.AddMetadata(metadata, "labels", &labelString)
	}
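A related detail that preallocation does not change: Go randomizes map iteration order, so the joined label string can differ between runs for the same function. If stable output matters (for example when diffing or de-duplicating results), sorting the keys first is one option. The sketch below is standalone; `buildLabelString` is a hypothetical helper name, not something defined in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildLabelString is a hypothetical helper (not part of the PR). It joins
// labels as "k=v" pairs in sorted key order so the result is deterministic.
func buildLabelString(labels map[string]string) string {
	// Collect and sort keys; map iteration order is randomized in Go.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	// Preallocate the pair slice, as the review suggests.
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+labels[k])
	}
	return strings.Join(pairs, ",")
}

func main() {
	fmt.Println(buildLabelString(map[string]string{"team": "infra", "env": "prod"}))
}
```

The extra sort costs O(n log n) on the number of labels, which is usually negligible next to the API calls that fetched them.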
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe2276e and 8cc5072.

📒 Files selected for processing (6)
  • pkg/providers/aws/eks.go (5 hunks)
  • pkg/providers/azure/trafficmanager.go (3 hunks)
  • pkg/providers/azure/vm.go (7 hunks)
  • pkg/providers/dnssimple/dns.go (1 hunks)
  • pkg/providers/gcp/function.go (3 hunks)
  • pkg/providers/k8s/ingress.go (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • pkg/providers/dnssimple/dns.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/providers/k8s/ingress.go
  • pkg/providers/azure/trafficmanager.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: release-test
  • GitHub Check: Test Builds (1.22.x, ubuntu-latest)
  • GitHub Check: Test Builds (1.22.x, windows-latest)
  • GitHub Check: Analyze (go)
  • GitHub Check: Test Builds (1.22.x, macOS-latest)
  • GitHub Check: Lint Test
🔇 Additional comments (18)
pkg/providers/aws/eks.go (7)

7-7: LGTM: Import additions are appropriate.

The new imports (strings, time, and corev1) are correctly added to support the extended metadata functionality.

Also applies to: 9-9, 18-18


78-81: LGTM: Conditional metadata collection is well implemented.

The conditional logic properly checks ep.options.ExtendedMetadata before calling the metadata collection methods, ensuring performance impact is minimal when extended metadata is not needed.

Also applies to: 92-95


116-116: LGTM: Metadata attachment is properly implemented.

The metadata is correctly attached to the Metadata field of schema resources, following the established pattern.

Also applies to: 133-133


119-126: Good implementation of pod metadata with cluster context.

The pod metadata creation properly inherits cluster information and adds node context, providing good traceability across the resource hierarchy.


146-213: Excellent comprehensive cluster metadata collection.

The getClusterMetadata method provides thorough coverage of EKS cluster attributes including:

  • Basic cluster information (name, ARN, version, status)
  • VPC configuration details
  • OIDC identity provider information
  • Certificate authority presence
  • Resource tags via API call

The implementation properly handles nil checks and uses appropriate helper functions like schema.AddMetadata and buildAwsMapTagString.
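To illustrate the nil-safe pattern the comment refers to, here is a minimal sketch of a helper shaped like the `schema.AddMetadata(metadata, key, &value)` call seen elsewhere in this review. The body is an assumption: the real helper's behavior (for instance whether it also skips empty strings) is not shown in this thread, and `addMetadata` is a local stand-in.

```go
package main

import "fmt"

// addMetadata is a hypothetical stand-in for schema.AddMetadata.
// Assumed behavior: skip nil pointers and empty strings so that
// optional SDK fields never produce empty metadata entries.
func addMetadata(metadata map[string]string, key string, value *string) {
	if value == nil || *value == "" {
		return
	}
	metadata[key] = *value
}

func main() {
	name := "demo-cluster"
	var version *string // simulates a field the API did not populate

	metadata := map[string]string{}
	addMetadata(metadata, "cluster_name", &name)
	addMetadata(metadata, "version", version)
	fmt.Println(metadata)
}
```

Centralizing the check in one helper keeps each call site to a single line, which matters when a method records dozens of optional fields.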


215-310: Excellent comprehensive node metadata collection.

The getNodeMetadata method provides extensive node-level details including:

  • Kubernetes node information (labels, conditions, capacity)
  • AWS-specific attributes (instance type, availability zone, nodegroup)
  • System information (kubelet version, OS image, architecture)
  • Network addresses (internal/external IPs and DNS)
  • Provider ID parsing to extract instance ID

The implementation handles edge cases well with proper nil checks and conditional logic.
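The provider ID parsing mentioned in the last bullet can be sketched as follows. On EC2-backed nodes the Kubernetes provider ID typically looks like `aws:///us-east-1a/i-0123456789abcdef0`; the function name and the exact validation here are assumptions for illustration, not the code from `eks.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// instanceIDFromProviderID is a hypothetical helper. It returns the EC2
// instance ID from a provider ID of the form "aws:///<zone>/<instance-id>".
// The second return value is false when the input does not match.
func instanceIDFromProviderID(providerID string) (string, bool) {
	if !strings.HasPrefix(providerID, "aws://") {
		return "", false
	}
	// The instance ID is the last path segment.
	id := providerID[strings.LastIndex(providerID, "/")+1:]
	if !strings.HasPrefix(id, "i-") {
		return "", false
	}
	return id, true
}

func main() {
	id, ok := instanceIDFromProviderID("aws:///us-east-1a/i-0123456789abcdef0")
	fmt.Println(id, ok)
}
```

Returning a boolean instead of an empty string makes it explicit to the caller that nodes without a recognizable instance ID should simply omit that metadata key.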


156-160: No action needed: parseARN is already defined
The helper parseARN(arn string) *ARNComponents exists in pkg/providers/aws/aws.go (lines 519–541), so no additional imports or definitions are required.
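For readers without the repository at hand, an ARN parser of this shape could look roughly like the sketch below. It follows the general ARN layout `arn:partition:service:region:account-id:resource`; the struct fields and the function name are assumptions, since the actual `ARNComponents` definition is not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// arnComponents is a hypothetical mirror of the repository's ARNComponents;
// the real field names may differ.
type arnComponents struct {
	Partition string
	Service   string
	Region    string
	AccountID string
	Resource  string
}

// parseARNSketch splits an ARN into its six colon-separated parts.
// It returns nil when the input is not a well-formed ARN.
func parseARNSketch(arn string) *arnComponents {
	// SplitN with a limit of 6 keeps any colons inside the resource part.
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" {
		return nil
	}
	return &arnComponents{
		Partition: parts[1],
		Service:   parts[2],
		Region:    parts[3],
		AccountID: parts[4],
		Resource:  parts[5],
	}
}

func main() {
	c := parseARNSketch("arn:aws:eks:us-east-1:123456789012:cluster/demo")
	fmt.Printf("%+v\n", *c)
}
```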

pkg/providers/gcp/function.go (3)

6-18: LGTM! Proper dual API version support setup.

The imports and struct modifications correctly establish support for both Cloud Functions v1 and v2 APIs, with appropriate field additions for the extended metadata feature.


80-102: Consistent and correct function fetching implementation.

Both helper methods follow the same pattern with proper pagination handling and appropriate return types for their respective API versions.


104-139: Proper IAM policy checking for both API versions.

The public function checking methods correctly validate IAM policies and handle the different function name formats between v1 and v2 APIs. The additional validation in the v1 method is appropriate.
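As a rough illustration of what a public-access check over an IAM policy involves: in Google Cloud IAM, the special members `allUsers` and `allAuthenticatedUsers` grant access to anyone. The sketch below uses local types rather than the Cloud Functions client structs and assumes that any binding to those members counts as public; the PR's actual methods may additionally filter by role.

```go
package main

import "fmt"

// binding is a hypothetical, simplified version of an IAM policy binding.
type binding struct {
	Role    string
	Members []string
}

// isPublic reports whether any binding grants access to the special
// "allUsers" or "allAuthenticatedUsers" members.
// Assumption: role filtering is omitted for brevity.
func isPublic(bindings []binding) bool {
	for _, b := range bindings {
		for _, m := range b.Members {
			if m == "allUsers" || m == "allAuthenticatedUsers" {
				return true
			}
		}
	}
	return false
}

func main() {
	policy := []binding{
		{Role: "roles/cloudfunctions.invoker", Members: []string{"allUsers"}},
	}
	fmt.Println(isPublic(policy))
}
```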

pkg/providers/azure/vm.go (8)

5-8: LGTM: Import additions are well-justified.

The new imports (fmt, strings, time) are appropriately added to support the extended metadata functionality for string formatting, manipulation, and timestamp handling.


23-26: LGTM: Extended metadata field properly added.

The extendedMetadata boolean field is correctly added to the vmProvider struct, enabling conditional metadata collection as described in the PR objectives.


94-94: LGTM: Enhanced logging for debugging.

Adding a warning log when no public IP is found helps with troubleshooting connectivity issues during resource discovery.


104-104: LGTM: Bug fix for resource group parameter.

Correctly using res.ResourceGroup (parsed from the resource ID) instead of the method parameter group ensures the public IP is fetched from the correct resource group.
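The reason this matters is that a public IP resource may live in a different resource group than the one currently being enumerated, so the group has to come from the IP's own resource ID. Azure resource IDs follow the layout `/subscriptions/{id}/resourceGroups/{name}/providers/...`; a minimal sketch of extracting the group is below. `resourceGroupFromID` is a hypothetical name, and the PR itself relies on an existing parsed `res.ResourceGroup` value rather than this function.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceGroupFromID is a hypothetical helper that returns the resource
// group segment of an Azure resource ID. The key is matched
// case-insensitively because Azure treats resource IDs as case-insensitive.
func resourceGroupFromID(resourceID string) (string, bool) {
	segments := strings.Split(strings.Trim(resourceID, "/"), "/")
	for i := 0; i+1 < len(segments); i++ {
		if strings.EqualFold(segments[i], "resourceGroups") && segments[i+1] != "" {
			return segments[i+1], true
		}
	}
	return "", false
}

func main() {
	id := "/subscriptions/0000/resourceGroups/net-rg/providers/Microsoft.Network/publicIPAddresses/vm-ip"
	group, ok := resourceGroupFromID(id)
	fmt.Println(group, ok)
}
```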


122-127: LGTM: Conditional metadata collection implemented correctly.

The extended metadata is collected only when the flag is enabled, and properly assigned to the resource. The conditional approach ensures no performance impact when metadata is not needed.


136-150: LGTM: Enhanced DNS resource handling with proper null checks.

The code now properly checks for nil DNSSettings before accessing Fqdn, preventing potential panics. The metadata copying ensures DNS resources inherit VM metadata, maintaining consistency across related resources.
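The two ideas in this comment, guarding the nested pointer chain and giving the DNS resource its own copy of the VM metadata, can be sketched with local types. The struct and function names below are assumptions that only loosely mirror the Azure SDK shapes; whether the PR copies the map or shares it is not visible in this thread.

```go
package main

import "fmt"

// dnsSettings and publicIP are hypothetical, simplified stand-ins for the
// Azure SDK types, where most fields are pointers and may be nil.
type dnsSettings struct{ Fqdn *string }
type publicIP struct{ DNSSettings *dnsSettings }

// fqdnOf returns the FQDN only when every pointer in the chain is set.
func fqdnOf(ip publicIP) (string, bool) {
	if ip.DNSSettings == nil || ip.DNSSettings.Fqdn == nil {
		return "", false
	}
	return *ip.DNSSettings.Fqdn, true
}

// copyMetadata returns an independent copy so that later changes to one
// resource's metadata do not leak into another's.
func copyMetadata(src map[string]string) map[string]string {
	dst := make(map[string]string, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	fqdn := "vm.example.cloudapp.azure.com"
	ip := publicIP{DNSSettings: &dnsSettings{Fqdn: &fqdn}}
	vmMeta := map[string]string{"vm_size": "Standard_B2s"}

	if name, ok := fqdnOf(ip); ok {
		dnsMeta := copyMetadata(vmMeta)
		fmt.Println(name, dnsMeta)
	}
}
```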


216-295: LGTM: Comprehensive and well-structured metadata extraction.

The getVMMetadata method properly extracts extensive VM details while handling Azure SDK's nullable fields correctly. Key strengths:

  • Uses schema.AddMetadata helper for safe null handling of string pointers
  • Proper nested null checks for complex objects (OsProfile, StorageProfile, etc.)
  • Correct enum-to-string conversions and time formatting
  • Comprehensive coverage of VM attributes including identity, plan, and availability information

297-305: LGTM: Well-implemented tag conversion helper.

The buildAzureTagString function properly converts Azure's tag format (map[string]*string) to a comma-separated string while safely handling nil values. The implementation is clean and follows good defensive programming practices.

@dogancanbakir dogancanbakir requested a review from Mzack9999 July 23, 2025 09:33
@ehsandeep ehsandeep requested review from dogancanbakir and removed request for Mzack9999 July 23, 2025 15:57
@ehsandeep ehsandeep merged commit f436584 into dev Aug 17, 2025
8 checks passed
@ehsandeep ehsandeep deleted the additional-metadata-cloudlist-sources branch August 17, 2025 13:17
@ehsandeep ehsandeep removed the request for review from dogancanbakir August 17, 2025 13:17
visnetodev pushed a commit to visnetotest/cloudlist that referenced this pull request Dec 7, 2025
…ojectdiscovery#686)

* feat: added extended metadata support for digitalocean + cloudlist + misc

* feat: added metadata to gcp + k8s ingress source

* feat: added metadata support for aws services + misc

* got all aws providers working

* feat: added azure metadata additions + misc

* feat: more additions for review comments

* feat: added additional fetch logic for gcp providers

* misc

* feat: use list APIs for gcp assets list extended metadata

* feat: lint fixes

* avoid de-dup eks id

---------

Co-authored-by: PDTeamX <[email protected]>
Development

Successfully merging this pull request may close these issues.

  • Improve AWS Integration Contextualization and Customization in cloudlist
  • Expose additional metadata from cloud assets

2 participants