
Unable to create namespace when S3 archival and internal-frontend are enabled #7631

@dlydiard

Description

Expected Behavior

Namespace is created when archival is enabled.

Actual Behavior

Error: Register namespace operation failed.
Error Details: rpc error: code = Unknown desc = something went wrong, please retry (6b50ceaa)

Steps to Reproduce the Problem

global:
    membership:
        maxJoinDuration: 30s
        broadcastAddress: '{{ default .Env.POD_IP "0.0.0.0" }}'
    pprof:
        port: 0
        host: ""
    tls:
        internode:
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
            server:
                certFile: ""
                keyFile: ""
                clientCaFiles: []
                certData: ""
                keyData: ""
                clientCaData: []
                requireClientAuth: false
            hostOverrides: {}
        frontend:
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
            server:
                certFile: ""
                keyFile: ""
                clientCaFiles: []
                certData: ""
                keyData: ""
                clientCaData: []
                requireClientAuth: false
            hostOverrides: {}
        systemWorker:
            certFile: ""
            keyFile: ""
            certData: ""
            keyData: ""
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
        remoteClusters: {}
        expirationChecks:
            warningWindow: 0s
            errorWindow: 0s
            checkInterval: 0s
        refreshInterval: 0s
    metrics:
        tags:
            type: '{{ .Env.SERVICES }}'
        excludeTags: {}
        prefix: ""
        perUnitHistogramBoundaries: {}
        m3: null
        statsd: null
        prometheus:
            framework: ""
            listenAddress: 0.0.0.0:9090
            handlerPath: ""
            loggerRPS: 0
            listenNetwork: ""
            timerType: histogram
            defaultHistogramBoundaries: []
            defaultHistogramBuckets: []
            defaultSummaryObjectives: []
            onError: ""
            sanitizeOptions: null
    authorization:
        jwtKeyProvider:
            keySourceURIs:
                - https://dev.auth.test.com/.well-known/jwks.json
            refreshInterval: 10m0s
        permissionsClaimName: https://temporal/permissions
        authorizer: default
        claimMapper: default
        authHeaderName: ""
        authExtraHeaderName: ""
persistence:
    defaultStore: default
    visibilityStore: visibility
    secondaryVisibilityStore: ""
    numHistoryShards: 4
    datastores:
        default:
            faultInjection: null
            cassandra: null
            sql:
                user: temporal-default-store
                password: '{{ .Env.TEMPORAL_DEFAULT_DATASTORE_PASSWORD }}'
                pluginName: postgres12
                databaseName: temporal-default-store
                connectAddr: test.us-east-1.rds.amazonaws.com:5432
                connectProtocol: tcp
                connectAttributes: {}
                maxConns: 0
                maxIdleConns: 0
                maxConnLifetime: 0s
                taskScanPartitions: 0
                tls: null
            customDatastore: null
            elasticsearch: null
        visibility:
            faultInjection: null
            cassandra: null
            sql:
                user: temporal-visibility-store
                password: '{{ .Env.TEMPORAL_VISIBILITY_DATASTORE_PASSWORD }}'
                pluginName: postgres12
                databaseName: temporal-visibility-store
                connectAddr: test.us-east-1.rds.amazonaws.com:5432
                connectProtocol: tcp
                connectAttributes: {}
                maxConns: 0
                maxIdleConns: 0
                maxConnLifetime: 0s
                taskScanPartitions: 0
                tls: null
            customDatastore: null
            elasticsearch: null
    advancedVisibilityStore: ""
log:
    stdout: true
    level: debug
    outputFile: ""
    format: json
    development: false
clusterMetadata:
    enableGlobalNamespace: false
    failoverVersionIncrement: 10
    masterClusterName: temporal
    currentClusterName: temporal
    clusterInformation:
        temporal:
            enabled: true
            initialFailoverVersion: 1
            rpcAddress: 127.0.0.1:7233
            httpAddress: ""
    tags: {}
dcRedirectionPolicy:
    policy: ""
services:
    frontend:
        rpc:
            grpcPort: 7233
            membershipPort: 6933
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
            httpPort: 7243
            httpAdditionalForwardedHeaders: []
    history:
        rpc:
            grpcPort: 7234
            membershipPort: 6934
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
            httpPort: 0
            httpAdditionalForwardedHeaders: []
    internal-frontend:
        rpc:
            grpcPort: 7236
            membershipPort: 6936
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
            httpPort: 0
            httpAdditionalForwardedHeaders: []
    matching:
        rpc:
            grpcPort: 7235
            membershipPort: 6935
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
            httpPort: 0
            httpAdditionalForwardedHeaders: []
    worker:
        rpc:
            grpcPort: 7239
            membershipPort: 6939
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
            httpPort: 0
            httpAdditionalForwardedHeaders: []
archival:
    history:
        state: enabled
        enableRead: true
        provider:
            filestore: null
            gstorage: null
            s3store:
                region: us-east-1
                endpoint: null
                s3ForcePathStyle: false
                logLevel: 0
    visibility:
        state: enabled
        enableRead: true
        provider:
            filestore: null
            s3store:
                region: us-east-1
                endpoint: null
                s3ForcePathStyle: false
                logLevel: 0
            gstorage: null
publicClient:
    hostPort: ""
    httpHostPort: ""
    forceTLSConfig: ""
dynamicConfigClient:
    filepath: /etc/temporal/config/dynamic_config.yaml
    pollInterval: 1m0s
namespaceDefaults:
    archival:
        history:
            state: enabled
            URI: s3://temporal-system-history
        visibility:
            state: enabled
            URI: s3://temporal-system-visibility
otel: {}

I verified that the pods have access to the S3 buckets via IRSA: the ServiceAccount annotations, the IAM role trust relationship, and the attached policies all look correct.
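For reference, the IRSA wiring can be spot-checked from inside one of the server pods. This is a sketch: the `kubectl exec` target and the assumption that the AWS CLI is present in the image are hypothetical, while the bucket names come from the `namespaceDefaults` URIs in the config above.

```shell
# Shell into the history pod (deployment/pod name is an assumption):
#   kubectl exec -it deploy/temporal-history -- sh

# Confirm the pod assumed the IRSA role, not the node instance role.
aws sts get-caller-identity

# Confirm the archival buckets from namespaceDefaults are reachable.
aws s3 ls s3://temporal-system-history
aws s3 ls s3://temporal-system-visibility
```

If `get-caller-identity` reports the node role instead of the IRSA role, the ServiceAccount annotation is not being picked up by the pod.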

internal-frontend logs

{"level":"error","ts":"2025-04-18T02:16:42.922Z","msg":"service failures","operation":"RegisterNamespace","wf-namespace":"test","error":"unable to find bootstrap container for the given service name","logging-call-at":"/home/runner/work/docker-builds/docker-builds/temporal/common/rpc/interceptor/telemetry.go:434","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/runner/work/docker-builds/docker-builds/temporal/common/log/zap_logger.go:155\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).HandleError\n\t/home/runner/work/docker-builds/docker-builds/temporal/common/rpc/interceptor/telemetry.go:434\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).UnaryIntercept\n\t/home/runner/work/docker-builds/docker-builds/temporal/common/rpc/interceptor/telemetry.go:202\ngoogle.golang.org/grpc.getChainUnaryHandler.func1\n\t/home/runner/go/pkg/mod/google.golang.org/[email protected]/server.go:1196\ngo.temporal.io/server/common/rpc/interceptor.(*Redirection).handl...

Main error:

unable to find bootstrap container for the given service name

I do not see any S3 errors in the history pod.
I'm running the following from the admin-tools pod:

export NS=test
export TEMPORAL_RETENTION="15"
export [email protected]

tctl --namespace "$NS" namespace register \
      --description "$NS" \
      --owner_email "$TEMPORAL_OWNER" \
      --retention "$TEMPORAL_RETENTION" \
      --history_archival_state enabled \
      --visibility_archival_state enabled 
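For comparison, roughly the same registration can be attempted with the newer `temporal` CLI, which may surface a different error message. This is a sketch: flag names should be verified against the CLI version shipped in your admin-tools image.

```shell
# Assumes the `temporal` CLI is available in the admin-tools image and
# that TEMPORAL_CLI_ADDRESS (or --address) points at the frontend.
temporal operator namespace create \
    --namespace "$NS" \
    --description "$NS" \
    --email "$TEMPORAL_OWNER" \
    --retention "${TEMPORAL_RETENTION}d" \
    --history-archival-state enabled \
    --visibility-archival-state enabled
```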

Error: Register namespace operation failed.

If I disable --history_archival_state and --visibility_archival_state, the namespace is created successfully.

Specifications

  • Version: 1.25.2
  • Platform: OpenShift
