Closed
Labels
type: bug (Something isn't working)
Description
I tried this:
After upgrading to the latest release (alpha17), the operator fails to start a workflow instance and throws optimistic concurrency errors. The job doesn't start, but the operator still tries to delete it. Please see the logs below.
BTW, here is what I did beforehand:
- cleaned the Redis DB,
- recreated a new service account (it was named 'adapter.default'; I now needed to add 'default.default'),
- added a new rule to the ClusterRole 'operator-role' to manage batch Jobs,
- created a new workflow definition and mapped it to the only available operator (not sure if that is really required?),
- tried to start a workflow instance from the UI; nothing happened, so I checked the operator logs.
This happened:
[21:59:06] info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
[21:59:06] info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
[21:59:06] info: Microsoft.Hosting.Lifetime[0]
Content root path: /app
[21:59:45] fail: Synapse.Operator.Services.WorkflowInstanceController[0]
An error occurred while handling the creation of workflow instance 'form-status-set-3354c346f663.default': Neuroglia.ProblemDetailsException: [409 - Conflict] Failed to update the resource 'synapse.io/v1/namespaces/default/workflow-instances/form-status-set-3354c346f663/status' due to an optimistic concurrency error: the resource's target version '62581DD3' differs from the actual version 'B87D1FFA'
at Neuroglia.Data.Infrastructure.ResourceOriented.Services.RedisDatabase.PatchSubResourceAsync(Patch patch, String group, String version, String plural, String name, String subResource, String namespace, String resourceVersion, Boolean dryRun, CancellationToken cancellationToken) in /home/runner/work/framework/framework/src/Neuroglia.Data.Infrastructure.ResourceOriented.Redis/Services/RedisDatabase.cs:line 248
at Neuroglia.Data.Infrastructure.ResourceOriented.Services.ResourceRepository.PatchSubResourceAsync(Patch patch, String group, String version, String plural, String name, String subResource, String namespace, String resourceVersion, Boolean dryRun, CancellationToken cancellationToken) in /home/runner/work/framework/framework/src/Neuroglia.Data.Infrastructure.ResourceOriented/Services/ResourceRepository.cs:line 318
at Neuroglia.Data.Infrastructure.ResourceOriented.IResourceRepositoryExtensions.PatchStatusAsync[TResource](IResourceRepository repository, Patch patch, String name, String namespace, String resourceVersion, Boolean dryRun, CancellationToken cancellationToken) in /home/runner/work/framework/framework/src/Neuroglia.Data.Infrastructure.ResourceOriented.Abstractions/Extensions/IResourceRepositoryExtensions.cs:line 265
at Synapse.Operator.Services.WorkflowInstanceHandler.UpdateWorkflowInstanceStatusAsync(Action`1 statusUpdate, CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceHandler.cs:line 232
at Synapse.Operator.Services.WorkflowInstanceHandler.StartProcessAsync(CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceHandler.cs:line 138
at Synapse.Operator.Services.WorkflowInstanceHandler.HandleAsync(CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceHandler.cs:line 123
at Synapse.Operator.Services.WorkflowInstanceController.OnResourceCreatedAsync(WorkflowInstance workflowInstance, CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceController.cs:line 223
[22:00:59] fail: Synapse.Runtime.Kubernetes.Services.KubernetesRuntime[0]
An error occurred while deleting the Kubernetes process with id 'form-status-set-f848363e9b0f.default-613605c320ca.synapse': k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'NotFound', response body {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"form-status-set-f848363e9b0f\" not found","reason":"NotFound","details":{"name":"form-status-set-f848363e9b0f","group":"batch","kind":"jobs"},"code":404}
at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.IBatchV1Operations_DeleteNamespacedJobWithHttpMessagesAsync[T](String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.k8s.IBatchV1Operations.DeleteNamespacedJobWithHttpMessagesAsync(String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.BatchV1OperationsExtensions.DeleteNamespacedJobAsync(IBatchV1Operations operations, String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, CancellationToken cancellationToken)
at Synapse.Runtime.Kubernetes.Services.KubernetesRuntime.DeleteProcessAsync(String processId, CancellationToken cancellationToken) in /src/src/runtime/Synapse.Runtime.Kubernetes/Services/KubernetesRuntime.cs:line 177
[22:00:59] warn: Synapse.Operator.Services.WorkflowInstanceController[0]
Failed to delete process with id 'form-status-set-f848363e9b0f.default-613605c320ca.synapse' for workflow instance 'form-status-set-f848363e9b0f.default'
k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'NotFound', response body {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"form-status-set-f848363e9b0f\" not found","reason":"NotFound","details":{"name":"form-status-set-f848363e9b0f","group":"batch","kind":"jobs"},"code":404}
at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.IBatchV1Operations_DeleteNamespacedJobWithHttpMessagesAsync[T](String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.k8s.IBatchV1Operations.DeleteNamespacedJobWithHttpMessagesAsync(String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.BatchV1OperationsExtensions.DeleteNamespacedJobAsync(IBatchV1Operations operations, String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, CancellationToken cancellationToken)
at Synapse.Runtime.Kubernetes.Services.KubernetesRuntime.DeleteProcessAsync(String processId, CancellationToken cancellationToken) in /src/src/runtime/Synapse.Runtime.Kubernetes/Services/KubernetesRuntime.cs:line 177
at Synapse.Operator.Services.WorkflowInstanceController.OnResourceDeletedAsync(WorkflowInstance workflowInstance, CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceController.cs:line 282
[22:00:59] warn: Synapse.Runtime.Kubernetes.Services.KubernetesRuntime[0]
Failed to gracefully stop process 'form-status-set-3354c346f663.default-565505594e74.synapse': k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'NotFound', response body {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"form-status-set-3354c346f663.default-565505594e74\" not found","reason":"NotFound","details":{"name":"form-status-set-3354c346f663.default-565505594e74","group":"batch","kind":"jobs"},"code":404}
at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.IBatchV1Operations_DeleteNamespacedJobWithHttpMessagesAsync[T](String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.k8s.IBatchV1Operations.DeleteNamespacedJobWithHttpMessagesAsync(String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.BatchV1OperationsExtensions.DeleteNamespacedJobAsync(IBatchV1Operations operations, String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, CancellationToken cancellationToken)
at Synapse.Runtime.Kubernetes.Services.KubernetesWorkflowProcess.StopAsync(CancellationToken cancellationToken) in /src/src/runtime/Synapse.Runtime.Kubernetes/Services/KubernetesWorkflowProcess.cs:line 164
at Synapse.Runtime.Kubernetes.Services.KubernetesRuntime.DeleteProcessAsync(String processId, CancellationToken cancellationToken) in /src/src/runtime/Synapse.Runtime.Kubernetes/Services/KubernetesRuntime.cs:line 170
[22:00:59] fail: Synapse.Runtime.Kubernetes.Services.KubernetesRuntime[0]
An error occurred while deleting the Kubernetes process with id 'form-status-set-3354c346f663.default-565505594e74.synapse': k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'NotFound', response body {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"form-status-set-3354c346f663\" not found","reason":"NotFound","details":{"name":"form-status-set-3354c346f663","group":"batch","kind":"jobs"},"code":404}
at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.IBatchV1Operations_DeleteNamespacedJobWithHttpMessagesAsync[T](String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.k8s.IBatchV1Operations.DeleteNamespacedJobWithHttpMessagesAsync(String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.BatchV1OperationsExtensions.DeleteNamespacedJobAsync(IBatchV1Operations operations, String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, CancellationToken cancellationToken)
at Synapse.Runtime.Kubernetes.Services.KubernetesRuntime.DeleteProcessAsync(String processId, CancellationToken cancellationToken) in /src/src/runtime/Synapse.Runtime.Kubernetes/Services/KubernetesRuntime.cs:line 177
[22:00:59] warn: Synapse.Operator.Services.WorkflowInstanceController[0]
Failed to delete process with id 'form-status-set-3354c346f663.default-565505594e74.synapse' for workflow instance 'form-status-set-3354c346f663.default'
k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'NotFound', response body {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch \"form-status-set-3354c346f663\" not found","reason":"NotFound","details":{"name":"form-status-set-3354c346f663","group":"batch","kind":"jobs"},"code":404}
at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.IBatchV1Operations_DeleteNamespacedJobWithHttpMessagesAsync[T](String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.AbstractKubernetes.k8s.IBatchV1Operations.DeleteNamespacedJobWithHttpMessagesAsync(String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, IReadOnlyDictionary`2 customHeaders, CancellationToken cancellationToken)
at k8s.BatchV1OperationsExtensions.DeleteNamespacedJobAsync(IBatchV1Operations operations, String name, String namespaceParameter, V1DeleteOptions body, String dryRun, Nullable`1 gracePeriodSeconds, Nullable`1 ignoreStoreReadErrorWithClusterBreakingPotential, Nullable`1 orphanDependents, String propagationPolicy, Nullable`1 pretty, CancellationToken cancellationToken)
at Synapse.Runtime.Kubernetes.Services.KubernetesRuntime.DeleteProcessAsync(String processId, CancellationToken cancellationToken) in /src/src/runtime/Synapse.Runtime.Kubernetes/Services/KubernetesRuntime.cs:line 177
at Synapse.Operator.Services.WorkflowInstanceController.OnResourceDeletedAsync(WorkflowInstance workflowInstance, CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceController.cs:line 282
[22:06:31] fail: Synapse.Operator.Services.WorkflowInstanceController[0]
An error occurred while handling the creation of workflow instance 'form-status-set-2965fe738f0f.default': Neuroglia.ProblemDetailsException: [409 - Conflict] Failed to update the resource 'synapse.io/v1/namespaces/default/workflow-instances/form-status-set-2965fe738f0f/status' due to an optimistic concurrency error: the resource's target version '22839830' differs from the actual version '5D490D61'
at Neuroglia.Data.Infrastructure.ResourceOriented.Services.RedisDatabase.PatchSubResourceAsync(Patch patch, String group, String version, String plural, String name, String subResource, String namespace, String resourceVersion, Boolean dryRun, CancellationToken cancellationToken) in /home/runner/work/framework/framework/src/Neuroglia.Data.Infrastructure.ResourceOriented.Redis/Services/RedisDatabase.cs:line 248
at Neuroglia.Data.Infrastructure.ResourceOriented.Services.ResourceRepository.PatchSubResourceAsync(Patch patch, String group, String version, String plural, String name, String subResource, String namespace, String resourceVersion, Boolean dryRun, CancellationToken cancellationToken) in /home/runner/work/framework/framework/src/Neuroglia.Data.Infrastructure.ResourceOriented/Services/ResourceRepository.cs:line 318
at Neuroglia.Data.Infrastructure.ResourceOriented.IResourceRepositoryExtensions.PatchStatusAsync[TResource](IResourceRepository repository, Patch patch, String name, String namespace, String resourceVersion, Boolean dryRun, CancellationToken cancellationToken) in /home/runner/work/framework/framework/src/Neuroglia.Data.Infrastructure.ResourceOriented.Abstractions/Extensions/IResourceRepositoryExtensions.cs:line 265
at Synapse.Operator.Services.WorkflowInstanceHandler.UpdateWorkflowInstanceStatusAsync(Action`1 statusUpdate, CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceHandler.cs:line 232
at Synapse.Operator.Services.WorkflowInstanceHandler.StartProcessAsync(CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceHandler.cs:line 138
at Synapse.Operator.Services.WorkflowInstanceHandler.HandleAsync(CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceHandler.cs:line 123
at Synapse.Operator.Services.WorkflowInstanceController.OnResourceCreatedAsync(WorkflowInstance workflowInstance, CancellationToken cancellationToken) in /src/src/operator/Synapse.Operator/Services/WorkflowInstanceController.cs:line 223
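For readers unfamiliar with the 409 errors in the logs above, here is a minimal Python sketch of how an optimistic concurrency check can reject a status patch that carries a stale version, and how re-reading the resource before retrying avoids the conflict. Everything in it (`InMemoryStore`, `patch_status`, `patch_status_with_retry`, the version counter) is hypothetical and only illustrates the general pattern; it is not the Synapse or Neuroglia implementation, and it does not claim to be the cause of or fix for this bug.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Stale target version; plays the role of the 409 Conflict in the logs."""


@dataclass
class Resource:
    version: str
    status: dict = field(default_factory=dict)


class InMemoryStore:
    """Hypothetical stand-in for a versioned resource database."""

    def __init__(self):
        self._items = {}
        self._counter = 0

    def _next_version(self):
        # Assumption: versions are opaque tokens that change on every write.
        self._counter += 1
        return format(self._counter, "08X")

    def create(self, name):
        self._items[name] = Resource(version=self._next_version())
        return self.get(name)

    def get(self, name):
        stored = self._items[name]
        # Return a copy so callers hold a snapshot, not the live record.
        return Resource(version=stored.version, status=dict(stored.status))

    def patch_status(self, name, changes, target_version):
        stored = self._items[name]
        # The write is accepted only if the caller saw the latest version.
        if stored.version != target_version:
            raise ConflictError(
                f"target version '{target_version}' differs from "
                f"the actual version '{stored.version}'"
            )
        stored.status.update(changes)
        stored.version = self._next_version()
        return self.get(name)


def patch_status_with_retry(store, name, changes, max_attempts=3):
    """Re-read the resource before each attempt so the version is fresh."""
    for attempt in range(1, max_attempts + 1):
        snapshot = store.get(name)
        try:
            return store.patch_status(name, changes, snapshot.version)
        except ConflictError:
            if attempt == max_attempts:
                raise


if __name__ == "__main__":
    store = InMemoryStore()
    stale = store.create("instance-1")
    # A second writer updates the status first, bumping the version.
    store.patch_status("instance-1", {"phase": "pending"}, stale.version)
    try:
        # Reusing the old snapshot's version now fails, as in the logs.
        store.patch_status("instance-1", {"phase": "running"}, stale.version)
    except ConflictError as error:
        print(f"conflict: {error}")
    updated = patch_status_with_retry(store, "instance-1", {"phase": "running"})
    print(updated.status)
```

In this sketch the conflict arises because the caller patches with a version captured before another write happened. Whether something similar occurs between the operator and another writer here is only a guess.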
I expected this:
No response
Is there a workaround?
No response
Anything else?
No response
Platform(s)
No response
Community Notes
- Please vote by adding a 👍 reaction to the issue to help us prioritize.
- If you are interested in working on this issue, please leave a comment.