@jsturtevant
Contributor

@jsturtevant jsturtevant commented Jul 19, 2022

Signed-off-by: James Sturtevant [email protected]

fixes: #7184

Using the same steps as in #7184:

./crictl stats  -o json
{
  "stats": [
    {
      "attributes": {
        "id": "bd4495d239799d7cad9080288b919f7386a2d75ef78891a40bee6fd8b8bec836",
        "metadata": {
          "name": "pause",
          "attempt": 0
        },
        "labels": {
        },
        "annotations": {
        }
      },
      "cpu": {
        "timestamp": "1658273680420262900",
        "usageCoreNanoSeconds": {
          "value": "6718750000"
        },
        "usageNanoCores": {
          "value": "91776"
        }
      },
      ....

@k8s-ci-robot

Hi @jsturtevant. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jsturtevant
Contributor Author

Looks like the Windows jobs failed when pulling the images:

 unexpected status from HEAD request to https://mcr.microsoft.com/v2/windows/nanoserver/manifests/ltsc2022: 503 Service 

The Linux job looks like a flake:

E0720 00:20:32.045046   15968 portforward.go:406] an error occurred forwarding 12002 -> 12003: error forwarding port 12003 to pod 28e3cd716afb6d112dbc67c53e9a512c7237fe86ed5186922db92dd5f5cde613, uid : failed to execute portforward in network namespace "host": failed to connect to localhost:12003 inside namespace "28e3cd716afb6d112dbc67c53e9a512c7237fe86ed5186922db92dd5f5cde613", IPv4: dial tcp4 127.0.0.1:12003: connect: connection refused IPv6 dial tcp6 [::1]:12003: connect: connection refused

Is there a way to retrigger those jobs?

@jsturtevant
Contributor Author

/assign @dcantah @bobbypage

@k8s-ci-robot

@jsturtevant: GitHub didn't allow me to assign the following users: bobbypage.

Note that only containerd members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide


In response to this:

/assign @dcantah @bobbypage

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dmcgowan dmcgowan merged commit f1eced5 into containerd:main Jul 20, 2022
@endocrimes
Contributor

This PR has caused a regression: containerd now panics at line 69 of pkg/cri/server/container_stats_list.go in the Kubernetes CI jobs that target containerd main.

Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: panic: runtime error: invalid memory address or nil pointer dereference [recovered]
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         panic: runtime error: invalid memory address or nil pointer dereference
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x55594800a09a]
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: goroutine 6425 [running]:
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.func1()
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/go.opentelemetry.io/otel/sdk/trace/span.go:359 +0x2a
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End(0xc000b69980, {0x0, 0x0, 0x28?})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/go.opentelemetry.io/otel/sdk/trace/span.go:398 +0x8dd
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: panic({0x5559484d79c0, 0x5559492ede00})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /usr/local/go/src/runtime/panic.go:838 +0x207
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/containerd/containerd/pkg/cri/server.(*criService).toCRIContainerStats(0xc000228fc0?, {0xc000039140, 0x2, 0x5559484d3940?}, {0xc0008e3800, 0x8, 0x5559471856b8?})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/pkg/cri/server/container_stats_list.go:69 +0x23a
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/containerd/containerd/pkg/cri/server.(*criService).ListContainerStats(0xc0001f8d80, {0x555948737ad0, 0xc000fee360}, 0x6?)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/pkg/cri/server/container_stats_list.go:46 +0x13f
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/containerd/containerd/pkg/cri/server.(*instrumentedService).ListContainerStats(0xc0000107d0, {0x555948737ad0, 0xc000fee1b0}, 0xc000038b20)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/pkg/cri/server/instrumented_service.go:1372 +0x1c6
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: k8s.io/cri-api/pkg/apis/runtime/v1._RuntimeService_ListContainerStats_Handler.func1({0x555948737ad0, 0xc000fee1b0}, {0x555948677560?, 0xc000038b20})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go:9430 +0x78
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/containerd/containerd/services/server.unaryNamespaceInterceptor({0x555948737ad0, 0xc000fee1b0}, {0x555948677560, 0xc000038b20}, 0x3?, 0xc0010f9440)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/services/server/namespace.go:31 +0x6b
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x555948737ad0?, 0xc000fee1b0?}, {0x555948677560?, 0xc000038b20?})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1({0x555948737ad0, 0xc000fee1b0}, {0x555948677560, 0xc000038b20}, 0x0?, 0xc000b76080)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server_metrics.go:107 +0x87
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x555948737ad0?, 0xc000fee1b0?}, {0x555948677560?, 0xc000038b20?})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.UnaryServerInterceptor.func1({0x555948737ad0, 0xc000fee090}, {0x555948677560, 0xc000038b20}, 0xc000b76060, 0xc000b760a0)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc/interceptor.go:325 +0x664
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x555948737ad0?, 0xc000fee090?}, {0x555948677560?, 0xc000038b20?})
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25 +0x3a
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1({0x555948737ad0, 0xc000fee090}, {0x555948677560, 0xc000038b20}, 0xc000f53af0?, 0x5559484d5f20?)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:34 +0xbf
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: k8s.io/cri-api/pkg/apis/runtime/v1._RuntimeService_ListContainerStats_Handler({0x5559486dd8c0?, 0xc0000107d0}, {0x555948737ad0, 0xc000fee090}, 0xc0021d2060, 0xc000368240)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go:9432 +0x138
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: google.golang.org/grpc.(*Server).processUnaryRPC(0xc000320a80, {0x55594873ddc0, 0xc00063f380}, 0xc000c92fc0, 0xc000426180, 0x555949308048, 0x0)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/google.golang.org/grpc/server.go:1283 +0xcfd
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: google.golang.org/grpc.(*Server).handleStream(0xc000320a80, {0x55594873ddc0, 0xc00063f380}, 0xc000c92fc0, 0x0)
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/google.golang.org/grpc/server.go:1620 +0xa1b
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: google.golang.org/grpc.(*Server).serveStreams.func1.2()
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/google.golang.org/grpc/server.go:922 +0x98
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]: created by google.golang.org/grpc.(*Server).serveStreams.func1
Jul 22 10:16:50 tmp-node-e2e-711c8626-cos-97-16919-103-16 containerd[888]:         /go/src/github.com/containerd/containerd/vendor/google.golang.org/grpc/server.go:920 +0x28a
I'm trying to understand exactly why, so that I can follow up with a patch.

@fuweid
Member

fuweid commented Jul 22, 2022

Thanks! I think there are some containers that were created but never started. The metrics from the shim don't include stats for such a container, but line 69 just reads the CPU stats from a nil instance.
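
The failure mode described above can be sketched as a nil guard placed before any CPU field is read. This is only an illustration under stated assumptions: the types below are simplified, hypothetical stand-ins for the CRI stats messages, and `cumulativeCPU` is a made-up helper, not the patch that was eventually applied.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the CRI stats messages.
// They are NOT the real k8s.io/cri-api definitions.
type UInt64Value struct{ Value uint64 }

type CpuUsage struct {
	Timestamp            int64
	UsageCoreNanoSeconds *UInt64Value
}

type ContainerStats struct {
	Cpu *CpuUsage
}

// cumulativeCPU (hypothetical helper) returns the cumulative CPU usage and
// its timestamp. It reports ok=false instead of dereferencing a nil pointer
// when the stats, the CPU section, or the usage value is missing, as can
// happen for a container that was created but never started.
func cumulativeCPU(cs *ContainerStats) (usage uint64, timestamp int64, ok bool) {
	if cs == nil || cs.Cpu == nil || cs.Cpu.UsageCoreNanoSeconds == nil {
		return 0, 0, false
	}
	return cs.Cpu.UsageCoreNanoSeconds.Value, cs.Cpu.Timestamp, true
}

func main() {
	stats := []*ContainerStats{
		{Cpu: &CpuUsage{Timestamp: 1658273680420262900, UsageCoreNanoSeconds: &UInt64Value{Value: 6718750000}}},
		{}, // created but never started: no CPU stats reported
	}
	for i, cs := range stats {
		usage, ts, ok := cumulativeCPU(cs)
		if !ok {
			fmt.Printf("container %d: no CPU stats yet, skipping\n", i)
			continue
		}
		fmt.Printf("container %d: usage=%d ns at %d\n", i, usage, ts)
	}
}
```

With a guard of this kind, the caller can skip the derived-value calculation for containers that have no CPU stats rather than panicking.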

@endocrimes
Contributor

yeah - just testing a patch that handles that now

}

// this is a calculated value and should be computed for all OSes
nanoUsage, err := c.getUsageNanoCores(cntr.Metadata.ID, false, cs.Cpu.UsageCoreNanoSeconds.Value, time.Unix(0, cs.Cpu.Timestamp))
Member

Not sure why the code was refactored to always pass false, even for the sandbox "pause" container?

Member

OK, the code works and the fix is fine. Later we can refactor this so that the Windows and other implementations of containerMetrics() report usage nano cores as well, along with the other necessary stats.

Contributor Author

I guess I could have cleaned that up a bit more. It merged much faster than I expected :-).

I can revisit this in #7099 once we have some agreement on WindowsPodsandbox stats for kubernetes/enhancements#3439

Labels

cherry-picked/sbserver Changes are backported to sbserver needs-ok-to-test

Development

Successfully merging this pull request may close these issues.

Windows cri container stats call doesn't return usageNanoCores

9 participants