Can't create CRaC checkpoint, fails with the error message "CheckpointOpenSocketException: ... kafka/172.18.0.3:9092" #3062

@magnus-larsson

Spring Cloud Stream appears to prevent CRaC checkpoints from being created.

The issue
CRaC checkpoints fail with errors indicating that Spring Cloud Stream does not close its connections to the message broker (e.g. Kafka) before the checkpoint.

How to reproduce

  1. Get the source code:

    git clone https://github.com/magnus-larsson/ml-spring-cloud-stream-samples.git
    cd ml-spring-cloud-stream-samples/kafka-binder-native-app
    git checkout crac-issue
    
  2. Start Kafka:

    cd ../kafka-batch-sample
    docker compose up -d
    cd -
    
  3. Build and start the Kafka consumer application in a Docker container:

    ./mvnw package
    
    docker run -it --rm --network kafka-batch-sample_default -v ${PWD}:/demo --privileged --name demo azul/zulu-openjdk:21-jdk-crac bash
    
    # Commands to be executed in the container
    cd demo
    java -XX:CRaCCheckpointTo=checkpoint -jar target/kafka-binder-native-app-0.0.1-SNAPSHOT.jar
    
  4. Verify a successful connection to Kafka by looking for log output similar to:

    2025-01-03T10:35:42.993Z  INFO 9 --- [pool-2-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-anonymous.8a5b54bb-95a4-48ff-bf40-ed6647338d8d-3, groupId=anonymous.8a5b54bb-95a4-48ff-bf40-ed6647338d8d] Discovered group coordinator kafka:9092 (id: 2147482646 rack: null)
    
  5. In a second terminal, try taking a checkpoint of the application in the Docker container:

    docker exec demo jcmd target/kafka-binder-native-app-0.0.1-SNAPSHOT.jar JDK.checkpoint
    

    The checkpoint is aborted with error messages like the following:

    An exception during a checkpoint operation:
    jdk.internal.crac.mirror.CheckpointException
           Suppressed: java.nio.channels.IllegalSelectorException
                    at java.base/sun.nio.ch.EPollSelectorImpl.beforeCheckpoint(EPollSelectorImpl.java:401)
                    ...
           Suppressed: jdk.internal.crac.mirror.impl.CheckpointOpenSocketException: java.nio.channels.SocketChannel[connected local=/172.18.0.4:34800 remote=kafka/172.18.0.3:9092]
                    at java.base/jdk.internal.crac.JDKSocketResourceBase.lambda$beforeCheckpoint$0(JDKSocketResourceBase.java:68)
                    ...
    

    The `CheckpointOpenSocketException` indicates that the checkpoint was aborted because the socket connected to `kafka/172.18.0.3:9092` was still open when the checkpoint was attempted.
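
To see which sockets the JVM still holds before attempting the checkpoint, its file descriptors can be inspected. The following is a minimal sketch and not part of the sample project: `list_open_sockets` is a hypothetical helper name, and it assumes a Linux container in which `/proc/<pid>/fd` is readable by the current user.

```shell
# List the socket file descriptors held by a process, using Linux /proc.
# Assumptions: Linux, and permission to read /proc/<pid>/fd.
# list_open_sockets is a hypothetical helper, not a CRaC or Spring tool.
list_open_sockets() {
  pid="$1"
  for fd in /proc/"$pid"/fd/*; do
    # Each fd entry is a symlink; sockets resolve to "socket:[inode]".
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
      socket:*) printf 'fd %s -> %s\n' "${fd##*/}" "$target" ;;
    esac
  done
  return 0
}
```

Inside the `demo` container, pass the PID of the Java process; Spring Boot's default log format prints it right after the log level (`9` in the log line from step 4). Each `socket:[...]` entry is a descriptor the process still holds, although not every one is necessarily a Kafka connection.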

Version of the framework
Java 21
Spring Boot 3.4.1
Spring Cloud Stream 4.2.0

Expected behavior
The checkpoint command completes successfully.

Additional context
Using Spring Kafka directly, without Spring Cloud Stream, works fine with CRaC.
