brave icon indicating copy to clipboard operation
brave copied to clipboard

Brave JMS instrumentation does not work properly when ActiveMQ BytesMessage is received

Open raduarsenii opened this issue 4 years ago • 6 comments

Description

When org.apache.activemq.command.ActiveMQBytesMessage JMS message is received, and sleuth is enabled, the traceId and spanId are working as expected, but the content (or payload) of the message is lost. This occurs when using spring boot 2.5.7 and spring cloud 2020.0.4 (brave 5.13.2, and ActiveMQ Client 5.16.3)! I came to the conclusion that ActiveMQBytesMessage does not behave correctly after brave instrumentation JMS edits the message to remove some headers. However, I'm not sure if this is an issue on the ActiveMQ side, or Brave instrumentation does not edit the BytesMessage correctly.

Steps to Reproduce

Before calling brave.jms.PropertyFilter#filterProperties(javax.jms.Message, java.util.Set<java.lang.String>, java.util.List<java.lang.Object>) method we can see the following in the BytesMessage: image

After calling brave.jms.PropertyFilter#filterProperties(javax.jms.Message, java.util.Set<java.lang.String>, java.util.List<java.lang.Object>) method we can see the following in the BytesMessage: image To be more precicelly, after calling org.apache.activemq.command.ActiveMQBytesMessage#setObjectProperty the content is moved in dataOut/bytesOut.

Next, if you try to read something from BytesMessage, using org.apache.activemq.command.ActiveMQBytesMessage#readBytes method for example, then you will discover that message has no content, and this is unexpected: image

Expected Behaviour

After setting the properties back, via org.apache.activemq.command.ActiveMQBytesMessage#setObjectProperty, the BytesMessage content should be restored, before reaching the MessageListener or MessageConverter.

For example, after calling brave.jms.PropertyFilter#filterProperties(javax.jms.Message, java.util.Set<java.lang.String>, java.util.List<java.lang.Object>) method, you can reset the BytesMessage using reset() method: image Note that content was restored, and dataIn, dataOut, and bytesOut were cleared.

After that you will be able to read the content: image

Things to consider

Have a look at the org.apache.activemq.command.ActiveMQBytesMessage#setObjectProperty method: image The initializeWritingNoCheck() method will simply move the content to dataOut/bytesOut. As a result, content will be null and dataOut/bytesOut will contain the bytes moved from content. Note dataIn remains null.

Have a look at org.apache.activemq.command.ActiveMQBytesMessage#read* methods, for example: image They call initializeReading() method, that does the following: image I.e. when content is missing, the dataIn is initialized with an empty DataInputStream. In my opinion here is the issue ^, meaning this is a bug in ActiveMQ client. But I'm not sure about this, that's why I created this bug here.

Note: I do not know who should do the fix, that's why I have created this bug, and additionally I have created a bug on the ActiveMQ client side, in hope to obtain a resolution ASAP... check https://issues.apache.org/jira/browse/AMQ-8417

raduarsenii avatar Nov 25 '21 03:11 raduarsenii

Not sure if this is related, I found this on ActiveMQ side: https://issues.apache.org/jira/browse/AMQ-4887 That ticket was closed long ago, with fixed version 5.9.1, 5.10.0... however, in 5.16.3 this is reproducible.

raduarsenii avatar Nov 25 '21 03:11 raduarsenii

Here is another issue, that may be related (it tries to fix another issue, but is related somehow with this issue) https://github.com/openzipkin/brave/pull/1042

raduarsenii avatar Nov 25 '21 11:11 raduarsenii

The ActiveMQ team claim that this is not an issue on their side: https://issues.apache.org/jira/browse/AMQ-8417 They said that reset() method should be called after properties were edited.

In my opinion, this is not right... for example, when using Artemis ActiveMQ, the reset() method is not required, and I'm wondering why the same is not valid for ActiveMQ 5 as well?!

Is there a way to fix this issue on brave instrumentation jms library? A simple fix would be:

public final class JmsTracing {
    **private Set<String> messageTypesToReset = new HashSet<>();**
...
// somewhere in the constructor, can be externally configured
    **messageTypesToReset = new HashSet<>();
    messageTypesToReset.add("org.apache.activemq.command.ActiveMQBytesMessage");**
...
  <R> TraceContextOrSamplingFlags extractAndClearTraceIdProperties(
    Extractor<R> extractor, R request, Message message
  ) {
    TraceContextOrSamplingFlags extracted = extractor.extract(request);
    // Clear propagation regardless of extraction as JMS requires clearing as a means to make the
    // message writable
    PropertyFilter.filterProperties(message, traceIdProperties);
    **if (messageTypesToReset.contains(message.getClass().getName()) {
        if (message instanceof BytesMessage) {
            ((BytesMessage) message).reset();
        } else if (message instanceof StreamMessage) {
            ((StreamMessage) message).reset();
        }
    }**
    return extracted;
  }
...
}

Note that for org.apache.activemq.command.ActiveMQStreamMessage the reset() is not required

raduarsenii avatar Nov 30 '21 06:11 raduarsenii

I also got bitten by this error after upgrading from activemq 5.15.15 to 5.16.4, And brave-instrumentation-jms from 512.7 to 5.13.7 (via spring-cloud-dependencies). Our workaround is to call message.reset() in our own message receiver.

urld avatar May 04 '22 12:05 urld

Got bitten again... This time in a component that uses camel to receive jms messages. But in this case camel maps ByteMessage to byte[] and sends it to a bean. Makes it harder to apply the workaround.

urld avatar Jul 19 '22 15:07 urld

It just happened to us. We had some @JmsListener methods that were working fine and simply adding spring-cloud-starter-sleuth and some brave dependencies, the listeners started failing due to JAXB throwing End Of File exceptions. We suspected it was because the tracing was not resetting the XML content after doing something with it, and this post seems to confirm it.

Removing the dependencies made all work again, without any other change.

In the end, we avoided the problem altogether with spring.sleuth.messaging.jms.enabled=false as the tracing, in our case, was included for HttpClient calls, not JMS.

Edit: Note that we cannot call the reset() method as the problems happens when Spring tries to unmarshall the message to create the bean/DTO to pass to our code, so the problem occurs before our code is called.

Verdoso avatar Nov 22 '22 12:11 Verdoso