Skip to content
This repository was archived by the owner on Jan 21, 2026. It is now read-only.
This repository was archived by the owner on Jan 21, 2026. It is now read-only.

Tracer periodically forgets configuration and traces every request (badly) #994

@AaronFriel

Description

@AaronFriel

Periodically tracing stops working in a weird way that results in every message being traced, disregarding the sampling rate and the top level root span.

What I expect to happen

Tracing is configured as so:

const TRACE_HEADER_BEHAVIOR = process.env.TRACE_HEADER_BEHAVIOR;
let contextHeaderBehavior: 'default' | 'require' = 'default';
let samplingRate: number = 1/60;
if (TRACE_HEADER_BEHAVIOR === 'require') {
  contextHeaderBehavior = TRACE_HEADER_BEHAVIOR;
  samplingRate = 0;
}

export const tracer = traceAgent.start({
  contextHeaderBehavior,
  enabled: process.env.TRACING === 'enabled',
  keyFilename: process.env.GOOGLE_APPLICATION_CREDENTIALS,
  logLevel: 0,
  projectId: process.env.TRACING_PROJECT,
  samplingRate,
});

We should expect that we will see one trace per minute, and we do when it is functioning normally. A service is configured to send a heartbeat once per second to a queue service, it runs in a root span called heartbeat.

export async function runHeartbeat(
  parentLogger: Logger,
  dispatcherUri: string,
  rateHz: number,
) {
  const chance = new Chance();

  const durationMs = 1000 / rateHz;

  return pForever(
    async i =>
      tracer.runInRootSpan({ name: 'heartbeat' }, async rootSpan => {
        const logger = traceLogger(parentLogger);

        const message: Payload = {
          payload: JSON.stringify({
            hops: 0,
            i,
            origin: `heartbeat-${os.hostname}-${i}`,
            [chance.string()]: chance.string(),
          }),
        };

        const uri = `${dispatcherUri}`;

        try {
          await postHeartbeat(logger, uri, message);
        } catch (err) {
          logger.error(
            { message, err, dispatcherUri },
            'Error posting message to Dispatcher',
          );
        } finally {
          rootSpan.endSpan();
        }

        await sleep(durationMs);

        return i + 1;
      }),
    1,
  );
}

async function postHeartbeat(
  parentLogger: Logger,
  uri: string,
  message: Payload,
) {
  const logger = traceLogger(parentLogger);
  logger.trace({ message }, 'Posting heartbeat');

  const response = await got.post(uri, {
    body: JSON.stringify(message),
    headers: {
      'Content-Type': 'application/json',
    },
    timeout: 30_000,
  });

  // SAFETY: ! allowed because we are using a POST and not a st1ream
  const statusCode = response.statusCode!;
  const body = response.body;

  if (statusCode < 200 || statusCode >= 300) {
    logger.error({ body, statusCode }, 'Invalid status code');
  }

  return;
}

This is what a normal set of traces look like in Stackdriver Trace:

image

We see the root span, we see traces about once per minute (obeying the sampling rate of 1/60), and I can dig into the contents of the child spans, and there are log messages associated with the spans.

What happens when it breaks

Instead of observing the root span and all child spans across multiple services, I instead see the first HTTP call, a POST to /dispatcher/ping-pong. And instead of being traced once a minute, I see traces for every POST once a second. This is extremely strange! It's like the trace service is disregarding my root span entirely, and the sampling rate is being set to 0!

image

And do you see that readiness trace in the screenshot? That's supposed to be filtered out! It's a Kubernetes readiness probe. The server that emitted that has TRACE_HEADER_BEHAVIOR=require, so it shouldn't be possible for it to appear! Here's a search for just that path filtered over the last four hours:

image

Metadata

Metadata

Assignees

Labels

api: cloudtraceIssues related to the googleapis/cloud-trace-nodejs API.type: questionRequest for information or clarification. Not an issue.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions