OtlpHttpClient::Export() fails in ASYNC build #1955

@marcalff

Problem

Start a local OpenTelemetry-Collector process on port 4318.

Run the example_otlp_http client.

In the sync build (WITH_ASYNC_EXPORT_PREVIEW=OFF), the client prints no error messages, and the collector gets all the trace data.

The SYNC build works as expected.

Repeat in the async build (WITH_ASYNC_EXPORT_PREVIEW=ON).

The client prints errors:

$ ./example_otlp_http
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.No error
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_exporter.cc:93 [OTLP HTTP Client] ERROR: Export 1 trace span(s) error: 1
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.No error
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_exporter.cc:93 [OTLP HTTP Client] ERROR: Export 1 trace span(s) error: 1
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.No error
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_exporter.cc:93 [OTLP HTTP Client] ERROR: Export 1 trace span(s) error: 1
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.No error
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_exporter.cc:93 [OTLP HTTP Client] ERROR: Export 1 trace span(s) error: 1
[Error] File: /home/malff/CODE/MY_GITHUB/opentelemetry-cpp/exporters/otlp/src/otlp_http_client.cc:303 [OTLP HTTP Client] Session state: (manually) cancelled.

The collector gets incomplete data.

Analysis

In both cases, sync and async, the code reaches the async export method:

sdk::common::ExportResult OtlpHttpClient::Export(
    const google::protobuf::Message &message,
    std::function<bool(opentelemetry::sdk::common::ExportResult)> &&result_callback,
    std::size_t max_running_requests) noexcept
{
  ...

  // Wait for the response to be received
  if (options_.console_debug)
  {
    OTEL_INTERNAL_LOG_DEBUG(
        "[OTLP HTTP Client] Waiting for response from "
        << options_.url << " (timeout = "
        << std::chrono::duration_cast<std::chrono::milliseconds>(options_.timeout).count()
        << " milliseconds)");
  }

  // Wait for any session to finish if there are too many sessions
  std::unique_lock<std::mutex> lock(session_waker_lock_);
  bool wait_successful =
      session_waker_.wait_for(lock, options_.timeout, [this, max_running_requests] {
        std::lock_guard<std::recursive_mutex> guard{session_manager_lock_};
        return running_sessions_.size() <= max_running_requests;
      });

  cleanupGCSessions();

  ...

In the sync case:

  • max_running_requests is 0,
  • running_sessions_.size() is 1,
  • the code blocks in session_waker_.wait_for(),
  • the HTTP server replies,
  • the response is received,
  • running_sessions_.size() becomes 0,
  • session_waker_ is notified,
  • wait_for() returns.

In the async case:

  • max_running_requests is 64,
  • running_sessions_.size() is 1,
  • the code does not block in session_waker_.wait_for(),
  • cleanupGCSessions() is called,
  • the running session is aborted.

All of this happens before the HTTP server responds.

Export fails, printing Session state: (manually) cancelled.

The wait_for() predicate:

running_sessions_.size() <= max_running_requests

does not work for async builds.
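
To illustrate the difference outside of opentelemetry-cpp, here is a minimal standalone sketch of the same wait pattern. FakeClient and SessionStillRunningAfterWait() are hypothetical names made up for this illustration, not part of the real OtlpHttpClient, and the simulated "server" thread only stands in for the HTTP response. It uses plain std::mutex where the real code uses a recursive mutex.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the client state; NOT the real OtlpHttpClient.
struct FakeClient
{
  std::mutex waker_lock;            // plays the role of session_waker_lock_
  std::condition_variable waker;    // plays the role of session_waker_
  std::mutex sessions_lock;         // plays the role of session_manager_lock_
  std::size_t running_sessions = 0; // plays the role of running_sessions_.size()
};

// Returns true if a session was still running when wait_for() returned.
bool SessionStillRunningAfterWait(std::size_t max_running_requests)
{
  FakeClient client;
  client.running_sessions = 1;  // one request in flight

  // Take the waker lock first, so the simulated response can only be
  // processed while this thread is actually blocked inside wait_for().
  std::unique_lock<std::mutex> lock(client.waker_lock);

  // Simulated HTTP response: finishes the session and notifies the waker.
  std::thread server([&client] {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    {
      std::lock_guard<std::mutex> waker_guard(client.waker_lock);
      std::lock_guard<std::mutex> guard(client.sessions_lock);
      client.running_sessions = 0;
    }
    client.waker.notify_all();
  });

  // Same predicate shape as in the excerpt above.
  client.waker.wait_for(lock, std::chrono::seconds(5), [&client, max_running_requests] {
    std::lock_guard<std::mutex> guard(client.sessions_lock);
    return client.running_sessions <= max_running_requests;
  });

  bool still_running;
  {
    std::lock_guard<std::mutex> guard(client.sessions_lock);
    still_running = client.running_sessions > 0;
  }

  lock.unlock();  // let the simulated response proceed before joining
  server.join();
  return still_running;
}
```

In this sketch, SessionStillRunningAfterWait(0) returns false (the wait blocks until the response arrives), while SessionStillRunningAfterWait(64) returns true: the predicate is already satisfied, so the wait returns with the request still in flight. That is the point at which the real code goes on to call cleanupGCSessions().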

Setting OtlpHttpExporterOptions::max_concurrent_requests = 0 as a workaround makes OtlpHttpClient::Export() work.

I am not sure what the proper fix is; max_running_requests appears broken.
