Skip to content

Possible OOM When HttpUrlConnection is loaded before agent start #16129

@pepeshore

Description

@pepeshore

Describe the bug

In this scenario, I first extended OpenTelemetry. Before the OT agent starts, I send an HTTP request via HttpURLConnection to fetch remote configuration, so I can customize some agent behaviors. This causes the HttpURLConnection class to be loaded before the agent starts. The code below is a simplified reproduction of an OOM issue. It is a basic Tomcat application, and the key part is that I implemented a custom Servlet. In this Servlet, I use HttpURLConnection to send a request to a non-existent domain name. This throws an exception, which I catch and wrap into another exception and then rethrow. While wrapping it into the new exception, I store the current HttpURLConnection instance inside the exception class.

@WebServlet("/http-connection-test")
public class HttpConnectionServlet extends HttpServlet {

    private static final String NON_EXISTENT_DOMAIN = "http://this-domain-does-not-exist-12345.invalid";

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        sendRequestToNonExistentDomain();
        out.write("success");
    }

    private void sendRequestToNonExistentDomain() throws HttpConnectionException {
        HttpURLConnection connection = null;
        
        try {
            URL url = new URL(NON_EXISTENT_DOMAIN);
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.setConnectTimeout(5000);
            connection.setReadTimeout(5000);
            
            // Try to connect, this will throw an exception because the domain does not exist
            connection.connect();
            
            // Try to get the response code (will not actually reach here)
            int responseCode = connection.getResponseCode();
            System.out.println("Response Code: " + responseCode);
            
        } catch (IOException e) {
            // Catch the original exception and wrap it into HttpConnectionException
            // Save the HttpURLConnection instance as a field of the exception
            throw new HttpConnectionException(
                    "connect failed: " + NON_EXISTENT_DOMAIN,
                    e,
                    connection
            );
        }
    }
}

/**
 * Custom exception for wrapping HTTP connection related exceptions.
 * Contains an HttpURLConnection instance as a field.
 */
class HttpConnectionException extends RuntimeException {

  private final HttpURLConnection connection;

  public HttpConnectionException(String message, Throwable cause, HttpURLConnection connection) {
    super(message, cause);
    this.connection = connection;
  }

  public HttpConnectionException(String message, HttpURLConnection connection) {
    super(message);
    this.connection = connection;
  }

  /**
   * Get the associated HttpURLConnection instance.
   */
  public HttpURLConnection getConnection() {
    return connection;
  }
}

In this scenario, the following circular dependency is formed:

  1. HttpURLConnection -> io.opentelemetry.javaagent.instrumentation.httpurlconnection.HttpUrlState
  2. io.opentelemetry.javaagent.instrumentation.httpurlconnection.HttpUrlState -> io.opentelemetry.context.Context
  3. io.opentelemetry.context.Context -> io.opentelemetry.javaagent.bootstrap.servlet.AppServerBridge
  4. io.opentelemetry.javaagent.bootstrap.servlet.AppServerBridge -> HttpConnectionException
  5. HttpConnectionException -> HttpURLConnection

As a result, the keys in the underlying WeakHashMap of VirtualField<HttpURLConnection, HttpUrlState> always have a strong reference and cannot be released.

A possible fixed way:
Change the reference between io.opentelemetry.javaagent.instrumentation.httpurlconnection.HttpUrlState and io.opentelemetry.context.Context to a weak reference

// everything is public since called directly from advice code
// (which is inlined into other packages)
public class HttpUrlState {

  private final WeakReference<Context> contextWeakReference;
  private boolean finished;
  // by default 0 is ignored
  private int statusCode = 0;

  public HttpUrlState(Context context) {
    this.contextWeakReference = new WeakReference<>(context);
  }

  public Context getContext() {
    return contextWeakReference.get();
  }
}

Steps to reproduce

Use the official OT agent can't reproduce this bug

Expected behavior

..

Actual behavior

..

Javaagent or library instrumentation version

NA

Environment

JDK:
OS:

Additional context

No response

Tip

React with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding +1 or me too, to help us triage it. Learn more here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingneeds triageNew issue that requires triage

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions