Flaky timeout: TaskDeduplicatorTest.executeIfNeeded_executeAndCancelLoop_noErrors() #28302

@pzembrod

Description of the bug:

I'm seeing flaky timeouts of //src/test/java/com/google/devtools/build/lib/concurrent:ConcurrentTests.

More specifically, it is TaskDeduplicatorTest.executeIfNeeded_executeAndCancelLoop_noErrors() that sometimes hangs until the test times out after 5 minutes.

Starting full thread dump ...

"main" Id=3 TIMED_WAITING on java.util.concurrent.CountDownLatch$Sync@3577acf2
	at [email protected]/jdk.internal.misc.Unsafe.park(Native Method)
	-  waiting on java.util.concurrent.CountDownLatch$Sync@3577acf2
	at [email protected]/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:271)
	at [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:792)
	at [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1166)
	at [email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:276)
	at [email protected]/java.util.concurrent.ThreadPerTaskExecutor.awaitTermination(ThreadPerTaskExecutor.java:159)
	at [email protected]/java.util.concurrent.ThreadPerTaskExecutor.awaitTermination(ThreadPerTaskExecutor.java:173)
	at [email protected]/java.util.concurrent.ThreadPerTaskExecutor.close(ThreadPerTaskExecutor.java:189)
	at app//com.google.devtools.build.lib.concurrent.TaskDeduplicatorTest.executeIfNeeded_executeAndCancelLoop_noErrors(TaskDeduplicatorTest.java:267)
	at [email protected]/java.lang.invoke.LambdaForm$DMH/0x0000000017541000.invokeVirtual(LambdaForm$DMH)
	at [email protected]/java.lang.invoke.LambdaForm$MH/0x0000000017541800.invoke(LambdaForm$MH)
	at [email protected]/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder)
	at [email protected]/jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(DirectMethodHandleAccessor.java:154)
	at [email protected]/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at [email protected]/java.lang.reflect.Method.invoke(Method.java:565)
	at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)

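The main thread is parked in ThreadPerTaskExecutor.awaitTermination(), reached via close() from TaskDeduplicatorTest.java:267, so it seems that testExecutorService.close() is hanging while it waits for the executor's tasks to terminate.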
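
For context, ExecutorService.close() (available since Java 19) waits without a deadline for submitted tasks to finish, so a single task that never completes is enough to block the caller until the test times out. The sketch below is not the Bazel test code. It is a minimal illustration, under stated assumptions, of swapping the unbounded close() for shutdown() plus a bounded awaitTermination(), which turns a hang into a fast, observable failure. The class name BoundedShutdown, the helper shutdownWithin, and the stuck task are hypothetical, and a cached thread pool stands in for the thread-per-task executor seen in the dump.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BoundedShutdown {

  /** Returns true if every task terminated within the timeout, false otherwise. */
  static boolean shutdownWithin(ExecutorService executor, long timeout, TimeUnit unit) {
    executor.shutdown(); // reject new tasks; already-submitted tasks keep running
    try {
      if (executor.awaitTermination(timeout, unit)) {
        return true;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
    executor.shutdownNow(); // interrupt whatever is still running
    return false;
  }

  /** Hypothetical stand-in for the hang: a task that waits on a latch nobody releases. */
  static boolean stuckTaskTerminatesInTime() {
    ExecutorService executor = Executors.newCachedThreadPool();
    CountDownLatch neverReleased = new CountDownLatch(1);
    executor.execute(() -> {
      try {
        neverReleased.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // exit once shutdownNow() interrupts us
      }
    });
    return shutdownWithin(executor, 200, TimeUnit.MILLISECONDS);
  }

  /** Control case: a task that returns immediately. */
  static boolean quickTaskTerminatesInTime() {
    ExecutorService executor = Executors.newCachedThreadPool();
    executor.execute(() -> {});
    return shutdownWithin(executor, 5, TimeUnit.SECONDS);
  }

  public static void main(String[] args) {
    System.out.println("stuck task terminated in time: " + stuckTaskTerminatesInTime());
    System.out.println("quick task terminated in time: " + quickTaskTerminatesInTime());
  }
}
```

In a test, a false return from a helper like this could be followed by a thread dump or an assertion failure, which might make it easier to see which task is still running than waiting for the 5-minute test timeout.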

Which category does this issue belong to?

Remote Execution

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

My repro:

bazelisk test //src/test/java/com/google/devtools/build/lib/concurrent:ConcurrentTests --runs_per_test=100 --config=remote

I got 1/100 flakes in 2 different invocations.

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

release 8.5.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?


If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

Labels

P2: We'll consider working on this in future. (Assignee optional)
team-Remote-Exec: Issues and PRs for the Execution (Remote) team
type: bug
