Skip to content

script: LoadBlocker's drop impl shouldn't run after termination.#36508

Merged
mukilan merged 1 commit intoservo:mainfrom
mukilan:fix-loadblocker-terminate
Apr 21, 2025
Merged

script: LoadBlocker's drop impl shouldn't run after termination.#36508
mukilan merged 1 commit intoservo:mainfrom
mukilan:fix-loadblocker-terminate

Conversation

@mukilan
Copy link
Copy Markdown
Member

@mukilan mukilan commented Apr 14, 2025

The logic in LoadBlocker::terminate was modified in #34122 to clone
the LoadBlocker's inner load member instead of takeing it. However,
this member serves as a flag so that LoadBlocker's Drop impl can avoid
calling doc.finish_load on already terminated loads. The change results in
unnecessary 'unknown completed load' warnings when Servo is run with
logging enabled.

Signed-off-by: Mukilan Thiyagarajan [email protected]

@mukilan mukilan requested a review from gterzian as a code owner April 14, 2025 04:07
@mukilan mukilan requested a review from jdm April 14, 2025 04:10
Copy link
Copy Markdown
Member

@jdm jdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find!

@mukilan mukilan enabled auto-merge April 14, 2025 04:23
@mukilan mukilan added this pull request to the merge queue Apr 14, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 14, 2025
@mukilan mukilan added this pull request to the merge queue Apr 14, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 14, 2025
@mukilan mukilan added this pull request to the merge queue Apr 14, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 14, 2025
@jdm
Copy link
Copy Markdown
Member

jdm commented Apr 14, 2025

Those were all tests that changed results in the original PR that you referenced.

@mukilan
Copy link
Copy Markdown
Member Author

mukilan commented Apr 15, 2025

Those were all tests that changed results in the original PR that you referenced.

Ah, thanks for pointing that out! I looked into one of the tests - /dom/ranges/Range-cloneContents.html. It fails because when the iframe's body.onload here runs, the function run seems to be not defined yet.

The test passes if we delete the onbody=run() from Range-test-iframe.html. I think this is because the parent html already explicity invokes the run() function. The test also passes, if the body onload=run() is replaced with window.onload = run after the definition of function run().

I am not sure how this relates to the load blocker. I guess for now I can mark them as ERROR again and investigate these separately.

@jdm
Copy link
Copy Markdown
Member

jdm commented Apr 16, 2025

Given that the load blocker is what controls when the onload event is dispatched, that definitely sounds suspicious! Please file a followup issue about it; there's no reason that the onload event should run before the inline script has a chance to define the run function.

@mukilan
Copy link
Copy Markdown
Member Author

mukilan commented Apr 16, 2025

Given that the load blocker is what controls when the onload event is dispatched, that definitely sounds suspicious!

Agreed. I meant to say I couldn't find any obvious connection between this load blocker change and the onload event being triggered early in this specific case.

I'll file the new issue.

@mukilan mukilan force-pushed the fix-loadblocker-terminate branch from d5caade to eac7211 Compare April 16, 2025 06:45
@mukilan mukilan added this pull request to the merge queue Apr 16, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 16, 2025
@mukilan
Copy link
Copy Markdown
Member Author

mukilan commented Apr 16, 2025

The TIMEOUT in /mimesniff/media/media-sniff.window.html seems unrelated to this change. The test times out locally even without this patch

web-platform-test
~~~~~~~~~~~~~~~~~
Ran 1290 checks (1260 subtests, 30 tests)
Expected results: 914
Unexpected results: 376
  test: 30 (30 timeout)
  subtest: 346 (2 fail, 344 timeout)
  
Unexpected Results
------------------
/mimesniff/media/media-sniff.window.html
  TIMEOUT mp3-with-id3.mp3 loads when served with Content-Type bogus/mime - Test timed out
  TIMEOUT mp3-with-id3.mp3 loads when served with Content-Type application/octet-stream - Test timed out
  TIMEOUT mp3-with-id3.mp3 loads when served with Content-Type text/html - Test timed out
  TIMEOUT mp3-with-id3.mp3 loads when served with Content-Type audio/ogg; codec=vorbis - Test timed out
  TIMEOUT mp3-with-id3.mp3 loads when served with Content-Type application/pdf - Test timed out
  TIMEOUT wav.wav loads when served with Content-Type  - Test timed out
  TIMEOUT wav.wav loads when served with Content-Type bogus/mime - Test timed out
  TIMEOUT wav.wav loads when served with Content-Type application/octet-stream - Test timed out
  TIMEOUT wav.wav loads when served with Content-Type text/html - Test timed out
  TIMEOUT wav.wav loads when served with Content-Type audio/ogg; codec=vorbis - Test timed out
  TIMEOUT wav.wav loads when served with Content-Type application/pdf - Test timed out

However, the CI test run has ALSA related errors that I don't see locally:

  │ Cannot connect to server socket err = No such file or directory
  │ Cannot connect to server request channel
  │ jack server is not running or cannot be started
  │ JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
  │ JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
  │ ALSA lib confmisc.c:855:(parse_card) cannot find card '0'

@jdm
Copy link
Copy Markdown
Member

jdm commented Apr 16, 2025

This change does appear to make it timeout consistently on CI though.

@mukilan
Copy link
Copy Markdown
Member Author

mukilan commented Apr 18, 2025

Does this test PASS for you locally? I've tried Ubuntu 24.04 and it consistently times out locally without the change (with --repeat 30 and --timeout-multiplier 2). Is there any setup needed for the media stack to work properly, apart from the GST_PLUGIN_FEATURE_RANK environment variable that we set in CI?

@mukilan
Copy link
Copy Markdown
Member Author

mukilan commented Apr 19, 2025

I've been debugging this on the CI using my fork and some log statements

The logs for a run with the current implementation of LoadBlocker::terminate - https://github.com/mukilan/servo/actions/runs/14545909416/job/40811811386

Based on those logs, I think I have a theory for why the test used to pass on CI without the change, but not with the change.

Firstly, there is a difference between CI and local (at my Ubuntu 24.04 VM) media stack behaviour - for the wav.wav subtests, the CI's media stack delivers the PlayerEvent::MetadataUpdated event, allowing the loadedmetadata event to be triggered which is sufficient for the subtest to be marked as passing. The CI's stack does not deliver the PlayerEvent::StateChanged event for wav.wav but it does for other types of audio. This StateChanged event is the one that finally marks the load of the resource as not blocking the 'load' event of the document. On my local machine, even the PlayerEvent::MetadataUpdated is not delivered for the wav.wav media loads, so those subtests were marked as TIMEOUT.

Secondly, the bug introduced in #34122 meant that document.finish_load(load) was called twice (from terminate and the drop triggered at the final statement of terminate) for each successfully loaded (i.e had received the PlayerEvent::StateChanged) media resource. However, load is just a LoadType::Media and can't distinguish between two different URLs being loaded. So, for each successful loaded media in media-sniff.window.html, we were popping two blocking loads from self.blocking_loads thereby draining blocking_loads prematurely, even when the media represented by the blocking loads were not fully loaded, allowing load event to be triggered on the document and the test runner to mark the test as completed.

@mukilan
Copy link
Copy Markdown
Member Author

mukilan commented Apr 21, 2025

I added some more logs for the CI run https://github.com/mukilan/servo/actions/runs/14567332667/job/40859369808

The theory above is essentially correct, with one minor clarification. The PlayerEvent::StateChanged does get emitted for wav.wav on the CI (with Paused and Buffer states), but only before the PlayerEvent::MetadataUpdated event is delivered. The if condition here

if self.ready_state.get() == ReadyState::HaveMetadata {

means we need a PlayerEvent::StateChanged event to also be delivered after a MetadataUpdated event so that the media is in the correct state, but this doesn't happen for .wav media.

It is not clear to me if this is a bug in the media stack or the assumption in the code about the order of events from the media stack. The spec states

Once the readyState attribute reaches HAVE_CURRENT_DATA, after the loadeddata event has been fired, set the element's delaying-the-load-event flag to false. This stops delaying the load event.

So it looks like the code is correct in not clearing the delay-the-load-event-flag when MetadataUpdated is delivered.
I think it makes sense to mark this test as a TIMEOUT and create an issue to investigate the media code. @jdm wdyt?

@jdm
Copy link
Copy Markdown
Member

jdm commented Apr 21, 2025

So it looks like the code is correct in not clearing the delay-the-load-event-flag when MetadataUpdated is delivered.
I think it makes sense to mark this test as a TIMEOUT and create an issue to investigate the media code. @jdm wdyt?

That sounds totally reasonable! Thanks for the detailed investigation.

The logic in LoadBlocker::terminate was modified in servo#34122 to `clone`
the LoadBlocker's inner `load` member instead of `take`ing it. However,
this member serves as a flag so that `LoadBlocker`'s Drop impl can avoid
calling `doc.finish_load` on already terminated loads. The change in
unnecessary 'unknown completed load' warnings when Servo is run with
logging enabled.

This patch also updates the expectations for tests which started passing
after the change in servo#34122, but now fail again with this change. These
test failures will be investigated in servo#36561.

Signed-off-by: Mukilan Thiyagarajan <[email protected]>
@mukilan mukilan force-pushed the fix-loadblocker-terminate branch from eac7211 to 8180292 Compare April 21, 2025 06:45
@mukilan mukilan enabled auto-merge April 21, 2025 06:46
@mukilan mukilan added this pull request to the merge queue Apr 21, 2025
Merged via the queue into servo:main with commit 2a723a5 Apr 21, 2025
21 checks passed
@mukilan mukilan deleted the fix-loadblocker-terminate branch April 21, 2025 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants