Skip to content

Conversation

@ns6089
Copy link
Contributor

@ns6089 ns6089 commented Apr 22, 2023

Description

Bring back serial processing path (with optimizations) in attempt to maximize frame time consistency.
Spiritual successor to #1168

Screenshot

2023-04-22 19_08_00-Window

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Dependency update (updates to dependencies)
  • Documentation update (changes to documentation)
  • Repository update (changes to repository files, e.g. .github/...)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated the in code docstring/documentation-blocks for new or existing methods/components

Branch Updates

LizardByte requires that branches be up-to-date before merging. This means that after any PR is merged, this branch
must be updated before it can be merged. You must also
Allow edits from maintainers.

  • I want maintainers to keep my branch updated

@ns6089 ns6089 changed the title Add experimental "unpaced" and "serial" processing options Add experimental "unpaced" and "serial" processing options (Windows only) Apr 22, 2023

if (config::video.unpaced && mouse_pointer_visible) {
// Wait for vblank and flush Desktop Duplication API mouse pointer updates.
DwmFlush();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "use_dwmflush" has some logic behind it and using DwmFlush bypasses it completely.

Copy link
Contributor Author

@ns6089 ns6089 Apr 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and this was intentional.
DwmFlush() basically does WaitForVBlank() from direct3d and ensures duplication api has the most recent mouse pointer position (for some reason it normally gets delayed). Because this is unpaced path that disables sleeps, we need to limit update rate and call AcquireNextFrame() at even intervals for smooth mouse movement, WaitForVBlank() (or in this case DwmFlush()) accomplishes both.

Copy link
Contributor

@psyke83 psyke83 Apr 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WaitForVBlank() and DwmFlush() don't behave similarly in terms of resolving capture issues related to the mouse pointer.

DwmFlush() was explicitly added to alleviate this issue: loki-47-6F-64/sunshine#227, but using it causes problems if the host monitor refresh rate is lower than the client requested framerate. See the description here: 2d969c2

Bypassing the host refresh rate detection before using DwmFlush() will cause this issue to reoccur on those configurations. I have not checked this PR, but I would recommend investigating ways to smoothen frametime consistency without involving DwmFlush() at all. IMO, it should be removed entirely, as the original mouse-induced stuttering bug can be resolved by invoking ReleaseFrame() immediately after acquiring the frame and copying the buffer (see PR on original repo for reference: loki-47-6F-64/sunshine#286)

Copy link
Contributor Author

@ns6089 ns6089 Apr 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will see if calling ReleaseFrame() before sleeping with WaitForVBlank() eliminates mouse delay in the next AcquireNextFrame(). If it does, we can safely move from DwmFlush() to WaitForVBlank() in this particular path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to test this objectively, but looking at mouse trails over recursive moonlight window, this seems to work. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HakanFly
Copy link

Display Resolution 4K 120Hz (vsync disabled and freesync enabled) and Stream 4K60 HEVC SDR
Windows 11 22H2 and AMD 7900XTX
CP 2077 benchmark

Base : 80 FPS (100%)
Nightly (2023/04/09) : 68 FPS (85%)
PR Serial (both disabled) : 70 FPS (87,5%)
PR Serial (both enabled) : 70 FPS (87,5%)
PR Serial (serial enabled) : 70,5 FPS (88%)
PR Serial (unpaced enabled) : 69 FPS (86%)

Only in the case where serial processing is enabled, I reproduce an old bug that I had reported a while ago.
During the benchmark (no problem in the menu) impossible to have 60 FPS in video streaming (Moonlight overlay stats). It is blocked at 54-57 FPS.
And it makes me think a lot about this issue that I have been following recently: GPUOpen-LibrariesAndSDKs/AMF#386

@ns6089
Copy link
Contributor Author

ns6089 commented Apr 23, 2023

Thanks for the feedback, but a short disclaimer so the discussion doesn't get derailed.
The purpose of this pull request was not fixing amd performance (I don't even have an amd card).
It was for improving frame time consistency on nvidia cards, and that it does very well.

@HakanFly
Copy link

Thanks for the feedback, but a short disclaimer so the discussion doesn't get derailed. The purpose of this pull request was not fixing amd performance (I don't even have an amd card). It was for improving frame time consistency on nvidia cards, and that it does very well.

I understood it very well. Currently, my skills do not allow me to invest in the code of Sunshine.
But I'm happy to help give a state on a very different configuration. The subject interests me.
And just give feedback that some optimizations also benefits or don't hurt another type of config.

Copy link
Member

@ReenigneArcher ReenigneArcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the docs looks go to me. I just have two minor suggestions.

@ns6089
Copy link
Contributor Author

ns6089 commented Apr 24, 2023

Turns out "unpaced" path in its current form is more problematic than I thought. Encoder bitrate budget is calculated based on the bitrate and frame rate requested by he client, and we can't change this budget mid-stream due to ffmpeg limitations. This means exceeding the framerate requested be the client can (but not necessarily) also mean exceeding the bitrate requested by the client.

Probably needs safeguards and additional documentation. Will think about it.

@ns6089 ns6089 marked this pull request as draft April 24, 2023 23:33
@LizardByte-bot
Copy link
Member

This PR is stale because it has been open for 90 days with no activity. Comment or remove the stale label, otherwise this will be closed in 10 days.

@LizardByte-bot
Copy link
Member

This PR was closed because it has been stalled for 10 days with no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants