Introduce RenderClusterIterator#3458
Conversation
src/renderer/vt/paint.cpp
Outdated
| unclusteredString.reserve(clusters.size()); | ||
| short totalWidth = 0; | ||
| for (const auto& cluster : clusters) | ||
| unclusteredString.reserve(40); |
There was a problem hiding this comment.
I'm not a fan of magic number. But without reserving space, this will run much slower. Should we pass size alongside clusterIter? I'm not sure about this.
There was a problem hiding this comment.
Uh... perhaps preallocate the number of cells width from coord to the right edge of the known dirty rect (see impl of GetDirtyRectInChars)
All of the engines should know a little bit about what area is actually invalid and they're always given a drawing coordinate.
Failing this, instead of allocating and reserving on each call to _Paint*BufferLine, they might be able to allocate the buffer when the StartPaint() comes in as a dirty-rect-wide string/vector and just re-use that one for every row and free it when the EndPaint() comes around. That'd make way fewer allocations per frame than one per call.
src/renderer/dx/CustomTextLayout.cpp
Outdated
| THROW_IF_FAILED(format->GetLocaleName(_localeName.data(), gsl::narrow<UINT32>(_localeName.size()))); | ||
|
|
||
| for (const auto& cluster : clusters) | ||
| RenderClusterIterator it = clusterIter; |
There was a problem hiding this comment.
Because we are using RenderClusterIterator const, the copy is needed. It could be RenderClusterIterator& and then it can be modifed by the caller. But that's just kind of implicit to me. I prefer the copying.
There was a problem hiding this comment.
Nah, I like what you did. It's constant for the specification of the custom layout, but then the internals of the layout makes the copy to walk through it.
src/renderer/base/renderer.cpp
Outdated
| // This outer loop will continue until we reach the end of the text we are trying to draw. | ||
| while (it) | ||
| { | ||
| TextBufferCellIterator runStartIt(it); |
There was a problem hiding this comment.
When is inner loop is over clusterIter will be exhausted. So I copy it and keep the start of the run here.
src/renderer/base/renderer.cpp
Outdated
| // Advance the cluster and column counts. | ||
| const auto columnCount = clusters.back().GetColumns(); | ||
| it += columnCount > 0 ? columnCount : 1; // prevent infinite loop for no visible columns | ||
| const auto columnCount = (*clusterIter).GetColumns(); |
There was a problem hiding this comment.
Calculating the cols is cumbersome. , I don't know how to do this in a more elegant way.
There was a problem hiding this comment.
Oh. This might actually justify making the PaintBufferLine's first arg be a & instead.
Save a copy of it on the way in, let the engine increment it, then add a method that lets you difference the two of them.
Diff the resultant one you get back versus the one you initially created and it should tell you how many columns it walked. Then you don't have to cycle through the loop multiple times.
That's what I did for the cell/text iterators with .GetCellDistance and whatnot.
There was a problem hiding this comment.
I'm not sure. I don't think it's enough to justify this. Making it & sounds very error-prone and implicit.
There was a problem hiding this comment.
Make two of them? Take a const input and then have a second & as the output one?
It's not uncommon to use an Inout parameter, but if you think its safer to have an explicit In and Out (given we can't use the return code as it's always HRESULT for these thing), it seems OK.
src/renderer/gdi/paint.cpp
Outdated
| try | ||
| { | ||
| const auto cchLine = clusters.size(); | ||
| size_t cchLine = 120; |
There was a problem hiding this comment.
Again, without size I can't make use of length to init arrays.
There was a problem hiding this comment.
Again, decent estimate is coord to right edge of dirty rect.
There was a problem hiding this comment.
An estimate is not enough, though. I need the exact size here.
There was a problem hiding this comment.
Iterating it two times, could it be that bad? I'm stucked...
There was a problem hiding this comment.
Not strictly. You could replace things like auto pwsPoly = std::make_unique<wchar_t>(cchLine) instead with a std::wstring and .reserve() it up to the "estimate" of _rcInvalid.Right - coord.X. Same goes for the rgdxPoly with a vector of ints or something. (or use vectors for both).
The first phase with the for loop has to iterate the whole thing anyway to condense the clusters back into a string. Either you'll get lucky in the majority case and the estimate won't resize the vectors, or you'll get unlucky in the uncommon case and you'll resize up once or twice. (And if resizing up happens a lot, we can use an estimate function like the CustomTextLayout::_EstimateGlyphCount does to reduce the potential resizes.)
By the end of the for loop (lines 316-325), you should be able to know the exact widths required for all the functions further down PaintBufferLine and have only iterated once. E.g. you should have calculated cchLine exactly by iterating once and also filled a vector pwsPoly and vector rgdxPoly with the condensed version. That should be enough information to finish the draw operation.
However, now that I look at this further. The loop in this specific GDI renderer only allows a maximum of one wchar_t per unit because of GDI limitation. (comment on L322 where cluster.GetTextAsSingle() is called. So the "estimate" used for .reserve() above should actually be the exact column count as a Cluster is a console grid unit described by one-or-more-wchar_t and the vectors wouldn't auto-resize.
The last problem is that the vectors would free when going out of scope. There isn't really a detach to the vectors. We could just hold a reference to them that is freed inside _FlushBufferLines after PolyTextOut is called.
Member variables on the class like:
std::vector<std::vector<wchar_t>> _polyStrings and std::vector<std::vector<int>> _polyWidths, push a new vector into those on each PaintBufferLine, take the .data() of the line and stuff it into the pPolyTextLine fields, and then clear them out on _FlushBufferLines.
That might make _FlushBuferLines a little cleaner than raw delete[]s of bare pointers as well.
|
I don't know what I did wrong but I'm getting the same error as CI build:
|
miniksa
left a comment
There was a problem hiding this comment.
This is exactly the direction I was hoping you would go with this. Pretty much perfectly what I was envisioning with the comment I left yesterday. Thank you!
src/renderer/vt/paint.cpp
Outdated
| unclusteredString.reserve(clusters.size()); | ||
| short totalWidth = 0; | ||
| for (const auto& cluster : clusters) | ||
| unclusteredString.reserve(40); |
There was a problem hiding this comment.
Uh... perhaps preallocate the number of cells width from coord to the right edge of the known dirty rect (see impl of GetDirtyRectInChars)
All of the engines should know a little bit about what area is actually invalid and they're always given a drawing coordinate.
Failing this, instead of allocating and reserving on each call to _Paint*BufferLine, they might be able to allocate the buffer when the StartPaint() comes in as a dirty-rect-wide string/vector and just re-use that one for every row and free it when the EndPaint() comes around. That'd make way fewer allocations per frame than one per call.
src/renderer/inc/Cluster.hpp
Outdated
| Cluster(const std::wstring_view text, const size_t columns); | ||
|
|
||
| const wchar_t GetTextAsSingle() const noexcept; | ||
| wchar_t GetTextAsSingle() const noexcept; |
src/renderer/dx/CustomTextLayout.cpp
Outdated
| THROW_IF_FAILED(format->GetLocaleName(_localeName.data(), gsl::narrow<UINT32>(_localeName.size()))); | ||
|
|
||
| for (const auto& cluster : clusters) | ||
| RenderClusterIterator it = clusterIter; |
There was a problem hiding this comment.
Nah, I like what you did. It's constant for the specification of the custom layout, but then the internals of the layout makes the copy to walk through it.
src/host/ut_host/VtRendererTests.cpp
Outdated
| } | ||
|
|
||
| VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters.data(), clusters.size() }, { 1, 1 }, false)); | ||
| // VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters.data(), clusters.size() }, { 1, 1 }, false)); |
src/renderer/dx/DxRenderer.cpp
Outdated
| _dwriteFontFace.Get(), | ||
| { &cluster, 1 }, | ||
| _glyphCell.cx); | ||
| //// Create the text layout |
src/renderer/gdi/paint.cpp
Outdated
| try | ||
| { | ||
| const auto cchLine = clusters.size(); | ||
| size_t cchLine = 120; |
There was a problem hiding this comment.
Again, decent estimate is coord to right edge of dirty rect.
src/renderer/base/renderer.cpp
Outdated
| // Advance the cluster and column counts. | ||
| const auto columnCount = clusters.back().GetColumns(); | ||
| it += columnCount > 0 ? columnCount : 1; // prevent infinite loop for no visible columns | ||
| const auto columnCount = (*clusterIter).GetColumns(); |
There was a problem hiding this comment.
Oh. This might actually justify making the PaintBufferLine's first arg be a & instead.
Save a copy of it on the way in, let the engine increment it, then add a method that lets you difference the two of them.
Diff the resultant one you get back versus the one you initially created and it should tell you how many columns it walked. Then you don't have to cycle through the loop multiple times.
That's what I did for the cell/text iterators with .GetCellDistance and whatnot.
|
Just overall, did you check a trace on this (can I see)? I presume it's better than the original but maybe not as perfect as the more memory-intense previous iteration. |
d0d432e to
e9a785f
Compare
src/host/ut_host/VtRendererTests.cpp
Outdated
|
|
||
| VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters1.data(), clusters1.size() }, { 0, 0 }, false)); | ||
| VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters2.data(), clusters2.size() }, { 0, 1 }, false)); | ||
| const COORD screenBufferSize{ 119, 9029 }; |
There was a problem hiding this comment.
This is just way too much ... kinda hurtful.
There was a problem hiding this comment.
It's suppose to be a unit test that focuses on this certain VT thing. Now I dragged all the other units like TextBuffer. I'm not a fan of this..
| static const size_t s_cPolyTextCache = 80; | ||
| POLYTEXTW _pPolyText[s_cPolyTextCache]; | ||
| static constexpr size_t s_cPolyTextCache = 80; | ||
| std::array<POLYTEXTW, s_cPolyTextCache> _polyText; |
There was a problem hiding this comment.
I figured that all those containers have exact same fixed size. Using std::array is more reasonable
| // Exit early if there are no lines to draw. | ||
| RETURN_HR_IF(S_OK, 0 == cchLine); | ||
|
|
||
| const size_t preallocateSize = (_rcInvalid.right - _rcInvalid.left) / 8; |
There was a problem hiding this comment.
I found that 8 is the magic number needed here to get the correct size? Any GDI+ reference that explains this?
| } | ||
| } | ||
|
|
||
| _polyStrings = {}; |
There was a problem hiding this comment.
This certainly is safer and better than raw delete[].
| return !(*this == it); | ||
| } | ||
|
|
||
| RenderClusterIterator& RenderClusterIterator::operator+=(const ptrdiff_t& movement) |
There was a problem hiding this comment.
Currently RenderClusterIterator is not aware of the dbcs columns. Caller must use clusterIter += cols > 0 ? cols : 1 to skip dbcs trailing char. I've tried to let RenderClusterIterator handle this itself, but failed to do so because the code gets too complicated and implicit.
|
Plain I found something interesting. Plain Update: With #3480 merged, |
d9e896e to
d1f4634
Compare
| <!-- Careful reordering these. Some default props (contained in these files) are order sensitive. --> | ||
| <Import Project="$(SolutionDir)src\common.build.post.props" /> | ||
| <!-- <Import Project="$(SolutionDir)src\common.build.tests.props" /> --> | ||
| <Import Project="$(SolutionDir)src\common.build.tests.props" /> |
There was a problem hiding this comment.
I don't know why but I need this to make CI and my local VS happy. Wonder why master is still green without it
3dbb0bc to
dcb9bd6
Compare
|
Mostly this is ready now. Tests are finally green. |
93df594 to
544c21c
Compare
|
This needs to hold, until #3546 is solved. I'm seeing even worse result with this PR. |
|
OK this PR is not the cause of the regression. It's probabaly #3474. But still let's hold this until the regression is solved. |
544c21c to
fe96309
Compare
|
Now that #778 is closed, this does not seem necessary. Close this for now, |
|
Thanks for this anyway, @skyline75489. I do want to get this done at some point in the future and will likely use this as a template whenever that time comes. But you're right, it's not as critical right now anymore. |

Summary of the Pull Request
This is the successor of #3438 . The idea is based on @miniksa 's comment originally posted here #3438 (comment)
References
#3075 #3438
PR Checklist
Detailed Description of the Pull Request / Additional comments
This is still WIP, though. Would love to hear everybody's feedback.
Validation Steps Performed
Just pretty much everything should be validated. Vim, cat, top, cmatrix, and so on.