Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@ahsonkhan
Copy link

@ahsonkhan ahsonkhan commented Feb 14, 2018

Part of https://github.com/dotnet/corefx/issues/21395#issuecomment-359906138

  • ToLower
  • ToLowerInvariant
  • ToUpper
  • ToUpperInvariant

TODO:

  • Add the allocating counterpart for slow span in corefx and add tests there.

Related corefx PR: dotnet/corefx#27193

cc @jkotas, @stephentoub, @KrzysztofCwalina, @tarekgh, @JeremyKuhne

public static class Span
{
// s_invariantMode is defined for the perf reason as accessing the instance field is faster than access the static property GlobalizationMode.Invariant
private static readonly bool s_invariantMode = GlobalizationMode.Invariant;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cache does not make sense. GlobalizationMode.Invariant property should have the exact same perf as you local cache.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was borrowed from CompareInfo:

// _invariantMode is defined for the perf reason as accessing the instance field is faster than access the static property GlobalizationMode.Invariant

Should it be removed from there too?

Would calling the Invariant property result in this function call every time?

internal static bool GetInvariantSwitchValue()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would calling the Invariant property result in this function call every time?

No, the syntax:

internal static bool Invariant { get; } = GetGlobalizationInvariantMode();

is equivalent to:

private readonly static bool backingFieldInvariant = GetGlobalizationInvariantMode();
internal static bool Invariant => backingFieldInvariant;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was borrowed from CompareInfo:

The one in CompareInfo is instance field. This one is static field. The performance characteristics of instance and static fields are quite different (instance fields are faster).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be removed from there too?

No, in CompareInfo it is an instance field and not static.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance characteristics of instance and static fields are quite different (instance fields are faster).

Good to know. I missed the detail of it being an instance field vs. a static field here.

No, the syntax... is equivalent to...

Ah I see. Thanks for clarifying.


if (s_invariantMode)
{
CultureInfo.CurrentCulture.TextInfo.ToLowerAsciiInvariant(source, destination);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the passed in culture; not CurrentCulture.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same in all other places.)

if (s_invariantMode)
{
CultureInfo.CurrentCulture.TextInfo.ToLowerAsciiInvariant(source, destination);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to return here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same in all other places.)

{
fixed (char* pResult = &MemoryMarshal.GetReference(destination))
{
#if CORECLR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CORECLR ifdef is not necessary.

else
#endif
{
ChangeCase(pSource, source.Length, pResult, destination.Length, toUpper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a conclusion on whether it is safe to depend on 1:1 mapping between lower and upper case?

https://github.com/dotnet/corefx/issues/21395#issuecomment-353233435

Copy link
Member

@jkotas jkotas Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the current String APIs make this assumption, but that assumption does not affect their public shape - if needed, we can fix them without changing their public signature.

It is not true for the Span APIs. If there is no 1:1 mapping, the Span APIs need to start returning number of characters copied to output buffer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a reasonable suggestion for span (and maybe in the future we can support it for strings too).

@ahsonkhan could you discuss this with the design committee to update such APIs? I know we mentioned we are not supporting this today but this doesn't prevent us to start supporting it with the spans.

cc @KrzysztofCwalina

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. If we change our API to return the copied characters, though redundant atm, it will be flexible to possible future implementation changes.

To be clear, is the new proposal for the APIs as follows (including the Invariant versions)?

public static int ToLower(this ReadOnlySpan<char> source, Span<char> destination, CultureInfo culture);
public static int ToLowerInvariant(this ReadOnlySpan<char> source, Span<char> destination);
public static int ToUpper(this ReadOnlySpan<char> source, Span<char> destination, CultureInfo culture);
public static int ToUpperInvariant(this ReadOnlySpan<char> source, Span<char> destination);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, is the new proposal for the APIs as follows (including the Invariant versions)?

Yes, Invariant is a culture :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question, what we'll do if the destination span is not enough? we throw? or return -1?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer that we throw just like we do atm for destination.Length < source.Length since the assumption is that the length required can be statically known at compile time prior to calling the method.

However, if the 1:1 mapping does not hold in the future, then my assumption breaks and throwing may not be the best solution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we throw only because of the length? if so, I then prefer not throw and return -1 in case not enough buffer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these APIs, should we enable partial processing (and ability to restart), like we do with some of our Base64 APIs, and return OperationStatus with charactersConsumed and charactersWritten out parameters? Or is that too much complexity for such a low level API, especially since there is a 1:1 mapping atm?

For example:

public static OperationStatus ToLower(this ReadOnlySpan<char>, Span<char> destination, CultureInfo culture, out char charactersConsumed, out char charactersWritten);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may have a simple one which just returns -1 and we can have the overload that takes more parameters if we find a demand for that.

destination[i] = (char)(source[i] | 0x20);
i++;

while (i < source.Length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string-based implementation of this has two separate loops like this because it wants to optimize the path where nothing needs to change and thus nothing needs to allocate. But here with span, we always need to look at every character and copy every character, potentially modified, to the output destination. Can we just consolidate it all to a single loop then? e.g.

internal void ToLowerAsciiInvariant(ReadOnlySpan<char> source, Span<char> destination)
{
    Debug.Assert(destination.Length >= source.Length);
    for (int i = 0; i < source.Length; i++)
    {
        destination[i] = ToLowerAsciiInvariant(source[i]);
    }
}

or is there some reason that's not as good?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the optimization which avoids having to do the or by 0x20 operation where not necessary (i.e when ((uint)(source[i] - 'A') > ('Z' - 'A'))?

I suspect the extra CopyTo probably hurts performance more, so I will stick to the simpler implementation. I will measure the perf and compare.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
// Assuming that changing case does not affect length
if (destination.Length < source.Length)
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even in a hypothetical future, will invariant ever result in changing case affecting length?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is possible. The sorting and casing can change for the Invariant :-(

if (GlobalizationMode.Invariant)
CultureInfo.InvariantCulture.TextInfo.ToLowerAsciiInvariant(source, destination);
else
CultureInfo.InvariantCulture.TextInfo.ChangeCase(source, destination, toUpper: false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tarekgh, is there a difference between these two operations?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is a difference. ToLowerAsciiInvariant is just trying to be optimized for the Ascii but ChangeCase is not. for example, if you have a Turkish text would be handled wrong if you do it with ToLowerAsciiInvariant

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this naming confusing then? Turning on the "Invariant" mode potentially produces different results than using the InvariantCulture?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, Invariant mode means we'll use the Invariant culture data all the time (without going to call the underlying OS) this will include doing all sorting/casing operations as ordinal. I believe whoever introduced ToLowerAsciiInvariant meant to have this operation mainly for Ascii strings and at the same time not using a culture like Turkish to avoid the Turkish 'I' problem. but in general, you are right for the statement, the invariant mode can produce different results than invariant culture in the sorting and casing area.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Thanks.

char* b = bp;

while (length != 0 && (*a < 0x80) && (*b < 0x80) && (!HighCharTable[*a]) && (!HighCharTable[*b]))
while (length != 0 && (*a < 0x80) && (*b < 0x80) && (!s_highCharTable[*a]) && (!s_highCharTable[*b]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the JIT successfully avoid bounds checks on s_highCharTable on each iteration of the loop?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like it is (tested on windows). We have two compares to 80h.

while (length != 0 && (*a < 0x80) && (*b < 0x80) && (!s_highCharTable[*a]) && (!s_highCharTable[*b]))
00007FF993402A25  test        edx,edx  
00007FF993402A27  je          00007FF993402A7C  
00007FF993402A29  movzx       r10d,word ptr [rcx]  
00007FF993402A2D  cmp         r10d,80h  
00007FF993402A34  jge         00007FF993402A7C  
00007FF993402A36  movzx       r11d,word ptr [rax]  
00007FF993402A3A  cmp         r11d,80h  
00007FF993402A41  jge         00007FF993402A7C  
00007FF993402A43  mov         rsi,23530CE28B0h  
00007FF993402A4D  mov         rsi,qword ptr [rsi]  
00007FF993402A50  mov         rdi,rsi  
00007FF993402A53  mov         ebx,r10d  
00007FF993402A56  mov         ebp,dword ptr [rdi+8]  
00007FF993402A59  cmp         ebx,ebp  
00007FF993402A5B  jae         00007FF993402AC0  
00007FF993402A5D  movsxd      rbx,ebx  
00007FF993402A60  cmp         byte ptr [rdi+rbx+10h],0  
00007FF993402A65  jne         00007FF993402A7C  
00007FF993402A67  mov         edi,r11d  
00007FF993402A6A  cmp         edi,ebp  
00007FF993402A6C  jae         00007FF993402AC0  
00007FF993402A6E  movsxd      rdi,edi  
00007FF993402A71  cmp         byte ptr [rsi+rdi+10h],0  
00007FF993402A76  je          00007FF9934029DD

cc @AndyAyersMS

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if the code would result in better codegen if s_highCharTable were stored into a local before the loop...?

Copy link
Author

@ahsonkhan ahsonkhan Feb 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't get rid of the bounds checks.
We save a couple of instructions at the prologue/epilogue with pushing to/popping off the stack.


for (int i = 0; i < source.Length; i++)
{
destination[i] = ToLowerAsciiInvariant(source[i]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you able to confirm that ToLower/UpperAsciiInvariant gets inlined? Seems like it should, but it'd be good to double check.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does.

            for (int i = 0; i < source.Length; i++)
00007FF9934127D0  sub         rsp,28h  
00007FF9934127D4  mov         rax,qword ptr [rcx]  
00007FF9934127D7  mov         ecx,dword ptr [rcx+8]  
00007FF9934127DA  xor         r8d,r8d  
00007FF9934127DD  test        ecx,ecx  
00007FF9934127DF  jle         00007FF993412814  
            {
                destination[i] = ToLowerAsciiInvariant(source[i]);
00007FF9934127E1  cmp         r8d,dword ptr [rdx+8]  
00007FF9934127E5  jae         00007FF993412819  
00007FF9934127E7  mov         r9,qword ptr [rdx]  
00007FF9934127EA  movsxd      r10,r8d  
00007FF9934127ED  lea         r9,[r9+r10*2]  
00007FF9934127F1  movzx       r10d,word ptr [rax+r10*2]  
00007FF9934127F6  lea         r11d,[r10-41h]  
00007FF9934127FA  cmp         r11d,19h  
00007FF9934127FE  ja          00007FF993412808  
00007FF993412800  or          r10d,20h  
00007FF993412804  movzx       r10d,r10w  
00007FF993412808  mov         word ptr [r9],r10w  
            for (int i = 0; i < source.Length; i++)
00007FF99341280C  inc         r8d  
00007FF99341280F  cmp         r8d,ecx  
00007FF993412812  jl          00007FF9934127E1  
00007FF993412814  add         rsp,28h  
00007FF993412818  ret  
00007FF993412819  call        00007FF9F3069860  
00007FF99341281E  int         3  
            for (int i = 0; i < source.Length; i++)
00007FF993412A50  sub         rsp,28h  
00007FF993412A54  mov         rax,qword ptr [rcx]  
00007FF993412A57  mov         ecx,dword ptr [rcx+8]  
00007FF993412A5A  xor         r8d,r8d  
00007FF993412A5D  test        ecx,ecx  
00007FF993412A5F  jle         00007FF993412A94  
            {
                destination[i] = ToUpperAsciiInvariant(source[i]);
00007FF993412A61  cmp         r8d,dword ptr [rdx+8]  
00007FF993412A65  jae         00007FF993412A99  
00007FF993412A67  mov         r9,qword ptr [rdx]  
00007FF993412A6A  movsxd      r10,r8d  
00007FF993412A6D  lea         r9,[r9+r10*2]  
00007FF993412A71  movzx       r10d,word ptr [rax+r10*2]  
00007FF993412A76  lea         r11d,[r10-61h]  
00007FF993412A7A  cmp         r11d,19h  
00007FF993412A7E  ja          00007FF993412A88  
00007FF993412A80  and         r10d,0FFFFFFDFh  
00007FF993412A84  movzx       r10d,r10w  
00007FF993412A88  mov         word ptr [r9],r10w  
            for (int i = 0; i < source.Length; i++)
00007FF993412A8C  inc         r8d  
00007FF993412A8F  cmp         r8d,ecx  
00007FF993412A92  jl          00007FF993412A61  
00007FF993412A94  add         rsp,28h  
00007FF993412A98  ret  
00007FF993412A99  call        00007FF9F3069860  
00007FF993412A9E  int         3  

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks, @ahsonkhan.

/// Copies the characters from the source span into the destination, converting each character to lowercase
/// using the casing rules of the invariant culture.
/// </summary>
public static int ToLowerInvariant(this ReadOnlySpan<char> source, Span<char> destination)
Copy link
Author

@ahsonkhan ahsonkhan Feb 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this call the ToLower with culture = CultureInfo.InvariantCulture passed in? Same with ToUpper.

It avoids code duplication at the cost of an extra null check.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants