Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@ViktorHofer
Copy link
Member

@ViktorHofer ViktorHofer commented Feb 26, 2018

From the previous PR with the changes now grouped into multiple commits. No squash please!

@stephentoub I will incorporate your feedback here later today.

/// <summary>
/// Returns an empty Match object.
/// </summary>
public static Match Empty => new Match(null, 1, string.Empty, 0, 0, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a mistake. Previously it would only make a singleton empty match object, now it's making one every time it's requested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see a test is failing for this reason: NextMatch_EmptyMatch_ReturnsEmptyMatch

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To speed things up I pushed this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction, I didn't as I don't have permissions. It does fix the test though.

Copy link
Member

@stephentoub stephentoub Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this need to be { get; } = instead of =>


_canonical = true;
_rangelist.Sort(SingleRangeComparer.Instance);
_rangelist.Sort(SingleRangeComparer.s_instance);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style wise I don't know how we feel about directly accessing fields, even of nested classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see we were already doing it, you renamed. Fine.

Copy link
Member

@danmoseley danmoseley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if you fix the Match.Empty mistake.

Also please do a sanity test of some perf scenario.

I agree let's not squash this.

@danmoseley danmoseley added the * NO SQUASH * The PR should not be squashed label Feb 26, 2018
int IList<Capture>.IndexOf(Capture item)
{
var comparer = EqualityComparer<Capture>.Default;
EqualityComparer<Capture> comparer = EqualityComparer<Capture>.Default;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separately, I would suggest looking at not storing this into a local first. With devirtualization support for EqualityComparer.Default, this might be better as:

for (int i = 0; i < Count; i++)
{
    if (EqualityComparer<Capture>.Default.Equals(this[i], item)
    {
        return i;
    }
}
return -1;

There's also a lot of code behind the seemingly innocent this[i]; something else to look at separately.

public static readonly TimeSpan InfiniteMatchTimeout = Timeout.InfiniteTimeSpan;

// timeout for the execution of this regex
// timeout for the execution of this regex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it's strange to me to see a period added at the end, when it's not a full sentence and doesn't begin with a capital

if (caps == null)
{
caps = new Hashtable(value);
}
Copy link
Member

@stephentoub stephentoub Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this (and other cases like it) can be simplified, e.g.

caps = value as Hashtable ?? new Hashtable(value);

return RegexCompiler.Compile(code, roptions);
}

[SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "assemblyname", Justification = "Microsoft: already shipped since v1 - can't fix without causing a breaking change")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we run these FxCop naming rules still? I'm wondering if all of these SuppressMessage attributes can just be removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right, we can remove them all.

sb.Append(s_hex[(ch >> shift) & 0xF]);
}

return sb.ToString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this CharDescription code does not need to be efficient? (It's not ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tell me more please :) We loose a lot of performance in RegexCharClass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We loose a lot of performance in RegexCharClass.

I was referring specifically to these "Description" methods. They're used in common code paths in release builds?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, that method is only used in DEBUG code.

private sealed class SingleRangeComparer : IComparer<SingleRange>
{
public static readonly SingleRangeComparer Instance = new SingleRangeComparer();
public static readonly SingleRangeComparer s_instance = new SingleRangeComparer();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This is a public field, so we generally have used PascalCasing. It just happens to be on a private type.

Copy link
Member Author

@ViktorHofer ViktorHofer Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should adjust our .editorconfig file as VS recommends such a change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to internal would be non breaking?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a private class

@stephentoub
Copy link
Member

No squash please!

Are you going to do any squashing prior to merging? There are a huge number of commits in this PR, and many of them don't need to stand on their own.

@ViktorHofer
Copy link
Member Author

Yes I will squash the appropriate ones before merging. E.g. code cleanups together, split of Regex class into subfiles together, Span for Match & Capture

@stephentoub
Copy link
Member

Thanks.

@ViktorHofer ViktorHofer force-pushed the RegexRefa branch 2 times, most recently from f339978 to 7716503 Compare February 27, 2018 22:47
@ViktorHofer
Copy link
Member Author

ViktorHofer commented Feb 27, 2018

Thanks.

Done.

Also please do a sanity test of some perf scenario.

Just done locally. Results are the same for ctor construction and Match. Replace could be slightly faster because we use Slice instead of Substring now.

I will merge this when CI turns green.

Move Cache related code to Regex.Cache.cs

Move Match related code to Regex.Match.cs

Move Replace related code to Regex.Replace.cs

Move Split related code to Match.Split.cs

Move Timeout related code to Regex.Timeout.cs
Code cleanup in RegexBoyerMoore

Code Cleanup in RegexCharClass

RegexFCD code cleanup

Code cleanup & limit visibility of helper methods in RegexNode

Code cleanup in RegexReplacement

Code cleanup in RegexTree

Group code cleanup

Match code cleanup

Capture code cleanup

Fix Match.Empty initalization to only compute ones

Regex code cleanup and comments added
@ViktorHofer
Copy link
Member Author

ViktorHofer commented Feb 28, 2018

@dotnet-bot test this

System\Text\RegularExpressions\Capture.cs(54,64): error CS1501: No overload for method 'AsReadOnlySpan' takes 2 arguments [D:\j\workspace\windows-TGrou---f8ac6754\src\System.Text.RegularExpressions\src\System.Text.RegularExpressions.csproj]
System\Text\RegularExpressions\Capture.cs(59,65): error CS1501: No overload for method 'AsReadOnlySpan' takes 2 arguments [D:\j\workspace\windows-TGrou---f8ac6754\src\System.Text.RegularExpressions\src\System.Text.RegularExpressions.csproj]
System\Text\RegularExpressions\Match.cs(149,25): error CS1501: No overload for method 'AsReadOnlySpan' takes 2 arguments [D:\j\workspace\windows-TGrou---f8ac6754\src\System.Text.RegularExpressions\src\System.Text.RegularExpressions.csproj]

No idea what's going on here. AFAIK AsReadOnlySpan takes 2 arguments... Ah it is renamed to ToSpan: 913d11e

@danmoseley
Copy link
Member

MetadataReaderTests failed on OSX10.12 with BIFException. I"ll open an issue if that proves to be more than a build glitch.

@danmoseley danmoseley merged commit 28e3939 into dotnet:master Feb 28, 2018
@danmoseley
Copy link
Member

I nearly squashed...

@danmoseley
Copy link
Member

@stephentoub I overlooked that you hadn't hit signoff.

@stephentoub
Copy link
Member

I overlooked that you hadn't hit signoff.

I guess that means I have a get-out-of-jail-free card if this causes everything to go haywire.

@ViktorHofer ViktorHofer deleted the RegexRefa branch February 28, 2018 10:45
@karelz karelz added this to the 2.1.0 milestone Mar 10, 2018
Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corefx that referenced this pull request Oct 30, 2019
…() (dotnet#27473)

* Revert removal of SuppressGCTransition from SystemNative_GetTimestamp()

* Insert GC_POLL before statement with unmanaged call.

* JIT test for insertion of GCPoll

Signed-off-by: dotnet-bot <[email protected]>
stephentoub pushed a commit that referenced this pull request Oct 30, 2019
…() (#27473)

* Revert removal of SuppressGCTransition from SystemNative_GetTimestamp()

* Insert GC_POLL before statement with unmanaged call.

* JIT test for insertion of GCPoll

Signed-off-by: dotnet-bot <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants