This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Regex perf test #26827

Merged
danmoseley merged 2 commits into dotnet:master from danmoseley:regex.perftest
Feb 6, 2018

@danmoseley
Member

A crude start at regex perf tests.

The patterns and inputs are a subset of those in RegexGroupTests.Groups_Basic_TestData -- I excluded a small number that were unusually slow. All patterns are valid patterns.

I started by measuring each pattern separately, but that produces too much data and, since each measurement should take 100 ms to 1 sec, takes a long time. Rather than cherry-pick a few of the patterns, I chose to start with a single test that gets broad coverage, until we break the tests out further.

Execution takes ~15 sec and produces a table like this, with a matching CSV.

   System.Text.RegularExpressions.Performance.Tests.dll  | Metric         | Unit  | Iterations |    Average |   STDEV.S |        Min |        Max
  :----------------------------------------------------- |:-------------- |:-----:|:----------:| ----------:| ---------:| ----------:| ----------:
   System.Text.RegularExpressions.Tests.Perf_Regex.Match | Duration       | msec  |     40     |    255.747 |     8.223 |    245.721 |    274.721
   System.Text.RegularExpressions.Tests.Perf_Regex.Match | GC Allocations | bytes |     40     | 1.990E+008 | 43711.569 | 1.990E+008 | 1.992E+008

/// </summary>
public class Perf_Regex
{
    private const int innerIterations = 100;
Member Author

Chosen to get ~250ms outer iteration.

Member

Nit: InnerIterations

}
}

// A series of patterns (all valid and non pathological) and inputs (which they may or may not match)
Member Author

Actually all of these do match. I'll leave the comment, because it would be nice to add more that don't.

@danmoseley
Member Author

@dotnet-bot test NETFX x86 Release Build

@danmoseley
Member Author

@dotnet-bot test OSX x64 Debug Build

{
    for (int i = 0; i < innerIterations; i++)
    {
        foreach(var test in Match_TestData())
Member

There's work here unrelated to Regex.Match that it would be nice to move out of the measurement, e.g. store the data from Match_TestData() into an array at the beginning of the method once and then access the array here.
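
As a rough illustration of the suggestion above (not code from this PR), here is a minimal sketch that enumerates the test data once before timing starts. It assumes a plain Stopwatch in place of the perf harness, and the two patterns, the class name HoistedTestDataSketch, and the CountMatches helper are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class HoistedTestDataSketch
{
    private const int InnerIterations = 100;

    // Hypothetical stand-in for the PR's Match_TestData(); these two cases are made up.
    public static IEnumerable<object[]> Match_TestData()
    {
        yield return new object[] { @"(\d+)-(\d+)", "range 10-20", RegexOptions.None };
        yield return new object[] { @"^[a-z]+$", "regex", RegexOptions.IgnoreCase };
    }

    // The loop that would sit inside the measured region: it only indexes
    // into an array that was materialized beforehand.
    public static int CountMatches(object[][] tests)
    {
        int matches = 0;
        for (int i = 0; i < InnerIterations; i++)
        {
            foreach (object[] test in tests)
            {
                // Regex.Match takes the input first, then the pattern.
                if (Regex.Match((string)test[1], (string)test[0], (RegexOptions)test[2]).Success)
                {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Run the iterator once, outside the timed section.
        object[][] tests = Match_TestData().ToArray();

        Stopwatch timer = Stopwatch.StartNew();
        int matches = CountMatches(tests);
        timer.Stop();

        Console.WriteLine($"{matches} matches in {timer.Elapsed.TotalMilliseconds:F3} ms");
    }
}
```

With the array built up front, the timed loop no longer pays for re-running the iterator and allocating its object[] rows on every inner iteration. The casts from object remain inside the loop in this version.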

}

// A series of patterns (all valid and non pathological) and inputs (which they may or may not match)
public static IEnumerable<object[]> Match_TestData()
Member

Are you planning to make this a Theory? If not we don't need to be constrained to use object[] and can use a type more easily accessed, like a value tuple with names.

Member Author

Not a Theory, since each case would have too short a duration.

This would make sense but does not seem essential now; I will follow up with cleanup next time I work on regex perf.
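
For reference, the named value tuple idea raised in this thread might look roughly like the sketch below. It is an assumption about the shape, not the follow-up cleanup itself: the element names Pattern, Input, and Options, the class name, and both test cases are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedTupleTestDataSketch
{
    // Hypothetical cases; per the PR description the real data is a subset of
    // RegexGroupTests.Groups_Basic_TestData.
    public static IEnumerable<(string Pattern, string Input, RegexOptions Options)> Match_TestData()
    {
        yield return (@"(\d+)-(\d+)", "range 10-20", RegexOptions.None);
        yield return (@"^[a-z]+$", "REGEX", RegexOptions.IgnoreCase);
    }

    public static int CountMatches()
    {
        int matches = 0;
        foreach (var test in Match_TestData())
        {
            // Named elements replace the (string)test[0] style casts that object[] requires.
            if (Regex.Match(test.Input, test.Pattern, test.Options).Success)
            {
                matches++;
            }
        }
        return matches;
    }

    public static void Main()
    {
        Console.WriteLine(CountMatches());
    }
}
```

Because the method is not consumed by an xunit Theory, nothing forces the object[] shape, and the tuple keeps each field statically typed at the call site.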

@danmoseley
Member Author

@dotnet/dnceng Windows and OSX queues are timing out. Is this a Helix thing?
{"FailureReason":null,"QueueId":null,"JobList":null,"WorkItems":{"Unscheduled":0,"Waiting":0,"Running":0,"Finished":0,"ListUrl":"https://helix.dot.net/api/2017-04-14/jobs/adb2e35e-21a2-4134-8997-f02e22df07e7/workitems"},"Name":"adb2e35e-21a2-4134-8997-f02e22df07e7","Creator":null,"Created":null,"Finished":null,"InitialWorkItemCount":null,"WaitUrl":"https://helix.dot.net/api/2017-04-14/jobs/adb2e35e-21a2-4134-8997-f02e22df07e7/wait","Source":null,"Type":null,"Build":null,"Properties":null,"Errors":[{"Id":"JobEnqueueFailure","Message":"'System.TimeoutException':'The request has timed out after 60000 milliseconds. The successful completion of the request cannot be determined. Additional queries should be made to determine whether or not the operation has succeeded. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 TrackingId:65f85d4b-dbbb-4b4a-8905-b50e7b103c42, SystemTracker:helixtelemetry.servicebus.windows.net:telemetry, Timestamp:2/5/2018 2:08:47 AM' thrown while: downloading Job list JSON","LogUri":null}]}

@danmoseley
Member Author

@dotnet/dnceng any thoughts?

@MattGal
Member

MattGal commented Feb 5, 2018

@danmosemsft I'll take a quick look and chat w/ the FR team in their standup in 5 mins.

@danmoseley
Member Author

Thanks @MattGal !

@MattGal
Member

MattGal commented Feb 5, 2018

Filed https://github.com/dotnet/core-eng/issues/2575. I suspect this relates to recent issues w/ EventHub that we've been seeing. I can assure you those jobs ran and tests executed, so we should also try to harden against problems like this.

@Chrisboh FYI.

Member

@ViktorHofer ViktorHofer left a comment

LGTM. I hesitated to add benchmark tests for Regex as I wanted to see if we could use BenchmarkDotNet somehow in our innerloop. But I guess we can easily switch later on.

@danmoseley
Member Author

@MattGal should I run again? Or can you see that tests all passed?

@MattGal
Member

MattGal commented Feb 5, 2018

@danmosemsft sorry for the slow reply. For the run I checked, everything passed, but I'd suggest rerunning either way; I don't want to be responsible for manually aggregating a bunch of runs :)

@danmoseley danmoseley closed this Feb 6, 2018
@danmoseley danmoseley reopened this Feb 6, 2018
@danmoseley
Member Author

No problem

@danmoseley danmoseley merged commit 80ef82c into dotnet:master Feb 6, 2018
@danmoseley danmoseley deleted the regex.perftest branch February 6, 2018 01:57
@ViktorHofer
Member

As discussed yesterday, it would be great if we could add a benchmark with Invariant mode enabled.

@karelz karelz added this to the 2.1.0 milestone Mar 10, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Regex perf test

* caps


Commit migrated from dotnet/corefx@80ef82c