@stephentoub
Member

Today, given an alternation like `a|b|[cd]|efg`, that gets reduced to `[abcd]|efg`, supporting One and Set nodes. But it doesn't support Notone nodes. That means the semi-common idiom `.|\n` that folks use to express any character when not using Singleline doesn't get reduced and remains an alternation. This extends the existing reduction pass to also recognize Notones, just by treating them as one or two ranges.
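
As a rough illustration of the "one or two ranges" idea (a sketch only, not the code in this PR; `NotoneToRanges` is a hypothetical name): a Notone for character `c` covers everything below `c` plus everything above `c`, which collapses to a single range when `c` is `\0` or `\uFFFF`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not the runtime's API): expresses [^c] as inclusive char ranges.
static List<(char First, char Last)> NotoneToRanges(char c)
{
    var ranges = new List<(char First, char Last)>();

    // Everything below c, unless c is already the lowest char.
    if (c > char.MinValue)
    {
        ranges.Add((char.MinValue, (char)(c - 1)));
    }

    // Everything above c, unless c is already the highest char.
    if (c < char.MaxValue)
    {
        ranges.Add(((char)(c + 1), char.MaxValue));
    }

    return ranges;
}

// [^\n] becomes two ranges; a One node for '\n' would fill the gap between them.
foreach ((char first, char last) in NotoneToRanges('\n'))
{
    Console.WriteLine($"U+{(int)first:X4}..U+{(int)last:X4}");
}
// Prints:
// U+0000..U+0009
// U+000B..U+FFFF
```

Once a Notone is expressed this way, it can be merged with One and Set branches like any other set of ranges, which is why `.|\n` ends up covering every character.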
@stephentoub stephentoub requested review from MihaZupan and Copilot July 28, 2025 01:19
@stephentoub
Member Author

@MihuBot regexdiff

Contributor

Copilot AI left a comment


Pull Request Overview

This PR extends the regex alternation reduction optimization to support Notone nodes (e.g., `[^a]`) in addition to the existing One and Set nodes. This enables common patterns like `.|\n` (used to match any character including newlines) to be optimized into a single character class `[\s\S]` instead of remaining as an alternation.

  • Extends alternation reduction logic to handle Notone nodes by treating them as character ranges
  • Adds `AddNotChar` method to `RegexCharClass` to convert negated characters into appropriate ranges
  • Updates condition checks and comments to include Notone nodes in the merge logic
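
To make the equivalence concrete, here is a small check using the public `Regex` API. It only demonstrates matching behavior, not the internal reduction; the `input` string is an arbitrary example.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "<a\nb>";

// '.' alone stops at '\n' when RegexOptions.Singleline is not set...
Console.WriteLine(Regex.IsMatch(input, "<.*?>"));                     // False

// ...while the alternation idiom and the explicit class both cross the newline.
Console.WriteLine(Regex.Match(input, @"<(.|\n)*?>").Value == input);  // True
Console.WriteLine(Regex.Match(input, @"<[\s\S]*?>").Value == input);  // True
```

The two spellings differ in capture groups, since `(.|\n)` captures and `[\s\S]` does not, but they accept the same characters.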

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| RegexReductionTests.cs | Adds test cases for new Notone reduction scenarios including `. |
| RegexNode.cs | Updates alternation reduction logic to handle Notone nodes and merge them with other character classes |
| RegexCharClass.cs | Implements AddNotChar method to convert negated characters into positive character ranges |

@MihuBot

MihuBot commented Jul 28, 2025

54 out of 18857 patterns have generated source code changes.

Examples of GeneratedRegex source diffs
"<(.|\\n)*?>" (559 uses)
[GeneratedRegex("<(.|\\n)*?>")]
     /// ○ Match '&lt;'.<br/>
     /// ○ Loop lazily any number of times.<br/>
     ///     ○ 1st capture group.<br/>
-    ///         ○ Match with 2 alternative expressions.<br/>
-    ///             ○ Match any character other than '\n'.<br/>
-    ///             ○ Match '\n'.<br/>
+    ///         ○ Match any character.<br/>
     /// ○ Match '&gt;'.<br/>
     /// </code>
     /// </remarks>
                         lazyloop_iteration++;
                         
                         // 1st capture group.
-                        //{
+                        {
                             int capture_starting_pos = pos;
                             
-                            // Match with 2 alternative expressions.
-                            //{
-                                int alternation_starting_pos = pos;
-                                int alternation_starting_capturepos = base.Crawlpos();
-                                
-                                // Branch 0
-                                //{
-                                    // Match any character other than '\n'.
-                                    if (slice.IsEmpty || slice[0] == '\n')
-                                    {
-                                        goto AlternationBranch;
-                                    }
-                                    
-                                    Utilities.StackPush(ref base.runstack!, ref stackpos, 0, alternation_starting_pos, alternation_starting_capturepos);
-                                    pos++;
-                                    slice = inputSpan.Slice(pos);
-                                    goto AlternationMatch;
-                                    
-                                    AlternationBranch:
-                                    pos = alternation_starting_pos;
-                                    slice = inputSpan.Slice(pos);
-                                    UncaptureUntil(alternation_starting_capturepos);
-                                //}
-                                
-                                // Branch 1
-                                //{
-                                    // Match '\n'.
-                                    if (slice.IsEmpty || slice[0] != '\n')
-                                    {
-                                        goto LazyLoopIterationNoMatch;
-                                    }
-                                    
-                                    Utilities.StackPush(ref base.runstack!, ref stackpos, 1, alternation_starting_pos, alternation_starting_capturepos);
-                                    pos++;
-                                    slice = inputSpan.Slice(pos);
-                                    goto AlternationMatch;
-                                //}
-                                
-                                AlternationBacktrack:
-                                if (Utilities.s_hasTimeout)
-                                {
-                                    base.CheckTimeout();
-                                }
-                                
-                                Utilities.StackPop(base.runstack!, ref stackpos, out alternation_starting_capturepos, out alternation_starting_pos);
-                                switch (base.runstack![--stackpos])
-                                {
-                                    case 0:
-                                        goto AlternationBranch;
-                                    case 1:
-                                        goto LazyLoopIterationNoMatch;
-                                }
-                                
-                                AlternationMatch:;
-                            //}
+                            // Match any character.
+                            if (slice.IsEmpty || false)
+                            {
+                                goto LazyLoopIterationNoMatch;
+                            }
                             
+                            pos++;
+                            slice = inputSpan.Slice(pos);
                             base.Capture(1, capture_starting_pos, pos);
-                            
-                            Utilities.StackPush(ref base.runstack!, ref stackpos, capture_starting_pos);
-                            goto CaptureSkipBacktrack;
-                            
-                            CaptureBacktrack:
-                            capture_starting_pos = base.runstack![--stackpos];
-                            goto AlternationBacktrack;
-                            
-                            CaptureSkipBacktrack:;
-                        //}
+                        }
                         
                         goto LazyLoopEnd;
                         
                         // The lazy loop iteration failed to match.
                         LazyLoopIterationNoMatch:
-                        lazyloop_iteration--;
-                        UncaptureUntil(base.runstack![--stackpos]);
-                        pos = base.runstack![--stackpos];
-                        slice = inputSpan.Slice(pos);
-                        if (lazyloop_iteration > 0)
-                        {
-                            // The lazy loop matched at least one iteration; backtrack into the last one.
-                            goto CaptureBacktrack;
-                        }
-                        
                         UncaptureUntil(0);
                         return false; // The input didn't match.
                         
         /// <summary>Whether <see cref="s_defaultTimeout"/> is non-infinite.</summary>
         internal static readonly bool s_hasTimeout = s_defaultTimeout != Regex.InfiniteMatchTimeout;
         
-        /// <summary>Pops 2 values from the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPop(int[] stack, ref int pos, out int arg0, out int arg1)
-        {
-            arg0 = stack[--pos];
-            arg1 = stack[--pos];
-        }
-        
-        /// <summary>Pushes 1 value onto the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPush(ref int[] stack, ref int pos, int arg0)
-        {
-            // If there's space available for the value, store it.
-            int[] s = stack;
-            int p = pos;
-            if ((uint)p < (uint)s.Length)
-            {
-                s[p] = arg0;
-                pos++;
-                return;
-            }
-        
-            // Otherwise, resize the stack to make room and try again.
-            WithResize(ref stack, ref pos, arg0);
-        
-            // <summary>Resize the backtracking stack array and push 1 value onto the stack.</summary>
-            [MethodImpl(MethodImplOptions.NoInlining)]
-            static void WithResize(ref int[] stack, ref int pos, int arg0)
-            {
-                Array.Resize(ref stack, (pos + 0) * 2);
-                StackPush(ref stack, ref pos, arg0);
-            }
-        }
-        
         /// <summary>Pushes 2 values onto the backtracking stack.</summary>
         [MethodImpl(MethodImplOptions.AggressiveInlining)]
         internal static void StackPush(ref int[] stack, ref int pos, int arg0, int arg1)
                 StackPush(ref stack, ref pos, arg0, arg1);
             }
         }
-        
-        /// <summary>Pushes 3 values onto the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPush(ref int[] stack, ref int pos, int arg0, int arg1, int arg2)
-        {
-            // If there's space available for all 3 values, store them.
-            int[] s = stack;
-            int p = pos;
-            if ((uint)(p + 2) < (uint)s.Length)
-            {
-                s[p] = arg0;
-                s[p + 1] = arg1;
-                s[p + 2] = arg2;
-                pos += 3;
-                return;
-            }
-        
-            // Otherwise, resize the stack to make room and try again.
-            WithResize(ref stack, ref pos, arg0, arg1, arg2);
-        
-            // <summary>Resize the backtracking stack array and push 3 values onto the stack.</summary>
-            [MethodImpl(MethodImplOptions.NoInlining)]
-            static void WithResize(ref int[] stack, ref int pos, int arg0, int arg1, int arg2)
-            {
-                Array.Resize(ref stack, (pos + 2) * 2);
-                StackPush(ref stack, ref pos, arg0, arg1, arg2);
-            }
-        }
     }
 }
"\\((.|\\n)*?\\)" (539 uses)
[GeneratedRegex("\\((.|\\n)*?\\)")]
     /// ○ Match '('.<br/>
     /// ○ Loop lazily any number of times.<br/>
     ///     ○ 1st capture group.<br/>
-    ///         ○ Match with 2 alternative expressions.<br/>
-    ///             ○ Match any character other than '\n'.<br/>
-    ///             ○ Match '\n'.<br/>
+    ///         ○ Match any character.<br/>
     /// ○ Match ')'.<br/>
     /// </code>
     /// </remarks>
                         lazyloop_iteration++;
                         
                         // 1st capture group.
-                        //{
+                        {
                             int capture_starting_pos = pos;
                             
-                            // Match with 2 alternative expressions.
-                            //{
-                                int alternation_starting_pos = pos;
-                                int alternation_starting_capturepos = base.Crawlpos();
-                                
-                                // Branch 0
-                                //{
-                                    // Match any character other than '\n'.
-                                    if (slice.IsEmpty || slice[0] == '\n')
-                                    {
-                                        goto AlternationBranch;
-                                    }
-                                    
-                                    Utilities.StackPush(ref base.runstack!, ref stackpos, 0, alternation_starting_pos, alternation_starting_capturepos);
-                                    pos++;
-                                    slice = inputSpan.Slice(pos);
-                                    goto AlternationMatch;
-                                    
-                                    AlternationBranch:
-                                    pos = alternation_starting_pos;
-                                    slice = inputSpan.Slice(pos);
-                                    UncaptureUntil(alternation_starting_capturepos);
-                                //}
-                                
-                                // Branch 1
-                                //{
-                                    // Match '\n'.
-                                    if (slice.IsEmpty || slice[0] != '\n')
-                                    {
-                                        goto LazyLoopIterationNoMatch;
-                                    }
-                                    
-                                    Utilities.StackPush(ref base.runstack!, ref stackpos, 1, alternation_starting_pos, alternation_starting_capturepos);
-                                    pos++;
-                                    slice = inputSpan.Slice(pos);
-                                    goto AlternationMatch;
-                                //}
-                                
-                                AlternationBacktrack:
-                                if (Utilities.s_hasTimeout)
-                                {
-                                    base.CheckTimeout();
-                                }
-                                
-                                Utilities.StackPop(base.runstack!, ref stackpos, out alternation_starting_capturepos, out alternation_starting_pos);
-                                switch (base.runstack![--stackpos])
-                                {
-                                    case 0:
-                                        goto AlternationBranch;
-                                    case 1:
-                                        goto LazyLoopIterationNoMatch;
-                                }
-                                
-                                AlternationMatch:;
-                            //}
+                            // Match any character.
+                            if (slice.IsEmpty || false)
+                            {
+                                goto LazyLoopIterationNoMatch;
+                            }
                             
+                            pos++;
+                            slice = inputSpan.Slice(pos);
                             base.Capture(1, capture_starting_pos, pos);
-                            
-                            Utilities.StackPush(ref base.runstack!, ref stackpos, capture_starting_pos);
-                            goto CaptureSkipBacktrack;
-                            
-                            CaptureBacktrack:
-                            capture_starting_pos = base.runstack![--stackpos];
-                            goto AlternationBacktrack;
-                            
-                            CaptureSkipBacktrack:;
-                        //}
+                        }
                         
                         goto LazyLoopEnd;
                         
                         // The lazy loop iteration failed to match.
                         LazyLoopIterationNoMatch:
-                        lazyloop_iteration--;
-                        UncaptureUntil(base.runstack![--stackpos]);
-                        pos = base.runstack![--stackpos];
-                        slice = inputSpan.Slice(pos);
-                        if (lazyloop_iteration > 0)
-                        {
-                            // The lazy loop matched at least one iteration; backtrack into the last one.
-                            goto CaptureBacktrack;
-                        }
-                        
                         UncaptureUntil(0);
                         return false; // The input didn't match.
                         
         /// <summary>Whether <see cref="s_defaultTimeout"/> is non-infinite.</summary>
         internal static readonly bool s_hasTimeout = s_defaultTimeout != Regex.InfiniteMatchTimeout;
         
-        /// <summary>Pops 2 values from the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPop(int[] stack, ref int pos, out int arg0, out int arg1)
-        {
-            arg0 = stack[--pos];
-            arg1 = stack[--pos];
-        }
-        
-        /// <summary>Pushes 1 value onto the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPush(ref int[] stack, ref int pos, int arg0)
-        {
-            // If there's space available for the value, store it.
-            int[] s = stack;
-            int p = pos;
-            if ((uint)p < (uint)s.Length)
-            {
-                s[p] = arg0;
-                pos++;
-                return;
-            }
-        
-            // Otherwise, resize the stack to make room and try again.
-            WithResize(ref stack, ref pos, arg0);
-        
-            // <summary>Resize the backtracking stack array and push 1 value onto the stack.</summary>
-            [MethodImpl(MethodImplOptions.NoInlining)]
-            static void WithResize(ref int[] stack, ref int pos, int arg0)
-            {
-                Array.Resize(ref stack, (pos + 0) * 2);
-                StackPush(ref stack, ref pos, arg0);
-            }
-        }
-        
         /// <summary>Pushes 2 values onto the backtracking stack.</summary>
         [MethodImpl(MethodImplOptions.AggressiveInlining)]
         internal static void StackPush(ref int[] stack, ref int pos, int arg0, int arg1)
                 StackPush(ref stack, ref pos, arg0, arg1);
             }
         }
-        
-        /// <summary>Pushes 3 values onto the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPush(ref int[] stack, ref int pos, int arg0, int arg1, int arg2)
-        {
-            // If there's space available for all 3 values, store them.
-            int[] s = stack;
-            int p = pos;
-            if ((uint)(p + 2) < (uint)s.Length)
-            {
-                s[p] = arg0;
-                s[p + 1] = arg1;
-                s[p + 2] = arg2;
-                pos += 3;
-                return;
-            }
-        
-            // Otherwise, resize the stack to make room and try again.
-            WithResize(ref stack, ref pos, arg0, arg1, arg2);
-        
-            // <summary>Resize the backtracking stack array and push 3 values onto the stack.</summary>
-            [MethodImpl(MethodImplOptions.NoInlining)]
-            static void WithResize(ref int[] stack, ref int pos, int arg0, int arg1, int arg2)
-            {
-                Array.Resize(ref stack, (pos + 2) * 2);
-                StackPush(ref stack, ref pos, arg0, arg1, arg2);
-            }
-        }
     }
 }
"\\[(.|\\n)*?\\]" (539 uses)
[GeneratedRegex("\\[(.|\\n)*?\\]")]
     /// ○ Match '['.<br/>
     /// ○ Loop lazily any number of times.<br/>
     ///     ○ 1st capture group.<br/>
-    ///         ○ Match with 2 alternative expressions.<br/>
-    ///             ○ Match any character other than '\n'.<br/>
-    ///             ○ Match '\n'.<br/>
+    ///         ○ Match any character.<br/>
     /// ○ Match ']'.<br/>
     /// </code>
     /// </remarks>
                         lazyloop_iteration++;
                         
                         // 1st capture group.
-                        //{
+                        {
                             int capture_starting_pos = pos;
                             
-                            // Match with 2 alternative expressions.
-                            //{
-                                int alternation_starting_pos = pos;
-                                int alternation_starting_capturepos = base.Crawlpos();
-                                
-                                // Branch 0
-                                //{
-                                    // Match any character other than '\n'.
-                                    if (slice.IsEmpty || slice[0] == '\n')
-                                    {
-                                        goto AlternationBranch;
-                                    }
-                                    
-                                    Utilities.StackPush(ref base.runstack!, ref stackpos, 0, alternation_starting_pos, alternation_starting_capturepos);
-                                    pos++;
-                                    slice = inputSpan.Slice(pos);
-                                    goto AlternationMatch;
-                                    
-                                    AlternationBranch:
-                                    pos = alternation_starting_pos;
-                                    slice = inputSpan.Slice(pos);
-                                    UncaptureUntil(alternation_starting_capturepos);
-                                //}
-                                
-                                // Branch 1
-                                //{
-                                    // Match '\n'.
-                                    if (slice.IsEmpty || slice[0] != '\n')
-                                    {
-                                        goto LazyLoopIterationNoMatch;
-                                    }
-                                    
-                                    Utilities.StackPush(ref base.runstack!, ref stackpos, 1, alternation_starting_pos, alternation_starting_capturepos);
-                                    pos++;
-                                    slice = inputSpan.Slice(pos);
-                                    goto AlternationMatch;
-                                //}
-                                
-                                AlternationBacktrack:
-                                if (Utilities.s_hasTimeout)
-                                {
-                                    base.CheckTimeout();
-                                }
-                                
-                                Utilities.StackPop(base.runstack!, ref stackpos, out alternation_starting_capturepos, out alternation_starting_pos);
-                                switch (base.runstack![--stackpos])
-                                {
-                                    case 0:
-                                        goto AlternationBranch;
-                                    case 1:
-                                        goto LazyLoopIterationNoMatch;
-                                }
-                                
-                                AlternationMatch:;
-                            //}
+                            // Match any character.
+                            if (slice.IsEmpty || false)
+                            {
+                                goto LazyLoopIterationNoMatch;
+                            }
                             
+                            pos++;
+                            slice = inputSpan.Slice(pos);
                             base.Capture(1, capture_starting_pos, pos);
-                            
-                            Utilities.StackPush(ref base.runstack!, ref stackpos, capture_starting_pos);
-                            goto CaptureSkipBacktrack;
-                            
-                            CaptureBacktrack:
-                            capture_starting_pos = base.runstack![--stackpos];
-                            goto AlternationBacktrack;
-                            
-                            CaptureSkipBacktrack:;
-                        //}
+                        }
                         
                         goto LazyLoopEnd;
                         
                         // The lazy loop iteration failed to match.
                         LazyLoopIterationNoMatch:
-                        lazyloop_iteration--;
-                        UncaptureUntil(base.runstack![--stackpos]);
-                        pos = base.runstack![--stackpos];
-                        slice = inputSpan.Slice(pos);
-                        if (lazyloop_iteration > 0)
-                        {
-                            // The lazy loop matched at least one iteration; backtrack into the last one.
-                            goto CaptureBacktrack;
-                        }
-                        
                         UncaptureUntil(0);
                         return false; // The input didn't match.
                         
         /// <summary>Whether <see cref="s_defaultTimeout"/> is non-infinite.</summary>
         internal static readonly bool s_hasTimeout = s_defaultTimeout != Regex.InfiniteMatchTimeout;
         
-        /// <summary>Pops 2 values from the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPop(int[] stack, ref int pos, out int arg0, out int arg1)
-        {
-            arg0 = stack[--pos];
-            arg1 = stack[--pos];
-        }
-        
-        /// <summary>Pushes 1 value onto the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPush(ref int[] stack, ref int pos, int arg0)
-        {
-            // If there's space available for the value, store it.
-            int[] s = stack;
-            int p = pos;
-            if ((uint)p < (uint)s.Length)
-            {
-                s[p] = arg0;
-                pos++;
-                return;
-            }
-        
-            // Otherwise, resize the stack to make room and try again.
-            WithResize(ref stack, ref pos, arg0);
-        
-            // <summary>Resize the backtracking stack array and push 1 value onto the stack.</summary>
-            [MethodImpl(MethodImplOptions.NoInlining)]
-            static void WithResize(ref int[] stack, ref int pos, int arg0)
-            {
-                Array.Resize(ref stack, (pos + 0) * 2);
-                StackPush(ref stack, ref pos, arg0);
-            }
-        }
-        
         /// <summary>Pushes 2 values onto the backtracking stack.</summary>
         [MethodImpl(MethodImplOptions.AggressiveInlining)]
         internal static void StackPush(ref int[] stack, ref int pos, int arg0, int arg1)
                 StackPush(ref stack, ref pos, arg0, arg1);
             }
         }
-        
-        /// <summary>Pushes 3 values onto the backtracking stack.</summary>
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        internal static void StackPush(ref int[] stack, ref int pos, int arg0, int arg1, int arg2)
-        {
-            // If there's space available for all 3 values, store them.
-            int[] s = stack;
-            int p = pos;
-            if ((uint)(p + 2) < (uint)s.Length)
-            {
-                s[p] = arg0;
-                s[p + 1] = arg1;
-                s[p + 2] = arg2;
-                pos += 3;
-                return;
-            }
-        
-            // Otherwise, resize the stack to make room and try again.
-            WithResize(ref stack, ref pos, arg0, arg1, arg2);
-        
-            // <summary>Resize the backtracking stack array and push 3 values onto the stack.</summary>
-            [MethodImpl(MethodImplOptions.NoInlining)]
-            static void WithResize(ref int[] stack, ref int pos, int arg0, int arg1, int arg2)
-            {
-                Array.Resize(ref stack, (pos + 2) * 2);
-                StackPush(ref stack, ref pos, arg0, arg1, arg2);
-            }
-        }
     }
 }
"^(?<token>\\:=|\\-\\>|\\,|\\[|\\]|\\(|\\)|\\ ..." (5 uses)
[GeneratedRegex("^(?<token>\\:=|\\-\\>|\\,|\\[|\\]|\\(|\\)|\\*|\\+|\\?|\\!|#\\w+|\\w[\\w\\.]*|\\d+|\\s|.)*$")]
  /// ○ Match if at the beginning of the string.<br/>
  /// ○ Loop greedily any number of times.<br/>
  ///     ○ "token" capture group.<br/>
-   ///         ○ Match with 8 alternative expressions.<br/>
+   ///         ○ Match with 7 alternative expressions.<br/>
  ///             ○ Match the string ":=".<br/>
  ///             ○ Match the string "-&gt;".<br/>
  ///             ○ Match a character in the set [!(-,?[]].<br/>
  ///                 ○ Match a word character.<br/>
  ///                 ○ Match a character in the set [.\w] greedily any number of times.<br/>
  ///             ○ Match a Unicode digit greedily at least once.<br/>
-   ///             ○ Match a whitespace character.<br/>
-   ///             ○ Match any character other than '\n'.<br/>
+   ///             ○ Match any character.<br/>
  /// ○ Match if at the end of the string or if before an ending newline.<br/>
  /// </code>
  /// </remarks>
                      //{
                          int capture_starting_pos = pos;
                          
-                           // Match with 8 alternative expressions.
+                           // Match with 7 alternative expressions.
                          //{
                              int alternation_starting_pos = pos;
                              int alternation_starting_capturepos = base.Crawlpos();
                              
                              // Branch 6
                              //{
-                                   // Match a whitespace character.
-                                   if (slice.IsEmpty || !char.IsWhiteSpace(slice[0]))
-                                   {
-                                       goto AlternationBranch6;
-                                   }
-                                   
-                                   Utilities.StackPush(ref base.runstack!, ref stackpos, 6, alternation_starting_pos, alternation_starting_capturepos);
-                                   pos++;
-                                   slice = inputSpan.Slice(pos);
-                                   goto AlternationMatch;
-                                   
-                                   AlternationBranch6:
-                                   pos = alternation_starting_pos;
-                                   slice = inputSpan.Slice(pos);
-                                   UncaptureUntil(alternation_starting_capturepos);
-                               //}
-                               
-                               // Branch 7
-                               //{
-                                   // Match any character other than '\n'.
-                                   if (slice.IsEmpty || slice[0] == '\n')
+                                   // Match any character.
+                                   if (slice.IsEmpty || false)
                                  {
                                      goto LoopIterationNoMatch;
                                  }
                                  
-                                   Utilities.StackPush(ref base.runstack!, ref stackpos, 7, alternation_starting_pos, alternation_starting_capturepos);
+                                   Utilities.StackPush(ref base.runstack!, ref stackpos, 6, alternation_starting_pos, alternation_starting_capturepos);
                                  pos++;
                                  slice = inputSpan.Slice(pos);
                                  goto AlternationMatch;
                                  case 5:
                                      goto CharLoopBacktrack2;
                                  case 6:
-                                       goto AlternationBranch6;
-                                   case 7:
                                      goto LoopIterationNoMatch;
                              }

For more diff examples, see https://gist.github.com/MihuBot/aaa69c4b57ff76c9c48818bbc06a2c9f

JIT assembly changes
Total bytes of base: 54136129
Total bytes of diff: 54075819
Total bytes of delta: -60310 (-0.11 % of base)
Total relative delta: -17.36
    diff is an improvement.
    relative diff is an improvement.

For a list of JIT diff improvements, see Improvements.md

Sample source code for further analysis
using System.IO.Compression;
using System.Text.Json;
using System.Text.RegularExpressions;

const string JsonPath = "RegexResults-1305.json";
if (!File.Exists(JsonPath))
{
    await using var archiveStream = await new HttpClient().GetStreamAsync("https://mihubot.xyz/r/E2r4LFbA");
    using var archive = new ZipArchive(archiveStream, ZipArchiveMode.Read);
    archive.Entries.First(e => e.Name == "Results.json").ExtractToFile(JsonPath);
}

using FileStream jsonFileStream = File.OpenRead(JsonPath);
RegexEntry[] entries = JsonSerializer.Deserialize<RegexEntry[]>(jsonFileStream, new JsonSerializerOptions { IncludeFields = true })!;
Console.WriteLine($"Working with {entries.Length} patterns");



record KnownPattern(string Pattern, RegexOptions Options, int Count);

sealed class RegexEntry
{
    public required KnownPattern Regex { get; set; }
    public required string MainSource { get; set; }
    public required string PrSource { get; set; }
    public string? FullDiff { get; set; }
    public string? ShortDiff { get; set; }
    public (string Name, string Values)[]? SearchValuesOfChar { get; set; }
    public (string[] Values, StringComparison ComparisonType)[]? SearchValuesOfString { get; set; }
}

Member

@MihaZupan MihaZupan left a comment


Some diffs are funny (preexisting behavior)

// Match any character.
if (slice.IsEmpty || false)
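
One plausible way output like this arises, offered as a guess at the general shape rather than the generator's actual code (`EmitCharCheck` and its parameters are hypothetical): the emitter joins an emptiness check with a per-character mismatch condition, and for a class that matches everything that condition folds to the constant `false`.

```csharp
using System;

// Hypothetical emitter sketch: builds the "bail out" condition for matching one char.
static string EmitCharCheck(bool matchesAnyChar, string mismatchCondition)
{
    // A class that matches everything can never mismatch, so its condition is constant.
    string charCondition = matchesAnyChar ? "false" : mismatchCondition;
    return $"if (slice.IsEmpty || {charCondition})";
}

Console.WriteLine(EmitCharCheck(false, "slice[0] == '\\n'")); // if (slice.IsEmpty || slice[0] == '\n')
Console.WriteLine(EmitCharCheck(true, ""));                    // if (slice.IsEmpty || false)
```

The redundant `|| false` does not change what the condition evaluates to; it only looks odd in the generated source.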

@stephentoub
Member Author

/ba-g dead letter

@stephentoub stephentoub merged commit a166765 into dotnet:main Jul 28, 2025
85 of 88 checks passed
@stephentoub stephentoub deleted the notonealternation branch July 28, 2025 13:55
@github-actions github-actions bot locked and limited conversation to collaborators Aug 28, 2025