perf: reduce allocations handling terms #8116

anderseknert · 2025-12-08T21:13:51Z

Reduce allocations in various methods working on ast.Term's:

1 alloc less needed in Set.Copy(!)
1 alloc less needed for Ref.Ptr
2 allocs less needed in *object.MergeWith
Remove .Copy call in all the various ref prefix functions

The latter is likely the only "controversial" change, but reading the docstrings for these functions made no mention of returing deep copies, and importantly, we had no code that depended on them doing so. Meaning we paid a cost without reaping benefits from that. If future callers want copies, they can copy.

Also added relevant micro-benchmarks.

netlify · 2025-12-08T21:16:10Z

✅ Deploy Preview for openpolicyagent ready!

Name	Link
🔨 Latest commit	`6c6792d`
🔍 Latest deploy log	https://app.netlify.com/projects/openpolicyagent/deploys/69373f91cbeb440008124226
😎 Deploy Preview	https://deploy-preview-8116--openpolicyagent.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

netlify · 2025-12-08T21:18:32Z

✅ Deploy Preview for openpolicyagent ready!

Name	Link
🔨 Latest commit	`a03fd05`
🔍 Latest deploy log	https://app.netlify.com/projects/openpolicyagent/deploys/6943d19417685e0008c2bc55
😎 Deploy Preview	https://deploy-preview-8116--openpolicyagent.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

alex60217101990 · 2025-12-09T06:25:43Z

Great suggestion!

However, the method func (ref Ref) Ptr() (string, error) could be slightly improved, as the current implementation has a couple of issues:

Iterates twice: once for size calculation, once for writing
l += len(str) doesn't account for escaping. url.PathEscape can expand strings (e.g., space → %20, 1 char → 3 chars)
tail[i].Value.(String) checked in both loops

What I mean by point 2:

str := "hello world"  // len = 11
// After PathEscape: "hello%20world" // len = 13
// Buffer will be undersized and need to grow!

We could implement this method with a single loop iteration and correct buffer length calculation:

func (ref Ref) Ptr() (string, error) {
    tail := ref[1:]
    if len(tail) == 0 {
        return "", nil
    }

    // Estimate buffer size: original length + separators + ~30% for escaping
    estSize := 0
    for i := range tail {
        if str, ok := tail[i].Value.(String); ok {
            estSize += len(str)
        } else {
            return "", errors.New("invalid path value type")
        }
    }
    estSize += len(tail) - 1  // separators ('/')
    estSize += estSize / 3    // ~30% overhead for potential escaping

    var buf strings.Builder
    buf.Grow(estSize)

    for i := range tail {
        if i > 0 {
            buf.WriteByte('/')
        }
        str := string(tail[i].Value.(String)) // safe: already validated above
        buf.WriteString(url.PathEscape(str))
    }
    return buf.String(), nil
}

Or even integrate the escaping to avoid intermediate allocations from url.PathEscape:

const upperhex = "0123456789ABCDEF"

// shouldEscapePath checks if byte needs escaping in URL path component
// RFC 3986: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
func shouldEscapePath(c byte) bool {
    if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
        return false
    }
    switch c {
    case '-', '_', '.', '~':
        return false
    }
    return true
}

func (ref Ref) Ptr() (string, error) {
    tail := ref[1:]
    if len(tail) == 0 {
        return "", nil
    }

    // Validate all terms and calculate size
    estSize := len(tail) - 1 // separators
    for i := range tail {
        str, ok := tail[i].Value.(String)
        if !ok {
            return "", errors.New("invalid path value type")
        }
        estSize += len(str) + len(str)/3 // string + ~30% escape overhead
    }

    var buf strings.Builder
    buf.Grow(estSize)

    for i := range tail {
        if i > 0 {
            buf.WriteByte('/')
        }
        s := string(tail[i].Value.(String))
        
        // Inline path escaping - avoids url.PathEscape allocations
        for j := 0; j < len(s); j++ {
            c := s[j]
            if shouldEscapePath(c) {
                buf.WriteByte('%')
                buf.WriteByte(upperhex[c>>4])
                buf.WriteByte(upperhex[c&15])
            } else {
                buf.WriteByte(c)
            }
        }
    }
    return buf.String(), nil
}

What do you think?

anderseknert · 2025-12-09T11:25:43Z

@alex60217101990 Hey! And thanks for your suggestion. I'm sorry though, but I'm not sure I understand your comment vs. your code. The code you posted also iterates twice over the parts, and I'm frankly not sure how any solution could correctly calculate the length to pre-allocate without it doing that.

As for the shouldEscapePath suggestion — yes, that doesn't sound too bad to me, at least not if it's to be used in more locations than here. Ideally we'd have a TextAppender provided by the stdlib and just use that, but at least there's some insight into why there isn't one provided here:
https://cs.opensource.google/go/go/+/refs/tags/go1.25.5:src/net/url/url.go;l=1240

I think an appender for URL path components in some util lib of ours (which relied on the logic from your example) would likely be a good option.

alex60217101990 · 2025-12-09T13:15:38Z

@anderseknert Okay, I agree with you.

One small question, wouldn't it perhaps be better if the object were allocated on the stack rather than the heap ?

buf := &strings.Builder{}  // forces heap allocation

Better:

var buf strings.Builder  // stack allocation (if it doesn't escape)

Also, we could start the loop in the method (ref Ref) GroundPrefix() Ref from 1 immediately:

func (ref Ref) GroundPrefix() Ref {
    if ref.IsGround() {
        return ref
    }

    // Head is always ground by definition, start checking from index 1
    for i := 1; i < len(ref); i++ {
        if !ref[i].IsGround() {
            return ref[:i] 
        }
    }

    return ref
}

alex60217101990 · 2025-12-10T07:02:29Z

In addition to your comment.

I looked at the code via the link you attached, thank you. Looking at Go's stdlib url.escape() implementation (source) To correctly pre-allocate the buffer, two passes are necessary. This is the same approach used by Go's standard library.

As suggested, creating a reusable PathEscapeAppender in a util package would be valuable if this pattern is used in multiple places.
Proposed utility for v1/util/escape.go:

package util

import "strings"

const upperhex = "0123456789ABCDEF"

// shouldEscapePathSegment checks if byte needs escaping in URL path segment
// Based on RFC 3986 and Go's url.shouldEscape(c, encodePathSegment)
func shouldEscapePathSegment(c byte) bool {
    // Unreserved characters (RFC 3986 §2.3)
    if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
        return false
    }
    switch c {
    case '-', '_', '.', '~':
        return false
    // Additional allowed in path segment
    case '!', '$', '&', '\'', '(', ')', '*', '+', '=', ':', '@':
        return false
    }
    return true
}

// AppendPathEscape appends the path-escaped form of s to buf.
// This avoids intermediate allocations compared to url.PathEscape.
func AppendPathEscape(buf *strings.Builder, s string) {
    for i := 0; i < len(s); i++ {
        c := s[i]
        if shouldEscapePathSegment(c) {
            buf.WriteByte('%')
            buf.WriteByte(upperhex[c>>4])
            buf.WriteByte(upperhex[c&15])
        } else {
            buf.WriteByte(c)
        }
    }
}

// CountPathEscapeLen returns the length of s after path escaping.
// Useful for pre-allocating buffers.
func CountPathEscapeLen(s string) int {
    n := 0
    for i := 0; i < len(s); i++ {
        if shouldEscapePathSegment(s[i]) {
            n += 3 // %XX
        } else {
            n++
        }
    }
    return n
}

And then the Ref.Ptr() method:

func (ref Ref) Ptr() (string, error) {
    tail := ref[1:]
    if len(tail) == 0 {
        return "", nil
    }

    // Calculate exact buffer size
    l := len(tail) - 1 // separators
    for i := range tail {
        str, ok := tail[i].Value.(String)
        if !ok {
            return "", errors.New("invalid path value type")
        }
        l += util.CountPathEscapeLen(string(str))
    }

    var buf strings.Builder
    buf.Grow(l)

    for i := range tail {
        if i > 0 {
            buf.WriteByte('/')
        }
        util.AppendPathEscape(&buf, string(tail[i].Value.(String)))
    }
    return buf.String(), nil
}

johanfylling

I'm not super concerned with dropping the copying if we don't rely on that internally. To be safe, maybe we should update the comment for public functions to let users know they're not getting a copy. Would this be something we want to mention as a possible breaking change for SDK users in the release notes? 🤔

johanfylling · 2025-12-17T16:17:51Z

v1/ast/term.go

-			parts = append(parts, url.PathEscape(string(str)))
-		} else {
+	buf := &strings.Builder{}
+	tail := ref[1:]


Interesting that we're dropping the first term. I suppose this is always the data/input root doc?

johanfylling · 2025-12-17T16:18:03Z

v1/ast/term.go

+		return true
+	case Var, Ref, *ArrayComprehension, *ObjectComprehension, *SetComprehension, Call:
+		return false
+	}


Reduce allocations in various methods working on ast.Term's: - 1 alloc less needed in `Set.Copy`(!) - 1 alloc less needed for `Ref.Ptr` - 2 allocs less needed in `*object.MergeWith` - Remove `.Copy` call in all the various ref prefix functions The latter is likely the only "controversial" change, but reading the docstrings for these functions made no mention of returing deep copies, and importantly, we had **no** code that depended on them doing so. Meaning we paid a cost without reaping benefits from that. If future callers want copies, they can copy. Also added relevant micro-benchmarks. Signed-off-by: Anders Eknert <[email protected]>

Signed-off-by: Johan Fylling <[email protected]>

anderseknert force-pushed the term-allocs branch from 6c6792d to 9a869d8 Compare December 8, 2025 21:15

johanfylling approved these changes Dec 17, 2025

View reviewed changes

anderseknert force-pushed the term-allocs branch from 9a869d8 to a03fd05 Compare December 18, 2025 10:04

anderseknert enabled auto-merge (squash) December 18, 2025 10:10

anderseknert merged commit f5c3743 into open-policy-agent:main Dec 18, 2025
31 checks passed

srenatus deleted the term-allocs branch December 18, 2025 10:19

johanfylling added a commit to johanfylling/opa that referenced this pull request Dec 18, 2025

Adding open-policy-agent#8116

5a11556

Signed-off-by: Johan Fylling <[email protected]>

BrewTestBot mentioned this pull request Dec 18, 2025

opa 1.12.0 Homebrew/homebrew-core#259206

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: reduce allocations handling terms #8116

perf: reduce allocations handling terms #8116

Uh oh!

anderseknert commented Dec 8, 2025

Uh oh!

netlify bot commented Dec 8, 2025

Uh oh!

netlify bot commented Dec 8, 2025 •

edited

Loading

Uh oh!

alex60217101990 commented Dec 9, 2025

Uh oh!

anderseknert commented Dec 9, 2025

Uh oh!

alex60217101990 commented Dec 9, 2025

Uh oh!

alex60217101990 commented Dec 10, 2025

Uh oh!

johanfylling left a comment

Uh oh!

johanfylling Dec 17, 2025

Uh oh!

johanfylling Dec 17, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

perf: reduce allocations handling terms #8116

perf: reduce allocations handling terms #8116

Uh oh!

Conversation

anderseknert commented Dec 8, 2025

Uh oh!

netlify bot commented Dec 8, 2025

✅ Deploy Preview for openpolicyagent ready!

Uh oh!

netlify bot commented Dec 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for openpolicyagent ready!

Uh oh!

alex60217101990 commented Dec 9, 2025

Uh oh!

anderseknert commented Dec 9, 2025

Uh oh!

alex60217101990 commented Dec 9, 2025

Uh oh!

alex60217101990 commented Dec 10, 2025

Uh oh!

johanfylling left a comment

Choose a reason for hiding this comment

Uh oh!

johanfylling Dec 17, 2025

Choose a reason for hiding this comment

Uh oh!

johanfylling Dec 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

netlify bot commented Dec 8, 2025 •

edited

Loading