feat: Support uniqueItems validation for arrays of complex objects #1563
Description
Problem Statement
Ogen currently does not support uniqueItems: true validation for arrays containing complex objects (objects, nested arrays). When encountering such schemas, ogen skips the operation entirely with the error:
INFO Skipping operation {"reason_error": "complex uniqueItems not implemented"}
This limitation prevents ogen from generating clients for many real-world OpenAPI specifications, including:
- Atlassian JIRA REST API v3 (~20 operations skipped, including workflow endpoints)
- Mist API (issue generator failing on uniqueItems #1507)
- Any API using arrays of objects with uniqueness constraints
Current Behavior
Code location: gen/schema_gen.go
if schema.UniqueItems {
item := schema.Item
if item == nil ||
item.Type == "" ||
item.Type == jsonschema.Array ||
item.Type == jsonschema.Object {
return nil, &ErrNotImplemented{Name: "complex uniqueItems"}
}
}
Impact: Operations containing these arrays are completely skipped, resulting in incomplete API client generation.
Root Cause Analysis
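Go's == operator cannot compare arbitrary structs: a struct is comparable only if every field is, which rules out slices and maps, and pointer fields compare by address rather than by value. The challenge is implementing value equality for:
- Structs with multiple fields
- Optional fields (ogen's OptT types)
- Pointer fields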
- Nested objects and arrays
- Maps
PR #887 (May 2023) added uniqueItems support for primitive comparable types only (string, int, bool, etc.), but complex types remain unimplemented.
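The comparability limitation can be illustrated with a small sketch. The types `Flat` and `WithSlice` are hypothetical and are not ogen output; the `Equal` method is a hand-written stand-in for what a generator might emit.

```go
package main

import "fmt"

// Flat has only comparable fields, so == works and it can be a map key.
type Flat struct {
	ID   string
	Name string
}

// WithSlice contains a slice, so `a == b` does not compile and the
// type cannot be used as a map key.
type WithSlice struct {
	ID   string
	Tags []string
}

// Equal compares two WithSlice values field by field (hypothetical
// stand-in for generated code).
func (a WithSlice) Equal(b WithSlice) bool {
	if a.ID != b.ID || len(a.Tags) != len(b.Tags) {
		return false
	}
	for i := range a.Tags {
		if a.Tags[i] != b.Tags[i] {
			return false
		}
	}
	return true
}

func main() {
	f1 := Flat{ID: "1", Name: "Open"}
	f2 := Flat{ID: "1", Name: "Open"}
	fmt.Println(f1 == f2) // true

	x := WithSlice{ID: "1", Tags: []string{"a"}}
	y := WithSlice{ID: "1", Tags: []string{"a"}}
	// fmt.Println(x == y) // compile error: struct containing []string cannot be compared
	fmt.Println(x.Equal(y)) // true
}
```

This is why the existing primitive-only support cannot simply be extended with a `comparable` constraint: object schemas with array or map properties fall outside it.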
Proposed Solution
Generate type-specific Equal() and Hash() methods for all schema types that appear in uniqueItems arrays, then use hash-based deduplication for O(n) average-case performance.
Architecture Overview
- Type Detection: Mark types needing equality methods during schema generation
- Method Generation: Generate Equal() and Hash() methods for each marked type
- Validation Integration: Use generated methods in array validation
- Fallback: Hash collisions handled by calling Equal() for verification
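The Type Detection step could be sketched as a walk over the type graph that marks every reachable type. The `Type` struct and `markEquality` function below are hypothetical simplifications and do not reflect ogen's actual IR; the visited map doubles as protection against recursive schemas at generation time.

```go
package main

import "fmt"

// Type is a hypothetical, simplified stand-in for a generator IR node.
// It is NOT ogen's actual ir.Type.
type Type struct {
	Name   string
	Fields []*Type // struct field types, or the item type of an array
}

// markEquality records t and every type reachable from it, so that
// Equal() and Hash() can later be emitted for each marked type.
// Already-marked types are skipped, which stops the walk on cycles.
func markEquality(t *Type, marked map[*Type]bool) {
	if t == nil || marked[t] {
		return
	}
	marked[t] = true
	for _, f := range t.Fields {
		markEquality(f, marked)
	}
}

func main() {
	category := &Type{Name: "StatusCategory"}
	status := &Type{Name: "WorkflowReferenceStatus", Fields: []*Type{category}}
	// Artificial cycle, to show that the walk still terminates.
	category.Fields = append(category.Fields, status)

	marked := map[*Type]bool{}
	markEquality(status, marked)
	fmt.Println(len(marked)) // 2
}
```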
Implementation Design
1. Generated Equal() Method
// Generated for a workflow status type
func (a WorkflowReferenceStatus) Equal(b WorkflowReferenceStatus) bool {
// Primitive fields - direct comparison
if a.ID != b.ID { return false }
if a.Name != b.Name { return false }
// OptString fields - ogen's optional wrapper
if a.Description.Set != b.Description.Set { return false }
if a.Description.Set && a.Description.Value != b.Description.Value {
return false
}
// Pointer fields
if (a.Category == nil) != (b.Category == nil) { return false }
if a.Category != nil && *a.Category != *b.Category {
return false
}
// Nested objects - recursive equality
if !a.StatusCategory.Equal(b.StatusCategory) { return false }
// Arrays - length check then element comparison
if len(a.Properties) != len(b.Properties) { return false }
for i := range a.Properties {
if a.Properties[i] != b.Properties[i] { return false }
}
return true
}
2. Generated Hash() Method
func (a WorkflowReferenceStatus) Hash() uint64 {
h := fnv.New64a()
// Primitive fields
h.Write([]byte(a.ID))
h.Write([]byte(a.Name))
// Optional fields - include presence marker
if a.Description.Set {
h.Write([]byte{1})
h.Write([]byte(a.Description.Value))
} else {
h.Write([]byte{0})
}
// Pointers
if a.Category != nil {
h.Write([]byte{1})
h.Write([]byte(*a.Category))
} else {
h.Write([]byte{0})
}
// Nested objects - incorporate their hash
binary.Write(h, binary.LittleEndian, a.StatusCategory.Hash())
// Arrays
for _, prop := range a.Properties {
h.Write([]byte(prop))
}
return h.Sum64()
}
3. Validation Function
// Generated validation function for arrays with complex uniqueItems
func validateUniqueWorkflowReferenceStatus(items []WorkflowReferenceStatus) error {
type entry struct {
index int
item WorkflowReferenceStatus
}
seen := make(map[uint64][]entry, len(items))
for i, item := range items {
hash := item.Hash()
// Check for duplicates with same hash
if entries, exists := seen[hash]; exists {
for _, e := range entries {
// Verify with Equal() to handle hash collisions
if e.item.Equal(item) {
return fmt.Errorf(
"duplicate item found at indices %d and %d",
e.index, i,
)
}
}
}
seen[hash] = append(seen[hash], entry{index: i, item: item})
}
return nil
}
Complexity: O(n) average case, O(n²) worst case with hash collisions
Field-Level Comparison Matrix
| Type Pattern | Equal() Logic | Hash() Logic |
|---|---|---|
| Primitives (string, int64, bool) | Direct == | Write bytes to FNV |
| OptT (OptString, OptInt64) | Compare Set flag, then Value | Write presence marker + value |
| Pointers (*string) | Nil check, then dereference | Write presence marker + dereferenced value |
| Arrays | Length check, element-by-element | Hash each element in order |
| Nested Objects | Recursive .Equal() call | Incorporate .Hash() result |
| Maps | Length, key existence, value comparison | Hash keys and values (order-independent) |
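The Maps row is the least obvious one, because Go randomizes map iteration order. One possible approach, sketched below under the assumption of a `map[string]string` field, is to visit keys in sorted order. The helper names `writeString` and `hashStringMap` are hypothetical. The length prefix is an extra precaution, not part of the proposal above: it keeps adjacent strings from running together ("ab"+"c" vs "a"+"bc") and so reduces avoidable collisions.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash"
	"hash/fnv"
	"sort"
)

// writeString writes a length-prefixed string into h so that field
// boundaries are unambiguous.
func writeString(h hash.Hash64, s string) {
	var n [8]byte
	binary.LittleEndian.PutUint64(n[:], uint64(len(s)))
	h.Write(n[:])
	h.Write([]byte(s))
}

// hashStringMap hashes m independently of iteration order by visiting
// its keys in sorted order.
func hashStringMap(m map[string]string) uint64 {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	for _, k := range keys {
		writeString(h, k)
		writeString(h, m[k])
	}
	return h.Sum64()
}

func main() {
	a := map[string]string{"x": "1", "y": "2"}
	b := map[string]string{"y": "2", "x": "1"}
	fmt.Println(hashStringMap(a) == hashStringMap(b)) // true
}
```

Sorting costs O(k log k) per map with k keys. Combining per-entry hashes with XOR or addition would avoid the sort, at the price of a weaker hash.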
Files to Modify/Create
Core Changes
- gen/schema_gen.go - Remove ErrNotImplemented for complex uniqueItems (~5 lines)
- gen/ir/validation.go - Track types needing equality methods (~20 lines)
- gen/gen_equality.go - NEW FILE: Generate Equal() and Hash() methods (~400 lines)
- gen/gen_validators.go - Generate validation calls for complex arrays (~50 lines)
- validate/array.go - Add complex uniqueItems validation logic (~30 lines)
Testing
- gen/gen_equality_test.go - NEW FILE: Test method generation (~200 lines)
- validate/array_test.go - Add complex uniqueItems test cases (~300 lines)
- Integration tests with real OpenAPI specs
Example OpenAPI Schema
components:
  schemas:
    WorkflowReferenceStatus:
      type: object
      properties:
        id:
          type: string
        name:
          type: string
        description:
          type: string
        statusCategory:
          $ref: '#/components/schemas/StatusCategory'
      required:
        - id
        - name
    Workflow:
      type: object
      properties:
        id:
          type: string
        statuses:
          type: array
          items:
            $ref: '#/components/schemas/WorkflowReferenceStatus'
          uniqueItems: true # ← Currently causes operation skip
Current behavior: Operation containing Workflow schema is skipped
Expected behavior: Full operation generation with runtime uniqueness validation
Edge Cases to Handle
- Empty arrays: Valid (no duplicates possible)
- Single element: Valid (no duplicates possible)
- Nil vs unset optional fields: Treated as different values
- Hash collisions: Must verify with Equal()
- Nested arrays: Recursive comparison
- Circular references: Potential stack overflow (may need cycle detection)
- Map field ordering: Order-independent hashing
- Large objects: Performance optimization may be needed
Testing Strategy
Unit Tests
func TestGeneratedEqual(t *testing.T) {
tests := []struct {
name string
a, b WorkflowReferenceStatus
wantEq bool
}{
{
name: "identical objects",
a: WorkflowReferenceStatus{ID: "1", Name: "Open"},
b: WorkflowReferenceStatus{ID: "1", Name: "Open"},
wantEq: true,
},
{
name: "different IDs",
a: WorkflowReferenceStatus{ID: "1", Name: "Open"},
b: WorkflowReferenceStatus{ID: "2", Name: "Open"},
wantEq: false,
},
{
name: "optional set vs unset",
a: WorkflowReferenceStatus{ID: "1", Description: OptString{Set: true, Value: "A"}},
b: WorkflowReferenceStatus{ID: "1", Description: OptString{Set: false}},
wantEq: false,
},
{
name: "both optional unset",
a: WorkflowReferenceStatus{ID: "1", Description: OptString{Set: false}},
b: WorkflowReferenceStatus{ID: "1", Description: OptString{Set: false}},
wantEq: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := tt.a.Equal(tt.b)
if got != tt.wantEq {
t.Errorf("Equal() = %v, want %v", got, tt.wantEq)
}
// Verify hash consistency
if tt.wantEq && tt.a.Hash() != tt.b.Hash() {
t.Error("Equal items must have equal hashes")
}
})
}
}
Integration Tests
- JIRA API workflow operations
- Nested object arrays
- Mixed primitive and complex fields
- Performance benchmarks (1k, 10k, 100k items)
Performance Expectations
| Array Size | Unique Items | Expected Time |
|---|---|---|
| 100 | Yes | ~0.1ms |
| 1,000 | Yes | ~1ms |
| 10,000 | Yes | ~10ms |
| 100 | With duplicates | ~0.5ms (early detection) |
Migration Path
This is a non-breaking change:
- Existing primitive uniqueItems validation continues to work
- Complex types that were previously skipped will now be generated
- No changes needed to existing generated code
Implementation Effort Estimate
| Phase | Description | Effort |
|---|---|---|
| 1 | Equal() generation for all type patterns | 1-2 weeks |
| 2 | Hash() generation | 1 week |
| 3 | Validation integration | 3-5 days |
| 4 | Comprehensive testing | 1 week |
| Total | | 4-5 weeks |
Open Questions for Discussion
- Hash algorithm: Use FNV-1a (fast, good distribution) or a different hash function?
- Circular reference detection: Should we add cycle detection, or document this limitation?
- Map field hashing: Order-independent hashing required - sort keys first?
- Code size concerns: Generated Equal() methods could be large - acceptable tradeoff?
- Opt-in flag: Should this be enabled by default or require a config flag?
- Generic constraints: Use Go 1.18+ generics for the validation function signature?
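On the generics question, a single generic validator is one option instead of one generated function per type. The sketch below assumes Go 1.18+; the names `hashEqualer`, `validateUnique`, and `status` are hypothetical. It also stores indices rather than copies of each item, which keeps the bookkeeping small for large objects.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashEqualer is a hypothetical constraint satisfied by any type that
// carries generated Equal() and Hash() methods.
type hashEqualer[T any] interface {
	Equal(T) bool
	Hash() uint64
}

// validateUnique reports the first pair of equal items. Items are
// bucketed by hash; Equal() resolves hash collisions.
func validateUnique[T hashEqualer[T]](items []T) error {
	seen := make(map[uint64][]int, len(items)) // hash -> indices seen so far
	for i, item := range items {
		h := item.Hash()
		for _, j := range seen[h] {
			if items[j].Equal(item) {
				return fmt.Errorf("duplicate item found at indices %d and %d", j, i)
			}
		}
		seen[h] = append(seen[h], i)
	}
	return nil
}

// status is a small example type with hand-written methods.
type status struct {
	ID   string
	Tags []string
}

func (a status) Equal(b status) bool {
	if a.ID != b.ID || len(a.Tags) != len(b.Tags) {
		return false
	}
	for i := range a.Tags {
		if a.Tags[i] != b.Tags[i] {
			return false
		}
	}
	return true
}

func (a status) Hash() uint64 {
	h := fnv.New64a()
	h.Write([]byte(a.ID))
	for _, t := range a.Tags {
		h.Write([]byte(t))
	}
	return h.Sum64()
}

func main() {
	items := []status{{ID: "1"}, {ID: "2"}, {ID: "1"}}
	fmt.Println(validateUnique(items)) // duplicate item found at indices 0 and 2
}
```

A generic function like this could live once in the validate package, leaving only the Equal() and Hash() methods to be generated per type.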
Real-World Impact
Atlassian JIRA REST API v3
- Operations affected: ~20 (including readWorkflows, createWorkflow, etc.)
- Current workaround: Use openapi-generator instead of ogen
- With this feature: these operations would be generated instead of skipped
Benefits
- ✅ Complete OpenAPI 3.0 compliance for uniqueItems
- ✅ Runtime validation catches duplicate items
- ✅ Type-safe, generated code (no reflection)
- ✅ Performance: O(n) average case
- ✅ Enables ogen adoption for more APIs
References
- Related: validator, gen: support uniqueItems array validation #132 (primitive uniqueItems - implemented in PR feat(gen): support uniqueItems validator #887)
- Related: generator failing on uniqueItems #1507 (uniqueItems validation error on object type)
- Related: gen: net.HardwareAddr does not satisfy comparable #1561 (comparable constraint issue with net.HardwareAddr)
- OpenAPI 3.0 Spec: uniqueItems
- JSON Schema: uniqueItems validation
Proposed by
This enhancement request is being created to document the design for a community-contributed implementation. The implementation will be developed in a fork and submitted as a PR for review.