feat: Support uniqueItems validation for arrays of complex objects #1563

## Problem Statement

Ogen currently does not support `uniqueItems: true` validation for arrays whose items are complex types (objects or nested arrays). When it encounters such a schema, ogen skips the entire operation and logs:

```
INFO  Skipping operation  {"reason_error": "complex uniqueItems not implemented"}
```

This limitation prevents ogen from generating complete clients for many real-world OpenAPI specifications, including:
- **Atlassian JIRA REST API v3** (~20 operations skipped, including workflow endpoints)
- **Mist API** (issue #1507)
- Any API using arrays of objects with uniqueness constraints

## Current Behavior

**Code location**: `gen/schema_gen.go`

```go
if schema.UniqueItems {
    item := schema.Item
    if item == nil ||
        item.Type == "" ||
        item.Type == jsonschema.Array ||
        item.Type == jsonschema.Object {
        return nil, &ErrNotImplemented{Name: "complex uniqueItems"}
    }
}
```

**Impact**: Operations containing these arrays are completely skipped, resulting in incomplete API client generation.

## Root Cause Analysis

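Go only permits `==` on structs whose fields are all comparable. A struct that contains a slice or a map does not compile with `==`, and `==` on a pointer field compares addresses rather than the values pointed to. Equality checking therefore has to be implemented explicitly for: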
- Structs with multiple fields
- Optional fields (ogen's `OptT` types)
- Pointer fields
- Nested objects and arrays
- Maps

PR #887 (May 2023) added `uniqueItems` support for **primitive comparable types only** (string, int, bool, etc.), but complex types remain unimplemented.
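
To make the distinction concrete, the sketch below uses `reflect.Type.Comparable` to show which struct shapes Go accepts with `==`. The types `flatStatus`, `taggedStatus`, and `pointerStatus` are hypothetical stand-ins for illustration, not types that ogen generates.

```go
package main

import (
	"fmt"
	"reflect"
)

// flatStatus has only comparable fields, so == compiles and compares by value.
type flatStatus struct {
	ID   string
	Name string
}

// taggedStatus contains a slice, so `a == b` would be a compile-time error.
type taggedStatus struct {
	ID   string
	Tags []string
}

// pointerStatus is comparable, but == compares the pointer address,
// not the string it points to.
type pointerStatus struct {
	ID       string
	Category *string
}

// isComparable reports whether values of v's type may be used with ==.
func isComparable(v any) bool {
	return reflect.TypeOf(v).Comparable()
}

func main() {
	fmt.Println(isComparable(flatStatus{}))    // true
	fmt.Println(isComparable(taggedStatus{}))  // false
	fmt.Println(isComparable(pointerStatus{})) // true

	x, y := "done", "done"
	a := pointerStatus{ID: "1", Category: &x}
	b := pointerStatus{ID: "1", Category: &y}
	fmt.Println(a == b) // false: same content, different addresses
}
```

The last case is why "comparable" alone is not a sufficient test: a struct with pointer fields compiles with `==` but does not give the value equality that `uniqueItems` requires.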

## Proposed Solution

Generate type-specific `Equal()` and `Hash()` methods for all schema types that appear in `uniqueItems` arrays, then use hash-based deduplication for O(n) average-case performance.

### Architecture Overview

1. **Type Detection**: Mark types needing equality methods during schema generation
2. **Method Generation**: Generate `Equal()` and `Hash()` methods for each marked type
3. **Validation Integration**: Use generated methods in array validation
4. **Fallback**: Hash collisions handled by calling `Equal()` for verification

## Implementation Design

### 1. Generated Equal() Method

```go
// Generated for a workflow status type
func (a WorkflowReferenceStatus) Equal(b WorkflowReferenceStatus) bool {
    // Primitive fields - direct comparison
    if a.ID != b.ID { return false }
    if a.Name != b.Name { return false }
    
    // OptString fields - ogen's optional wrapper
    if a.Description.Set != b.Description.Set { return false }
    if a.Description.Set && a.Description.Value != b.Description.Value {
        return false
    }
    
    // Pointer fields
    if (a.Category == nil) != (b.Category == nil) { return false }
    if a.Category != nil && *a.Category != *b.Category {
        return false
    }
    
    // Nested objects - recursive equality
    if !a.StatusCategory.Equal(b.StatusCategory) { return false }
    
    // Arrays - length check then element comparison
    if len(a.Properties) != len(b.Properties) { return false }
    for i := range a.Properties {
        if a.Properties[i] != b.Properties[i] { return false }
    }
    
    return true
}
```

### 2. Generated Hash() Method

```go
import (
    "encoding/binary"
    "hash/fnv"
)

func (a WorkflowReferenceStatus) Hash() uint64 {
    h := fnv.New64a() // Write on a hash.Hash never returns an error
    
    // Primitive fields
    h.Write([]byte(a.ID))
    h.Write([]byte(a.Name))
    
    // Optional fields - include presence marker
    if a.Description.Set {
        h.Write([]byte{1})
        h.Write([]byte(a.Description.Value))
    } else {
        h.Write([]byte{0})
    }
    
    // Pointers - presence marker, then the dereferenced value
    if a.Category != nil {
        h.Write([]byte{1})
        h.Write([]byte(*a.Category))
    } else {
        h.Write([]byte{0})
    }
    
    // Nested objects - incorporate their hash
    binary.Write(h, binary.LittleEndian, a.StatusCategory.Hash())
    
    // Arrays - hash the length first so that [] and [""] differ,
    // then each element in order
    binary.Write(h, binary.LittleEndian, uint64(len(a.Properties)))
    for _, prop := range a.Properties {
        h.Write([]byte(prop))
    }
    
    return h.Sum64()
}
```
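
One detail worth considering: writing variable-length fields back to back makes field boundaries ambiguous, so `ID="ab", Name="c"` and `ID="a", Name="bc"` feed identical bytes to the hasher. The `Equal()` fallback keeps validation correct, but these collisions are avoidable. A minimal sketch, assuming a hypothetical helper `writeString` that length-prefixes each string (the helper and the two `hash*` functions are illustrative only):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash"
	"hash/fnv"
)

// writeString is a hypothetical helper: it writes the length of s
// followed by its bytes, so adjacent fields cannot blur together.
func writeString(h hash.Hash64, s string) {
	var n [8]byte
	binary.LittleEndian.PutUint64(n[:], uint64(len(s)))
	h.Write(n[:])
	h.Write([]byte(s))
}

// hashNaive concatenates the fields without any delimiter.
func hashNaive(id, name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(id))
	h.Write([]byte(name))
	return h.Sum64()
}

// hashPrefixed length-prefixes every field before hashing it.
func hashPrefixed(id, name string) uint64 {
	h := fnv.New64a()
	writeString(h, id)
	writeString(h, name)
	return h.Sum64()
}

func main() {
	// Both calls hash the bytes "abc", so the results are always equal.
	fmt.Println(hashNaive("ab", "c") == hashNaive("a", "bc")) // true

	// The hashed byte streams now differ, so the collision is no longer structural.
	fmt.Println(hashPrefixed("ab", "c") == hashPrefixed("a", "bc"))
}
```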

### 3. Validation Function

```go
import "fmt"

// Generated validation function for arrays with complex uniqueItems
func validateUniqueWorkflowReferenceStatus(items []WorkflowReferenceStatus) error {
    type entry struct {
        index int
        item  WorkflowReferenceStatus
    }
    seen := make(map[uint64][]entry, len(items))
    
    for i, item := range items {
        hash := item.Hash()
        
        // Check for duplicates with same hash
        if entries, exists := seen[hash]; exists {
            for _, e := range entries {
                // Verify with Equal() to handle hash collisions
                if e.item.Equal(item) {
                    return fmt.Errorf(
                        "duplicate item found at indices %d and %d",
                        e.index, i,
                    )
                }
            }
        }
        
        seen[hash] = append(seen[hash], entry{index: i, item: item})
    }
    return nil
}
```

**Complexity**: O(n) hash and map operations on average; O(n²) `Equal()` calls in the worst case, when every item lands in the same hash bucket
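
To see why the `Equal()` check inside the bucket matters, here is a self-contained sketch with a deliberately bad hash. The `pair` type and its constant `Hash()` are hypothetical and exist only to force every item into one bucket; the loop mirrors the validation function above but stores indices rather than copies of the items.

```go
package main

import "fmt"

// pair is a hypothetical item type used only for this illustration.
type pair struct{ A, B string }

func (p pair) Equal(o pair) bool { return p.A == o.A && p.B == o.B }

// Hash is intentionally constant, so every item collides.
func (p pair) Hash() uint64 { return 42 }

// validateUniquePairs buckets items by hash and confirms each
// candidate duplicate with Equal.
func validateUniquePairs(items []pair) error {
	seen := make(map[uint64][]int, len(items)) // hash -> indices seen so far
	for i, item := range items {
		h := item.Hash()
		for _, j := range seen[h] {
			if items[j].Equal(item) {
				return fmt.Errorf("duplicate item found at indices %d and %d", j, i)
			}
		}
		seen[h] = append(seen[h], i)
	}
	return nil
}

func main() {
	// All hashes collide, but no two items are equal.
	fmt.Println(validateUniquePairs([]pair{{"a", "b"}, {"b", "a"}}))
	// <nil>

	fmt.Println(validateUniquePairs([]pair{{"a", "b"}, {"c", "d"}, {"a", "b"}}))
	// duplicate item found at indices 0 and 2
}
```

Keeping only indices in the map avoids copying each struct a second time, which may matter for large generated types.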

## Field-Level Comparison Matrix

| Type Pattern | Equal() Logic | Hash() Logic |
|-------------|---------------|--------------|
| Primitives (`string`, `int64`, `bool`) | Direct `==` | Write bytes to FNV |
| OptT (`OptString`, `OptInt64`) | Compare `Set` flag, then `Value` | Write presence marker + value |
| Pointers (`*string`) | Nil check, then dereference | Write presence marker + dereferenced value |
| Arrays | Length check, element-by-element | Hash each element in order |
| Nested Objects | Recursive `.Equal()` call | Incorporate `.Hash()` result |
| Maps | Length, key existence, value comparison | Hash keys and values (order-independent) |

## Files to Modify/Create

### Core Changes
1. **`gen/schema_gen.go`** - Remove `ErrNotImplemented` for complex uniqueItems (~5 lines)
2. **`gen/ir/validation.go`** - Track types needing equality methods (~20 lines)
3. **`gen/gen_equality.go`** - NEW FILE: Generate Equal() and Hash() methods (~400 lines)
4. **`gen/gen_validators.go`** - Generate validation calls for complex arrays (~50 lines)
5. **`validate/array.go`** - Add complex uniqueItems validation logic (~30 lines)

### Testing
6. **`gen/gen_equality_test.go`** - NEW FILE: Test method generation (~200 lines)
7. **`validate/array_test.go`** - Add complex uniqueItems test cases (~300 lines)
8. Integration tests with real OpenAPI specs

## Example OpenAPI Schema

```yaml
components:
  schemas:
    WorkflowReferenceStatus:
      type: object
      properties:
        id:
          type: string
        name:
          type: string
        description:
          type: string
        statusCategory:
          $ref: '#/components/schemas/StatusCategory'
      required:
        - id
        - name
    
    Workflow:
      type: object
      properties:
        id:
          type: string
        statuses:
          type: array
          items:
            $ref: '#/components/schemas/WorkflowReferenceStatus'
          uniqueItems: true  # ← Currently causes operation skip
```

**Current behavior**: Operation containing `Workflow` schema is skipped  
**Expected behavior**: Full operation generation with runtime uniqueness validation

## Edge Cases to Handle

1. **Empty arrays**: Valid (no duplicates possible)
2. **Single element**: Valid (no duplicates possible)
3. **Nil vs unset optional fields**: Treated as different values
4. **Hash collisions**: Must verify with `Equal()`
5. **Nested arrays**: Recursive comparison
6. **Circular references**: Potential stack overflow (may need cycle detection)
7. **Map field ordering**: Order-independent hashing
8. **Large objects**: Performance optimization may be needed
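
On item 6, it may help to separate recursive schemas from cyclic values. A value decoded from JSON is a finite tree, so a recursive `Equal()` stops at the first nil pointer; a cycle can only appear if the caller builds one in memory. A sketch with a hypothetical self-referential `Node` type (ogen's actual representation of recursive schemas may differ):

```go
package main

import "fmt"

// Node is a hypothetical recursive type used only for illustration.
type Node struct {
	Name  string
	Child *Node
}

// Equal recurses through Child and terminates when either side is nil.
func (a *Node) Equal(b *Node) bool {
	if a == nil || b == nil {
		return a == b // equal only if both are nil
	}
	return a.Name == b.Name && a.Child.Equal(b.Child)
}

func main() {
	x := &Node{Name: "root", Child: &Node{Name: "leaf"}}
	y := &Node{Name: "root", Child: &Node{Name: "leaf"}}
	z := &Node{Name: "root"}
	fmt.Println(x.Equal(y), x.Equal(z)) // true false
}
```

If hand-built cyclic values must be supported, this recursion would not terminate, which is where cycle detection or a documented limitation comes in.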

## Testing Strategy

### Unit Tests
```go
import "testing"

func TestGeneratedEqual(t *testing.T) {
    tests := []struct {
        name    string
        a, b    WorkflowReferenceStatus
        wantEq  bool
    }{
        {
            name: "identical objects",
            a:    WorkflowReferenceStatus{ID: "1", Name: "Open"},
            b:    WorkflowReferenceStatus{ID: "1", Name: "Open"},
            wantEq: true,
        },
        {
            name: "different IDs",
            a:    WorkflowReferenceStatus{ID: "1", Name: "Open"},
            b:    WorkflowReferenceStatus{ID: "2", Name: "Open"},
            wantEq: false,
        },
        {
            name: "optional set vs unset",
            a:    WorkflowReferenceStatus{ID: "1", Description: OptString{Set: true, Value: "A"}},
            b:    WorkflowReferenceStatus{ID: "1", Description: OptString{Set: false}},
            wantEq: false,
        },
        {
            name: "both optional unset",
            a:    WorkflowReferenceStatus{ID: "1", Description: OptString{Set: false}},
            b:    WorkflowReferenceStatus{ID: "1", Description: OptString{Set: false}},
            wantEq: true,
        },
    }
    
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := tt.a.Equal(tt.b)
            if got != tt.wantEq {
                t.Errorf("Equal() = %v, want %v", got, tt.wantEq)
            }
            
            // Verify hash consistency
            if tt.wantEq && tt.a.Hash() != tt.b.Hash() {
                t.Error("Equal items must have equal hashes")
            }
        })
    }
}
```

### Integration Tests
- JIRA API workflow operations
- Nested object arrays
- Mixed primitive and complex fields
- Performance benchmarks (1k, 10k, 100k items)

## Performance Expectations

| Array Size | Contents | Estimated Time |
|-----------|--------------|-------------|
| 100 | All unique | ~0.1ms |
| 1,000 | All unique | ~1ms |
| 10,000 | All unique | ~10ms |
| 100 | With duplicates | ~0.5ms (early detection) |

## Migration Path

This is a **non-breaking change**:
- Existing primitive `uniqueItems` validation continues to work
- Complex types that were previously skipped will now be generated
- No changes needed to existing generated code

## Implementation Effort Estimate

| Phase | Description | Effort |
|-------|-------------|--------|
| 1 | Equal() generation for all type patterns | 1-2 weeks |
| 2 | Hash() generation | 1 week |
| 3 | Validation integration | 3-5 days |
| 4 | Comprehensive testing | 1 week |
| **Total** | | **4-5 weeks** |

## Open Questions for Discussion

1. **Hash algorithm**: Use FNV-1a (fast, good distribution) or a different hash function?
2. **Circular reference detection**: Should we add cycle detection, or document this limitation?
3. **Map field hashing**: Order-independent hashing required - sort keys first?
4. **Code size concerns**: Generated Equal() methods could be large - acceptable tradeoff?
5. **Opt-in flag**: Should this be enabled by default or require a config flag?
6. **Generic constraints**: Use Go 1.18+ generics for the validation function signature?
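
On question 6, a single generic function could replace the per-type validation functions. A sketch, assuming a hypothetical `Hashable` constraint (not an existing ogen interface) that the generated `Equal()` and `Hash()` methods would satisfy; the `status` type is likewise illustrative:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hashable is a hypothetical constraint for types with generated
// Equal and Hash methods.
type Hashable[T any] interface {
	Equal(T) bool
	Hash() uint64
}

// ValidateUnique reports the first pair of equal items, if any.
func ValidateUnique[T Hashable[T]](items []T) error {
	seen := make(map[uint64][]int, len(items)) // hash -> indices seen so far
	for i, item := range items {
		h := item.Hash()
		for _, j := range seen[h] {
			if items[j].Equal(item) {
				return fmt.Errorf("duplicate item found at indices %d and %d", j, i)
			}
		}
		seen[h] = append(seen[h], i)
	}
	return nil
}

// status stands in for a generated schema type.
type status struct{ ID, Name string }

func (a status) Equal(b status) bool { return a == b }

func (a status) Hash() uint64 {
	h := fnv.New64a()
	h.Write([]byte(a.ID))
	h.Write([]byte{0}) // separator between fields
	h.Write([]byte(a.Name))
	return h.Sum64()
}

func main() {
	ok := []status{{"1", "Open"}, {"2", "Open"}}
	dup := []status{{"1", "Open"}, {"1", "Open"}}
	fmt.Println(ValidateUnique(ok))  // <nil>
	fmt.Println(ValidateUnique(dup)) // duplicate item found at indices 0 and 1
}
```

With this shape the function could live once in the `validate` package, and generated code would only need to supply the two methods per type.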

## Real-World Impact

### Atlassian JIRA REST API v3
- **Operations affected**: ~20 (including `readWorkflows`, `createWorkflow`, etc.)
- **Current workaround**: Use openapi-generator instead of ogen
- **With this feature**: The affected operations are generated instead of skipped

### Benefits
- ✅ Complete OpenAPI 3.0 compliance for `uniqueItems`
- ✅ Runtime validation catches duplicate items
- ✅ Type-safe, generated code (no reflection)
- ✅ Performance: O(n) average case
- ✅ Enables ogen adoption for more APIs

## References

- Related: #132 (primitive uniqueItems - implemented in PR #887)
- Related: #1507 (uniqueItems validation error on object type)
- Related: #1561 (comparable constraint issue with net.HardwareAddr)
- OpenAPI 3.0 Spec: [uniqueItems](https://swagger.io/docs/specification/data-models/data-types/#uniqueItems)
- JSON Schema: [uniqueItems validation](https://datatracker.ietf.org/doc/html/draft-wright-json-schema-validation-00#section-5.12)

## Proposed by

This enhancement request is being created to document the design for a community-contributed implementation. The implementation will be developed in a fork and submitted as a PR for review.

