Asynchronous API for Large-Scale Processing

Authors: Praveen Gupta, Pankaj Joshi
Date: January 16, 2026

Executive Summary

This document outlines the design and implementation strategy for an asynchronous API endpoint that enables the Processing System to process large invoice transactions containing 100,000+ line items. The solution leverages Hub as a publish-subscribe mechanism to handle asynchronous processing across multi-cloud environments (AWS and OCI), ensuring scalability, reliability, and efficient communication with external customers.

1. Introduction

1.1 Current State

The Processing System is a SaaS-based platform that processes customer transactions. Currently deployed across AWS and OCI cloud infrastructure, the engine serves customers through a REST endpoint:

/process – Synchronous processing with database persistence

The existing architecture handles up to 5,000 line items per invoice synchronously with excellent performance. However, enterprise customers require the ability to process significantly larger transactions, up to 100,000+ line items (approximately 20 MB payload size), which necessitates an asynchronous processing model.

1.2 Business Challenge

Processing 100k+ line invoices synchronously presents several challenges:

  1. Timeout Issues – Extended processing times exceed typical HTTP timeout thresholds
  2. Resource Contention – Long-running synchronous requests block critical resources
  3. Customer Experience – Clients waiting for responses face degraded user experience
  4. Processing Accuracy – All lines must be processed together as line interactions affect total calculations

1.3 Solution Overview

The proposed solution introduces a new asynchronous endpoint POST /api/defer that:

  • Accepts large invoice payloads (100k+ lines)
  • Processes transactions asynchronously in the background
  • Notifies customers of completion via Hub publish-subscribe mechanism
  • Operates seamlessly across both AWS and OCI environments
  • Eliminates the need for status polling databases

2. Architecture Overview

2.1 High-Level Architecture

The asynchronous processing architecture integrates the existing Processing System with Hub to enable event-driven communication:

Figure 1.1 High Level Architecture 

2.2 Component Breakdown

Component | Technology | Purpose
API Gateway | Apigee | OAuth authentication, rate limiting, routing
Message Queue | AWS SQS / OCI Queue | Decouples request acceptance from processing
ML Service | Python / TensorFlow / scikit-learn | Job prioritization, anomaly detection, predictive scaling
Async Processor | Spring Boot worker | Processes queued jobs, invokes processing engine
Processing Engine | Java 21 / Spring Boot | Core processing logic
Hub | Azure Event Grid | Pub-sub messaging for completion events
Database | PostgreSQL / Oracle | Configuration and content data
Container Platform | ECS (AWS) / OKE (OCI) | Auto-scaling compute infrastructure

Table 1.1 Component Breakdown

3. Detailed Processing Flow

3.1 Asynchronous Processing Flow

Figure 1.2 Processing Flow

3.2 Hub Integration

Figure 1.3 Hub Integration

4. Technical Implementation Details

4.1 API Endpoint Specification

Endpoint: POST /api/defer

Request Headers:

Authorization: Bearer <OAuth-Token>
Content-Type: application/json
X-Request-ID: <UUID>

Snippet 1.1 Request Headers

Request Body:

{
  "transactionType": "PROCESS",
  "invoice": {
    "documentCode": "INV-2026-001234",
    "documentDate": "2026-01-15",
    "customerCode": "CUST-XYZ",
    "lines": [
      {
        "lineNumber": 1,
        "itemCode": "PROD-001",
        "quantity": 100,
        "amount": 5000.00,
        "originAddress": {},
        "destinationAddress": {}
      }
      // 99,999 more lines
    ]
  },
  "callbackUrl": "https://customer.com/webhooks/results"
}

Snippet 1.2 Request Body

Response (202 Accepted):

{
  "jobId": "job-uuid-12345",
  "status": "QUEUED",
  "estimatedCompletionTime": "2026-01-15T14:35:00Z",
  "statusCheckUrl": "https://api.engine.com/api/defer/job-uuid-12345"
}

Snippet 1.3 Response

4.2 Hub Event Schema

Event Type: processing.complete

Event Payload:

{
  "eventId": "evt-uuid-67890",
  "eventType": "processing.complete",
  "timestamp": "2026-01-15T14:32:15Z",
  "jobId": "job-uuid-12345",
  "status": "SUCCESS",
  "data": {
    "documentCode": "INV-2026-001234",
    "totalAmount": 5012456.78,
    "processingTimeMs": 45000,
    "linesProcessed": 100000,
    "resultUrl": "https://api.engine.com/api/defer/job-uuid-12345/result"
  }
}

Snippet 1.4 Payload
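Subscribers should treat incoming events defensively. A minimal structural check a webhook consumer might run before acting on a processing.complete event is sketched below; the required-field list comes from Snippet 1.4, while the helper name and the decision to raise ValueError are illustrative assumptions (signature verification is out of scope here):

```python
REQUIRED_FIELDS = {"eventId", "eventType", "timestamp", "jobId", "status"}

def validate_event(event):
    """Reject events missing required fields or carrying an unexpected type.

    Field names follow the Hub event schema (Snippet 1.4); this is a
    structural sanity check only, not cryptographic verification.
    """
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"event missing fields: {sorted(missing)}")
    if event["eventType"] != "processing.complete":
        raise ValueError(f"unexpected eventType: {event['eventType']}")
    return True
```

A consumer would call this at the top of its webhook handler and return a 4xx response on failure so Hub's retry mechanism does not redeliver a malformed event indefinitely.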

4.3 Multi-Cloud Deployment Strategy

Component | AWS Implementation | OCI Implementation
Compute | ECS with Fargate | OKE (Kubernetes)
Message Queue | Amazon SQS | OCI Queue Service
ML Service | SageMaker / ECS | OCI Data Science / OKE
Database | Amazon RDS (PostgreSQL) | Oracle Autonomous Database
Auto-Scaling | ECS Service Auto-Scaling | HPA (Horizontal Pod Autoscaler)
Networking | VPC, ALB | VCN, OCI Load Balancer
Monitoring | CloudWatch | OCI Monitoring

Table 1.2 Cloud Deployment Strategy

5. Key Design Decisions

5.1 Why Asynchronous Processing?

  • Scalability – Decoupling request acceptance from processing allows independent scaling
  • Resilience – Queue-based architecture provides retry capability and fault tolerance
  • Resource Optimization – Avoid thread blocking during long-running calculations
  • User Experience – Immediate acknowledgment prevents client timeout issues

5.2 Why Hub (Not Redis)?

Per requirements, Redis is explicitly excluded. Hub provides:

  • Managed Service – No infrastructure maintenance required
  • Multi-Cloud Support – Accessible from both AWS and OCI
  • Webhook Delivery – Native support for HTTP callbacks
  • Event Persistence – Guaranteed delivery with retry mechanisms
  • External Access – Customers outside TR network can subscribe
  • No Status Database Needed – Pub-sub eliminates polling requirement

5.3 Atomic Processing Requirement

All 100k lines must be processed together because:

  • Processing operations have interdependencies between line items
  • Calculations aggregate across lines
  • Business rules apply at invoice level
  • Results may vary based on total transaction value

Implication: No batch splitting—entire invoice processed as single unit.

6. AI/ML-Enhanced Capabilities

The architecture integrates machine learning to optimize performance, detect anomalies, and improve system intelligence:

6.1 Intelligent Job Prioritization

Objective: Optimize queue processing order based on predicted complexity and customer SLAs.

ML Model: Gradient Boosting Regressor trained on historical job metadata: 

  • Input Features: Line count, payload size, customer tier, time of day, product types
  • Output: Predicted processing time (seconds) 
  • Training Data: 6+ months of completed job metrics

Benefits: 

  • High-priority customers processed first 
  • Short jobs avoid blocking behind long-running jobs 
  • 25-40% improvement in average wait time

Implementation:

# Simplified ML prioritization logic: higher score = process sooner
def calculate_priority_score(job):
    predicted_time = ml_model.predict(job.features)
    sla_urgency = get_customer_sla_weight(job.customer_id)
    # Guard against zero/near-zero predictions; favor urgent customers and short jobs
    return (sla_urgency * 100) / max(predicted_time, 1.0)

Snippet 1.5 Implementation

6.2 Anomaly Detection

Objective: Identify suspicious or malformed transactions before expensive processing.

ML Model: Isolation Forest for unsupervised anomaly detection.

Detection Criteria: 

  • Unusual line-item patterns 
  • Abnormal amount distributions 
  • Suspicious geographic patterns 
  • Payload structure deviations

Action Workflow: 

  1. ML service scores incoming job (0-100 anomaly score) 
  2. Score > 80: Flag for manual review queue 
  3. Score 50-80: Process with enhanced logging 
  4. Score < 50: Normal processing
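The scoring-and-routing workflow above can be sketched as follows. This is a simplified illustration: the feature set, the min-max rescaling of the Isolation Forest decision function to a 0-100 range, and the function names are all assumptions, not the production implementation:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def train_detector(historical_features):
    """Fit an Isolation Forest on feature vectors extracted from past jobs
    (e.g. line count, amount distribution statistics)."""
    model = IsolationForest(contamination=0.05, random_state=42)
    model.fit(historical_features)
    return model

def anomaly_score_0_100(model, features):
    """Map the model's decision function to a 0-100 score (higher = more
    anomalous). The linear rescaling here is illustrative only."""
    raw = -model.decision_function(features.reshape(1, -1))[0]
    return float(np.clip((raw + 0.5) * 100, 0.0, 100.0))

def route_job(score):
    """Apply the threshold workflow from Section 6.2."""
    if score > 80:
        return "MANUAL_REVIEW"
    elif score >= 50:
        return "ENHANCED_LOGGING"
    return "NORMAL"
```

In practice the score calibration would be validated against labeled incidents so the 50/80 thresholds correspond to meaningful false-positive rates.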

Benefits: 

  • Prevent processing of corrupted/malicious data 
  • Reduce wasted compute resources 
  • Early fraud detection capabilities

6.3 Predictive Auto-Scaling

Objective: Proactively scale resources ahead of demand spikes.

ML Model: LSTM (Long Short-Term Memory) neural network for time-series forecasting:  

  • Input Features: Historical queue depth, time patterns, seasonal trends 
  • Output: Predicted job volume for next 15-60 minutes 
  • Retraining: Weekly with latest patterns

Scaling Logic:

# Scale workers ahead of forecasted demand (70% up / 30% down thresholds)
def adjust_capacity(predicted_volume, current_capacity, avg_throughput):
    target = max(1, round(predicted_volume / avg_throughput))
    if predicted_volume > current_capacity * 0.7:
        scale_up_workers(target=target)
    elif predicted_volume < current_capacity * 0.3:
        scale_down_workers(target=target)

Snippet 1.6 Scaling Logic

Benefits: 

  • 60% faster response to demand spikes vs. reactive scaling 
  • Reduced cold-start delays 
  • Cost optimization through predictive scale-down

6.4 Processing Optimization Insights

Objective: Continuously learn optimal processing strategies from execution patterns.

ML Approach: Reinforcement learning to optimize: 

  • Database connection pool sizing 
  • Batch processing chunk sizes 
  • Memory allocation strategies 
  • Parallel processing thread counts

Feedback Loop: 

  • Processor reports: job characteristics → chosen strategy → execution time 
  • ML service analyzes: which strategies perform best for which job types 
  • Model recommends: optimal configuration for incoming jobs

Benefits: 

  • Self-tuning performance optimization 
  • Automatic adaptation to changing workload patterns 
  • 15-30% processing time improvements
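The feedback loop above can be approximated with a bandit-style selector, a deliberately simplified stand-in for the reinforcement learning approach: track average execution time per (job type, strategy) pair, exploit the historically fastest strategy, and occasionally explore. Class and strategy names are illustrative assumptions:

```python
import random
from collections import defaultdict

class StrategySelector:
    """Simplified epsilon-greedy sketch of the Section 6.4 feedback loop:
    processor reports execution times, selector recommends the strategy
    with the lowest observed average time for that job type."""

    def __init__(self, strategies, epsilon=0.1):
        self.strategies = strategies                 # e.g. chunk/thread configs
        self.epsilon = epsilon                       # exploration rate
        self.stats = defaultdict(lambda: (0, 0.0))   # (count, total_ms)

    def choose(self, job_type):
        if random.random() < self.epsilon:
            return random.choice(self.strategies)    # explore
        def avg(strategy):
            count, total = self.stats[(job_type, strategy)]
            # Untried strategies score 0.0, so each gets tried at least once
            return total / count if count else 0.0
        return min(self.strategies, key=avg)         # exploit fastest

    def report(self, job_type, strategy, execution_ms):
        count, total = self.stats[(job_type, strategy)]
        self.stats[(job_type, strategy)] = (count + 1, total + execution_ms)
```

A full RL treatment would also model state (queue depth, payload characteristics), but even this simple scheme converges to the best fixed configuration per job type.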

6.5 ML Service Architecture

Figure 1.4 ML Service Architecture

6.6 Model Monitoring and Retraining

Continuous Improvement Pipeline: 

  1. Performance Tracking: Monitor prediction accuracy vs. actual outcomes 
  2. Drift Detection: Identify when model performance degrades (>10% accuracy drop) 
  3. Automated Retraining: Trigger weekly retraining with latest 90 days of data 
  4. A/B Testing: Deploy new models to 10% of traffic, validate before full rollout 
  5. Rollback Capability: Instant revert to previous model if issues detected

Metrics Dashboard: 

  • Prioritization accuracy: Target >85% 
  • Anomaly detection precision: Target >90% 
  • Forecast MAPE (Mean Absolute Percentage Error): Target <15% 
  • Processing optimization impact: Target 20%+ improvement
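Two of the dashboard checks above are straightforward to compute. The sketch below shows the forecast MAPE metric and the >10% accuracy-drop drift rule from the retraining pipeline; function names and the relative-drop interpretation of "10% accuracy drop" are assumptions:

```python
def mape(actuals, predictions):
    """Mean Absolute Percentage Error over paired actual/predicted values.
    Zero actuals are skipped to avoid division by zero."""
    pairs = [(a, p) for a, p in zip(actuals, predictions) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

def needs_retraining(baseline_accuracy, current_accuracy, drop_threshold=0.10):
    """Drift rule from Section 6.6: flag the model for retraining when
    accuracy degrades by more than 10% relative to its baseline."""
    return (baseline_accuracy - current_accuracy) / baseline_accuracy > drop_threshold
```

In the pipeline these would run on a scheduled job comparing predictions against actual outcomes, feeding the weekly retraining trigger.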

7. Performance and Scalability Considerations

7.1 Expected Performance Metrics

Metric | Target
Payload Size | Up to 20 MB (100k lines)
Processing Time | 30-90 seconds per job
Concurrent Jobs | 50+ simultaneous
Throughput | 5,000+ jobs/hour
Event Delivery | < 5 seconds after completion

Table 1.3 Performance Metrics

7.2 Scaling Strategy

  1. ML-Driven Predictive Scaling – Scale proactively based on forecasted demand (60% faster than reactive)
  2. Horizontal Scaling – Auto-scale processor workers based on queue depth
  3. Intelligent Load Distribution – ML prioritization ensures optimal resource utilization
  4. Message Queue – SQS/OCI Queue handles burst traffic automatically
  5. Database Connection Pooling – Reuse connections across processor instances
  6. Processing Engine Optimization – ML-recommended configurations for different workload types
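As a concrete illustration of item 2 above, a queue-depth heuristic for sizing the worker fleet might look like the following; the one-hour drain target and the min/max bounds are assumptions for the sketch:

```python
def desired_worker_count(queue_depth, jobs_per_worker_per_hour,
                         min_workers=2, max_workers=50):
    """Size the worker fleet to drain the current backlog within one hour,
    clamped to operational bounds. Inputs: current queue depth and the
    measured per-worker throughput."""
    needed = -(-queue_depth // jobs_per_worker_per_hour)  # ceiling division
    return max(min_workers, min(max_workers, needed))
```

The ML-driven predictive path (item 1) would feed a forecasted queue depth into the same function instead of the observed one, so reactive and predictive scaling share a single sizing rule.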

Conclusion

The asynchronous API solution addresses the critical business need for processing large-scale transactions while maintaining the accuracy and reliability customers expect from the platform. By leveraging Hub’s publish-subscribe architecture combined with AI/ML intelligence, the solution eliminates the complexity of status polling databases and provides a modern, event-driven integration pattern with self-optimizing capabilities.

This design ensures: 

  • Scalability across multi-cloud environments (AWS and OCI) 
  • Intelligence through ML-driven job prioritization, anomaly detection, and predictive scaling 
  • Reliability through queue-based processing and event-driven notifications 
  • Performance optimized for 100k+ line invoice processing with continuous ML optimization 
  • Simplicity for customers through webhook-based result delivery 
  • Proactive Operations with predictive scaling and automated performance tuning

The ML-enhanced implementation positions the Processing Engine to serve enterprise customers’ most demanding transaction processing requirements while continuously improving through learned insights and automated optimization.

Appendix: Customer Integration Example

Customers integrate by:

  1. Registering Webhook with Hub
  2. Submitting Request to /api/defer
  3. Receiving Job ID immediately
  4. Getting Notified via webhook when complete

Sample Customer Code (Python):

import requests
from flask import Flask, request

app = Flask(__name__)

# Submit async request
response = requests.post(
    'https://api.engine.com/api/defer',
    headers={'Authorization': 'Bearer token'},
    json={'invoice': invoice_data, 'callbackUrl': 'https://myapp.com/webhook'}
)

job_id = response.json()['jobId']
print(f"Job submitted: {job_id}")

# Webhook endpoint receives result
@app.route('/webhook', methods=['POST'])
def receive_results():
    event = request.json
    if event['status'] == 'SUCCESS':
        results = event['data']
        # Process results
    return '', 200

Snippet 1.7 Customer Code
