Author: Praveen Gupta, Pankaj Joshi
Date: January 16, 2026
Executive Summary
This document outlines the design and implementation strategy for an asynchronous API endpoint that enables the Processing System to process large invoice transactions containing 100,000+ line items. The solution leverages Hub as a publish-subscribe mechanism to handle asynchronous processing across multi-cloud environments (AWS and OCI), ensuring scalability, reliability, and efficient communication with external customers.
1. Introduction
1.1 Current State
The Processing System is a SaaS-based platform that processes customer transactions. Currently deployed across AWS and OCI cloud infrastructures, the engine serves customers through a REST endpoint:
/process – Synchronous processing with database persistence
The existing architecture handles up to 5,000 line items per invoice synchronously with excellent performance. However, enterprise customers require the ability to process significantly larger transactions of 100,000+ line items (approximately 20MB payload size), which necessitates an asynchronous processing model.
1.2 Business Challenge
Processing 100k+ line invoices synchronously presents several challenges:
- Timeout Issues – Extended processing times exceed typical HTTP timeout thresholds
- Resource Contention – Long-running synchronous requests block critical resources
- Customer Experience – Clients waiting for responses face degraded user experience
- Processing Accuracy – All lines must be processed together as line interactions affect total calculations
1.3 Solution Overview
The proposed solution introduces a new asynchronous endpoint POST /api/defer that:
- Accepts large invoice payloads (100k+ lines)
- Processes transactions asynchronously in the background
- Notifies customers of completion via Hub publish-subscribe mechanism
- Operates seamlessly across both AWS and OCI environments
- Eliminates the need for status polling databases
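The accept-and-enqueue behavior described above can be sketched in a few lines. This is a minimal, hypothetical illustration only; the names (`accept_defer_request`, `job_queue`, `MAX_LINES`) and the in-memory queue are assumptions standing in for the real SQS/OCI Queue integration.

```python
import uuid
from queue import Queue

MAX_LINES = 200_000  # illustrative validation bound, not a spec value
job_queue = Queue()  # stand-in for SQS / OCI Queue

def accept_defer_request(payload: dict) -> tuple[int, dict]:
    """Validate the payload, enqueue the job, and build a 202 response body."""
    lines = payload.get("invoice", {}).get("lines", [])
    if not lines or len(lines) > MAX_LINES:
        return 400, {"error": "invoice.lines missing or too large"}
    job_id = f"job-{uuid.uuid4()}"
    job_queue.put({"jobId": job_id, "payload": payload})
    return 202, {
        "jobId": job_id,
        "status": "QUEUED",
        "statusCheckUrl": f"https://api.engine.com/api/defer/{job_id}",
    }
```

The key property is that the caller gets an immediate 202 with a job ID while all heavy work happens behind the queue.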
2. Architecture Overview
2.1 High-Level Architecture
The asynchronous processing architecture integrates the existing Processing System with Hub to enable event-driven communication:

Figure 1.1 High Level Architecture
2.2 Component Breakdown
| Component | Technology | Purpose |
|---|---|---|
| API Gateway | Apigee | OAuth authentication, rate limiting, routing |
| Message Queue | AWS SQS / OCI Queue | Decouples request acceptance from processing |
| ML Service | Python / TensorFlow / scikit-learn | Job prioritization, anomaly detection, predictive scaling |
| Async Processor | Spring Boot Worker | Processes queued jobs, invokes processing engine |
| Processing Engine | Java 21 / Spring Boot | Core processing logic |
| Hub | Azure Event Grid | Pub-sub messaging for completion events |
| Database | PostgreSQL / Oracle | Configuration and content data |
| Container Platform | ECS (AWS) / OKE (OCI) | Auto-scaling compute infrastructure |
Table 1.1 Component Breakdown
3. Detailed Processing Flow
3.1 Asynchronous Processing Flow

Figure 1.2 Processing Flow
3.2 Hub Integration

Figure 1.3 Hub Integration
4. Technical Implementation Details
4.1 API Endpoint Specification
Endpoint: POST /api/defer
Request Headers:
```
Authorization: Bearer <OAuth-Token>
Content-Type: application/json
X-Request-ID: <UUID>
```
Snippet 1.1 Request Headers
Request Body:
```json
{
  "transactionType": "PROCESS",
  "invoice": {
    "documentCode": "INV-2026-001234",
    "documentDate": "2026-01-15",
    "customerCode": "CUST-XYZ",
    "lines": [
      {
        "lineNumber": 1,
        "itemCode": "PROD-001",
        "quantity": 100,
        "amount": 5000.00,
        "originAddress": { … },
        "destinationAddress": { … }
      }
      // … 99,999 more lines
    ]
  },
  "callbackUrl": "https://customer.com/webhooks/results"
}
```
Snippet 1.2 Request Body
Response (202 Accepted):
```json
{
  "jobId": "job-uuid-12345",
  "status": "QUEUED",
  "estimatedCompletionTime": "2026-01-15T14:35:00Z",
  "statusCheckUrl": "https://api.engine.com/api/defer/job-uuid-12345"
}
```
Snippet 1.3 Response
4.2 Hub Event Schema
Event Type: processing.complete
Event Payload:
```json
{
  "eventId": "evt-uuid-67890",
  "eventType": "processing.complete",
  "timestamp": "2026-01-15T14:32:15Z",
  "jobId": "job-uuid-12345",
  "status": "SUCCESS",
  "data": {
    "documentCode": "INV-2026-001234",
    "totalAmount": 5012456.78,
    "processingTimeMs": 45000,
    "linesProcessed": 100000,
    "resultUrl": "https://api.engine.com/api/defer/job-uuid-12345/result"
  }
}
```
Snippet 1.4 Payload
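As a rough sketch of how the Async Processor might assemble this event before publishing it to Hub, the helper below mirrors the field names in Snippet 1.4. The function name `build_completion_event` and the `result` dict shape are illustrative assumptions, not the actual implementation.

```python
import uuid
from datetime import datetime, timezone

def build_completion_event(job_id: str, result: dict) -> dict:
    """Assemble a processing.complete event payload (field names per Snippet 1.4)."""
    return {
        "eventId": f"evt-{uuid.uuid4()}",
        "eventType": "processing.complete",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "jobId": job_id,
        "status": "SUCCESS",
        "data": {
            "documentCode": result["documentCode"],
            "totalAmount": result["totalAmount"],
            "processingTimeMs": result["processingTimeMs"],
            "linesProcessed": result["linesProcessed"],
            "resultUrl": f"https://api.engine.com/api/defer/{job_id}/result",
        },
    }
```

Generating the `eventId` and `timestamp` at publish time keeps events idempotently traceable even if the same job is retried.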
4.3 Multi-Cloud Deployment Strategy
| Component | AWS Implementation | OCI Implementation |
|---|---|---|
| Compute | ECS with Fargate | OKE (Kubernetes) |
| Message Queue | Amazon SQS | OCI Queue Service |
| ML Service | SageMaker / ECS | OCI Data Science / OKE |
| Database | Amazon RDS (PostgreSQL) | Oracle Autonomous Database |
| Auto-Scaling | ECS Service Auto-Scaling | HPA (Horizontal Pod Autoscaler) |
| Networking | VPC, ALB | VCN, OCI Load Balancer |
| Monitoring | CloudWatch | OCI Monitoring |
Table 1.2 Cloud Deployment Strategy
5. Key Design Decisions
5.1 Why Asynchronous Processing?
- Scalability – Decoupling request acceptance from processing allows independent scaling
- Resilience – Queue-based architecture provides retry capability and fault tolerance
- Resource Optimization – Avoid thread blocking during long-running calculations
- User Experience – Immediate acknowledgment prevents client timeout issues
5.2 Why Hub (Not Redis)?
Per requirements, Redis is explicitly excluded. Hub provides:
- Managed Service – No infrastructure maintenance required
- Multi-Cloud Support – Accessible from both AWS and OCI
- Webhook Delivery – Native support for HTTP callbacks
- Event Persistence – Guaranteed delivery with retry mechanisms
- External Access – Customers outside TR network can subscribe
- No Status Database Needed – Pub-sub eliminates polling requirement
5.3 Atomic Processing Requirement
All 100k lines must be processed together because:
- Processing operations have interdependencies between line items
- Calculations aggregate across lines
- Business rules apply at invoice level
- Results may vary based on total transaction value
Implication: No batch splitting; the entire invoice is processed as a single unit.
6. AI/ML-Enhanced Capabilities
The architecture integrates machine learning to optimize performance, detect anomalies, and improve system intelligence:
6.1 Intelligent Job Prioritization
Objective: Optimize queue processing order based on predicted complexity and customer SLAs.
ML Model: Gradient Boosting Regressor trained on historical job metadata:
- Input Features: Line count, payload size, customer tier, time of day, product types
- Output: Predicted processing time (seconds)
- Training Data: 6+ months of completed job metrics
Benefits:
- High-priority customers processed first
- Short jobs avoid blocking behind long-running jobs
- 25-40% improvement in average wait time
Implementation:
```python
# Simplified ML prioritization logic
def calculate_priority_score(job):
    predicted_time = ml_model.predict(job.features)
    sla_urgency = get_customer_sla_weight(job.customer_id)
    return (sla_urgency * 100) / predicted_time
```
Snippet 1.5 Implementation
6.2 Anomaly Detection
Objective: Identify suspicious or malformed transactions before expensive processing.
ML Model: Isolation Forest for unsupervised anomaly detection:
Detection Criteria:
- Unusual line-item patterns
- Abnormal amount distributions
- Suspicious geographic patterns
- Payload structure deviations
Action Workflow:
- ML service scores incoming job (0-100 anomaly score)
- Score > 80: Flag for manual review queue
- Score 50-80: Process with enhanced logging
- Score < 50: Normal processing
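The score-based routing above is a simple threshold dispatch. A minimal sketch, using the thresholds from the workflow; `route_by_anomaly_score` is a hypothetical helper name:

```python
def route_by_anomaly_score(score: float) -> str:
    """Map a 0-100 anomaly score to a processing route per the workflow above."""
    if score > 80:
        return "MANUAL_REVIEW"      # flag for manual review queue
    if score >= 50:
        return "ENHANCED_LOGGING"   # process, but with enhanced logging
    return "NORMAL"                 # normal processing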
Benefits:
- Prevent processing of corrupted/malicious data
- Reduce wasted compute resources
- Early fraud detection capabilities
6.3 Predictive Auto-Scaling
Objective: Proactively scale resources ahead of demand spikes.
ML Model: LSTM (Long Short-Term Memory) neural network for time-series forecasting:
- Input Features: Historical queue depth, time patterns, seasonal trends
- Output: Predicted job volume for next 15-60 minutes
- Retraining: Weekly with latest patterns
Scaling Logic:
```python
if predicted_volume > current_capacity * 0.7:
    scale_up_workers(target=predicted_volume / avg_throughput)
elif predicted_volume < current_capacity * 0.3:
    scale_down_workers(target=predicted_volume / avg_throughput)
```
Snippet 1.6 Scaling Logic
Benefits:
- 60% faster response to demand spikes vs. reactive scaling
- Reduced cold-start delays
- Cost optimization through predictive scale-down
6.4 Processing Optimization Insights
Objective: Continuously learn optimal processing strategies from execution patterns.
ML Approach: Reinforcement learning to optimize:
- Database connection pool sizing
- Batch processing chunk sizes
- Memory allocation strategies
- Parallel processing thread counts
Feedback Loop:
- Processor reports: job characteristics → chosen strategy → execution time
- ML service analyzes: which strategies perform best for which job types
- Model recommends: optimal configuration for incoming jobs
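One lightweight way to realize this feedback loop is a multi-armed bandit rather than full reinforcement learning: occasionally explore alternative configurations, otherwise exploit the one with the best observed execution time. The sketch below is an illustrative simplification under that assumption; `StrategySelector` and the strategy names are hypothetical.

```python
import random

class StrategySelector:
    """Epsilon-greedy selection over candidate processing configurations."""

    def __init__(self, strategies, epsilon=0.1):
        self.epsilon = epsilon
        self.stats = {s: {"runs": 0, "total_ms": 0.0} for s in strategies}

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(list(self.stats))  # explore
        untried = [s for s, st in self.stats.items() if st["runs"] == 0]
        if untried:
            return untried[0]  # try each strategy at least once
        # exploit: lowest average execution time so far
        return min(self.stats,
                   key=lambda s: self.stats[s]["total_ms"] / self.stats[s]["runs"])

    def report(self, strategy, execution_ms):
        """Feed back the observed execution time for a completed job."""
        self.stats[strategy]["runs"] += 1
        self.stats[strategy]["total_ms"] += execution_ms
```

Per-job-type selectors (e.g. keyed by line-count bucket) would let the system learn different optimal configurations for different workload shapes.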
Benefits:
- Self-tuning performance optimization
- Automatic adaptation to changing workload patterns
- 15-30% processing time improvements
6.5 ML Service Architecture

Figure 1.4 ML Service Architecture
6.6 Model Monitoring and Retraining
Continuous Improvement Pipeline:
- Performance Tracking: Monitor prediction accuracy vs. actual outcomes
- Drift Detection: Identify when model performance degrades (>10% accuracy drop)
- Automated Retraining: Trigger weekly retraining with latest 90 days of data
- A/B Testing: Deploy new models to 10% of traffic, validate before full rollout
- Rollback Capability: Instant revert to previous model if issues detected
Metrics Dashboard:
- Prioritization accuracy: Target >85%
- Anomaly detection precision: Target >90%
- Forecast MAPE (Mean Absolute Percentage Error): Target <15%
- Processing optimization impact: Target 20%+ improvement
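For reference, the forecast metric above can be computed as follows; a small sketch, skipping zero actuals to avoid division by zero:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)
```

A weekly MAPE above the 15% target would be one trigger for the drift-detection and retraining pipeline described in section 6.6.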
7. Performance and Scalability Considerations
7.1 Expected Performance Metrics
| Metric | Target |
|---|---|
| Payload Size | Up to 20MB (100k lines) |
| Processing Time | 30-90 seconds per job |
| Concurrent Jobs | 50+ simultaneous |
| Throughput | 5,000+ jobs/hour |
| Event Delivery | < 5 seconds after completion |
Table 1.3 Performance Metrics
7.2 Scaling Strategy
- ML-Driven Predictive Scaling – Scale proactively based on forecasted demand (60% faster than reactive)
- Horizontal Scaling – Auto-scale processor workers based on queue depth
- Intelligent Load Distribution – ML prioritization ensures optimal resource utilization
- Message Queue – SQS/OCI Queue handles burst traffic automatically
- Database Connection Pooling – Reuse connections across processor instances
- Processing Engine Optimization – ML-recommended configurations for different workload types
Conclusion
The asynchronous API solution addresses the critical business need for processing large-scale transactions while maintaining the accuracy and reliability customers expect from the platform. By leveraging Hub’s publish-subscribe architecture combined with AI/ML intelligence, the solution eliminates the complexity of status polling databases and provides a modern, event-driven integration pattern with self-optimizing capabilities.
This design ensures:
- Scalability across multi-cloud environments (AWS and OCI)
- Intelligence through ML-driven job prioritization, anomaly detection, and predictive scaling
- Reliability through queue-based processing and event-driven notifications
- Performance optimized for 100k+ line invoice processing with continuous ML optimization
- Simplicity for customers through webhook-based result delivery
- Proactive Operations with predictive scaling and automated performance tuning
The ML-enhanced implementation positions the Processing Engine to serve enterprise customers’ most demanding transaction processing requirements while continuously improving through learned insights and automated optimization.
Appendix: Customer Integration Example
Customers integrate by:
- Registering Webhook with Hub
- Submitting Request to /api/defer
- Receiving Job ID immediately
- Getting Notified via webhook when complete
Sample Customer Code (Python):
```python
import requests
from flask import Flask, request

app = Flask(__name__)

# Submit async request (invoice_data is the customer's large invoice payload)
response = requests.post(
    'https://api.engine.com/api/defer',
    headers={'Authorization': 'Bearer token'},
    json={'invoice': invoice_data, 'callbackUrl': 'https://myapp.com/webhook'}
)
job_id = response.json()['jobId']
print(f"Job submitted: {job_id}")

# Webhook endpoint receives result
@app.route('/webhook', methods=['POST'])
def receive_results():
    event = request.json
    if event['status'] == 'SUCCESS':
        results = event['data']
        # Process results
    return '', 200
```
Snippet 1.7 Customer Code

