Asynchronous API for Large-Scale Processing

Authors: Praveen Gupta, Pankaj Joshi
Date: January 16, 2026

Executive Summary

This document outlines the design and implementation strategy for an asynchronous API endpoint that enables the Processing System to process large invoice transactions containing 100,000+ line items. The solution leverages Hub as a publish-subscribe mechanism to handle asynchronous processing across multi-cloud environments (AWS and OCI), ensuring scalability, reliability, and efficient communication with external customers.

1. Introduction

1.1 Current State

The Processing System is a SaaS-based platform that processes customer transactions. Currently deployed across AWS and OCI cloud infrastructure, the engine serves customers through a REST endpoint:

/process – Synchronous processing with database persistence

The existing architecture handles up to 5,000 line items per invoice synchronously with excellent performance. However, enterprise customers require the ability to process significantly larger transactions, up to 100,000+ line items (approximately 20 MB payload size), which necessitates an asynchronous processing model.

1.2 Business Challenge

Processing 100k+ line invoices synchronously presents several challenges:

  1. Timeout Issues – Extended processing times exceed typical HTTP timeout thresholds
  2. Resource Contention – Long-running synchronous requests block critical resources
  3. Customer Experience – Clients waiting for responses face degraded user experience
  4. Processing Accuracy – All lines must be processed together as line interactions affect total calculations

1.3 Solution Overview

The proposed solution introduces a new asynchronous endpoint POST /api/defer that:

  • Accepts large invoice payloads (100k+ lines)
  • Processes transactions asynchronously in the background
  • Notifies customers of completion via Hub publish-subscribe mechanism
  • Operates seamlessly across both AWS and OCI environments
  • Eliminates the need for status polling databases

2. Architecture Overview

2.1 High-Level Architecture

The asynchronous processing architecture integrates the existing Processing System with Hub to enable event-driven communication:

Figure 1.1 High Level Architecture 

2.2 Component Breakdown

Component | Technology | Purpose
API Gateway | Apigee | OAuth authentication, rate limiting, routing
Message Queue | AWS SQS / OCI Queue | Decouples request acceptance from processing
ML Service | Python / TensorFlow / scikit-learn | Job prioritization, anomaly detection, predictive scaling
Async Processor | Spring Boot worker | Processes queued jobs, invokes processing engine
Processing Engine | Java 21 / Spring Boot | Core processing logic
Hub | Azure Event Grid | Pub-sub messaging for completion events
Database | PostgreSQL / Oracle | Configuration and content data
Container Platform | ECS (AWS) / OKE (OCI) | Auto-scaling compute infrastructure

Table 1.1 Component Breakdown

3. Detailed Processing Flow

3.1 Asynchronous Processing Flow

Figure 1.2 Processing Flow

3.2 Hub Integration

Figure 1.3 Hub Integration

4. Technical Implementation Details

4.1 API Endpoint Specification

Endpoint: POST /api/defer

Request Headers:

Authorization: Bearer <OAuth-Token>
Content-Type: application/json
X-Request-ID: <UUID>

Snippet 1.1 Request Headers

Request Body:

{
  "transactionType": "PROCESS",
  "invoice": {
    "documentCode": "INV-2026-001234",
    "documentDate": "2026-01-15",
    "customerCode": "CUST-XYZ",
    "lines": [
      {
        "lineNumber": 1,
        "itemCode": "PROD-001",
        "quantity": 100,
        "amount": 5000.00,
        "originAddress": {},
        "destinationAddress": {}
      }
      // 99,999 more lines
    ]
  },
  "callbackUrl": "https://customer.com/webhooks/results"
}

Snippet 1.2 Request Body

Response (202 Accepted):

{
  "jobId": "job-uuid-12345",
  "status": "QUEUED",
  "estimatedCompletionTime": "2026-01-15T14:35:00Z",
  "statusCheckUrl": "https://api.engine.com/api/defer/job-uuid-12345"
}

Snippet 1.3 Response

4.2 Hub Event Schema

Event Type: processing.complete

Event Payload:

{
  "eventId": "evt-uuid-67890",
  "eventType": "processing.complete",
  "timestamp": "2026-01-15T14:32:15Z",
  "jobId": "job-uuid-12345",
  "status": "SUCCESS",
  "data": {
    "documentCode": "INV-2026-001234",
    "totalAmount": 5012456.78,
    "processingTimeMs": 45000,
    "linesProcessed": 100000,
    "resultUrl": "https://api.engine.com/api/defer/job-uuid-12345/result"
  }
}

Snippet 1.4 Payload
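Subscribers should treat incoming events defensively. A minimal structural check a webhook consumer might run before acting on a processing.complete event is sketched below; the required-field list comes from Snippet 1.4, while the helper name and the decision to raise ValueError are illustrative assumptions (signature verification is out of scope here):

```python
REQUIRED_FIELDS = {"eventId", "eventType", "timestamp", "jobId", "status"}

def validate_event(event):
    """Reject events missing required fields or carrying an unexpected type.

    Field names follow the Hub event schema (Snippet 1.4); this is a
    structural sanity check only, not cryptographic verification.
    """
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"event missing fields: {sorted(missing)}")
    if event["eventType"] != "processing.complete":
        raise ValueError(f"unexpected eventType: {event['eventType']}")
    return True
```

A consumer would call this at the top of its webhook handler and return a 4xx response on failure so Hub's retry mechanism does not redeliver a malformed event indefinitely.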

4.3 Multi-Cloud Deployment Strategy

Component | AWS Implementation | OCI Implementation
Compute | ECS with Fargate | OKE (Kubernetes)
Message Queue | Amazon SQS | OCI Queue Service
ML Service | SageMaker / ECS | OCI Data Science / OKE
Database | Amazon RDS (PostgreSQL) | Oracle Autonomous Database
Auto-Scaling | ECS Service Auto-Scaling | HPA (Horizontal Pod Autoscaler)
Networking | VPC, ALB | VCN, OCI Load Balancer
Monitoring | CloudWatch | OCI Monitoring

Table 1.2 Cloud Deployment Strategy

5. Key Design Decisions

5.1 Why Asynchronous Processing?

  • Scalability – Decoupling request acceptance from processing allows independent scaling
  • Resilience – Queue-based architecture provides retry capability and fault tolerance
  • Resource Optimization – Avoid thread blocking during long-running calculations
  • User Experience – Immediate acknowledgment prevents client timeout issues

5.2 Why Hub (Not Redis)?

Per requirements, Redis is explicitly excluded. Hub provides:

  • Managed Service – No infrastructure maintenance required
  • Multi-Cloud Support – Accessible from both AWS and OCI
  • Webhook Delivery – Native support for HTTP callbacks
  • Event Persistence – Guaranteed delivery with retry mechanisms
  • External Access – Customers outside TR network can subscribe
  • No Status Database Needed – Pub-sub eliminates polling requirement

5.3 Atomic Processing Requirement

All 100k lines must be processed together because:

  • Processing operations have interdependencies between line items
  • Calculations aggregate across lines
  • Business rules apply at invoice level
  • Results may vary based on total transaction value

Implication: No batch splitting—entire invoice processed as single unit.

6. AI/ML-Enhanced Capabilities

The architecture integrates machine learning to optimize performance, detect anomalies, and improve system intelligence:

6.1 Intelligent Job Prioritization

Objective: Optimize queue processing order based on predicted complexity and customer SLAs.

ML Model: Gradient Boosting Regressor trained on historical job metadata: 

  • Input Features: Line count, payload size, customer tier, time of day, product types
  • Output: Predicted processing time (seconds) 
  • Training Data: 6+ months of completed job metrics

Benefits: 

  • High-priority customers processed first 
  • Short jobs avoid blocking behind long-running jobs 
  • 25-40% improvement in average wait time

Implementation:

# Simplified ML prioritization logic: higher score = process sooner
def calculate_priority_score(job):
    predicted_time = ml_model.predict(job.features)
    sla_urgency = get_customer_sla_weight(job.customer_id)
    # Guard against zero/near-zero predictions; favor urgent customers and short jobs
    return (sla_urgency * 100) / max(predicted_time, 1.0)

Snippet 1.5 Implementation

6.2 Anomaly Detection

Objective: Identify suspicious or malformed transactions before expensive processing.

ML Model: Isolation Forest for unsupervised anomaly detection.

Detection Criteria: 

  • Unusual line-item patterns 
  • Abnormal amount distributions 
  • Suspicious geographic patterns 
  • Payload structure deviations

Action Workflow: 

  1. ML service scores incoming job (0-100 anomaly score) 
  2. Score > 80: Flag for manual review queue 
  3. Score 50-80: Process with enhanced logging 
  4. Score < 50: Normal processing
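The scoring-and-routing workflow above can be sketched as follows. This is a simplified illustration: the feature set, the min-max rescaling of the Isolation Forest decision function to a 0-100 range, and the function names are all assumptions, not the production implementation:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def train_detector(historical_features):
    """Fit an Isolation Forest on feature vectors extracted from past jobs
    (e.g. line count, amount distribution statistics)."""
    model = IsolationForest(contamination=0.05, random_state=42)
    model.fit(historical_features)
    return model

def anomaly_score_0_100(model, features):
    """Map the model's decision function to a 0-100 score (higher = more
    anomalous). The linear rescaling here is illustrative only."""
    raw = -model.decision_function(features.reshape(1, -1))[0]
    return float(np.clip((raw + 0.5) * 100, 0.0, 100.0))

def route_job(score):
    """Apply the threshold workflow from Section 6.2."""
    if score > 80:
        return "MANUAL_REVIEW"
    elif score >= 50:
        return "ENHANCED_LOGGING"
    return "NORMAL"
```

In practice the score calibration would be validated against labeled incidents so the 50/80 thresholds correspond to meaningful false-positive rates.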

Benefits: 

  • Prevent processing of corrupted/malicious data 
  • Reduce wasted compute resources 
  • Early fraud detection capabilities

6.3 Predictive Auto-Scaling

Objective: Proactively scale resources ahead of demand spikes.

ML Model: LSTM (Long Short-Term Memory) neural network for time-series forecasting:  

  • Input Features: Historical queue depth, time patterns, seasonal trends 
  • Output: Predicted job volume for next 15-60 minutes 
  • Retraining: Weekly with latest patterns

Scaling Logic:

# Scale workers ahead of forecasted demand (70% up / 30% down thresholds)
def adjust_capacity(predicted_volume, current_capacity, avg_throughput):
    target = max(1, round(predicted_volume / avg_throughput))
    if predicted_volume > current_capacity * 0.7:
        scale_up_workers(target=target)
    elif predicted_volume < current_capacity * 0.3:
        scale_down_workers(target=target)

Snippet 1.6 Scaling Logic

Benefits: 

  • 60% faster response to demand spikes vs. reactive scaling 
  • Reduced cold-start delays 
  • Cost optimization through predictive scale-down

6.4 Processing Optimization Insights

Objective: Continuously learn optimal processing strategies from execution patterns.

ML Approach: Reinforcement learning to optimize: 

  • Database connection pool sizing 
  • Batch processing chunk sizes 
  • Memory allocation strategies 
  • Parallel processing thread counts

Feedback Loop: 

  • Processor reports: job characteristics → chosen strategy → execution time 
  • ML service analyzes: which strategies perform best for which job types 
  • Model recommends: optimal configuration for incoming jobs

Benefits: 

  • Self-tuning performance optimization 
  • Automatic adaptation to changing workload patterns 
  • 15-30% processing time improvements
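The feedback loop above can be approximated with a bandit-style selector, a deliberately simplified stand-in for the reinforcement learning approach: track average execution time per (job type, strategy) pair, exploit the historically fastest strategy, and occasionally explore. Class and strategy names are illustrative assumptions:

```python
import random
from collections import defaultdict

class StrategySelector:
    """Simplified epsilon-greedy sketch of the Section 6.4 feedback loop:
    processor reports execution times, selector recommends the strategy
    with the lowest observed average time for that job type."""

    def __init__(self, strategies, epsilon=0.1):
        self.strategies = strategies                 # e.g. chunk/thread configs
        self.epsilon = epsilon                       # exploration rate
        self.stats = defaultdict(lambda: (0, 0.0))   # (count, total_ms)

    def choose(self, job_type):
        if random.random() < self.epsilon:
            return random.choice(self.strategies)    # explore
        def avg(strategy):
            count, total = self.stats[(job_type, strategy)]
            # Untried strategies score 0.0, so each gets tried at least once
            return total / count if count else 0.0
        return min(self.strategies, key=avg)         # exploit fastest

    def report(self, job_type, strategy, execution_ms):
        count, total = self.stats[(job_type, strategy)]
        self.stats[(job_type, strategy)] = (count + 1, total + execution_ms)
```

A full RL treatment would also model state (queue depth, payload characteristics), but even this simple scheme converges to the best fixed configuration per job type.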

6.5 ML Service Architecture

Figure 1.4 ML Service Architecture

6.6 Model Monitoring and Retraining

Continuous Improvement Pipeline: 

  1. Performance Tracking: Monitor prediction accuracy vs. actual outcomes 
  2. Drift Detection: Identify when model performance degrades (>10% accuracy drop) 
  3. Automated Retraining: Trigger weekly retraining with latest 90 days of data 
  4. A/B Testing: Deploy new models to 10% of traffic, validate before full rollout 
  5. Rollback Capability: Instant revert to previous model if issues detected

Metrics Dashboard: 

  • Prioritization accuracy: Target >85% 
  • Anomaly detection precision: Target >90% 
  • Forecast MAPE (Mean Absolute Percentage Error): Target <15% 
  • Processing optimization impact: Target 20%+ improvement
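Two of the dashboard checks above are straightforward to compute. The sketch below shows the forecast MAPE metric and the >10% accuracy-drop drift rule from the retraining pipeline; function names and the relative-drop interpretation of "10% accuracy drop" are assumptions:

```python
def mape(actuals, predictions):
    """Mean Absolute Percentage Error over paired actual/predicted values.
    Zero actuals are skipped to avoid division by zero."""
    pairs = [(a, p) for a, p in zip(actuals, predictions) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

def needs_retraining(baseline_accuracy, current_accuracy, drop_threshold=0.10):
    """Drift rule from Section 6.6: flag the model for retraining when
    accuracy degrades by more than 10% relative to its baseline."""
    return (baseline_accuracy - current_accuracy) / baseline_accuracy > drop_threshold
```

In the pipeline these would run on a scheduled job comparing predictions against actual outcomes, feeding the weekly retraining trigger.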

7. Performance and Scalability Considerations

7.1 Expected Performance Metrics

Metric | Target
Payload Size | Up to 20 MB (100k lines)
Processing Time | 30-90 seconds per job
Concurrent Jobs | 50+ simultaneous
Throughput | 5,000+ jobs/hour
Event Delivery | < 5 seconds after completion

Table 1.3 Performance Metrics

7.2 Scaling Strategy

  1. ML-Driven Predictive Scaling – Scale proactively based on forecasted demand (60% faster than reactive)
  2. Horizontal Scaling – Auto-scale processor workers based on queue depth
  3. Intelligent Load Distribution – ML prioritization ensures optimal resource utilization
  4. Message Queue – SQS/OCI Queue handles burst traffic automatically
  5. Database Connection Pooling – Reuse connections across processor instances
  6. Processing Engine Optimization – ML-recommended configurations for different workload types
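As a concrete illustration of item 2 above, a queue-depth heuristic for sizing the worker fleet might look like the following; the one-hour drain target and the min/max bounds are assumptions for the sketch:

```python
def desired_worker_count(queue_depth, jobs_per_worker_per_hour,
                         min_workers=2, max_workers=50):
    """Size the worker fleet to drain the current backlog within one hour,
    clamped to operational bounds. Inputs: current queue depth and the
    measured per-worker throughput."""
    needed = -(-queue_depth // jobs_per_worker_per_hour)  # ceiling division
    return max(min_workers, min(max_workers, needed))
```

The ML-driven predictive path (item 1) would feed a forecasted queue depth into the same function instead of the observed one, so reactive and predictive scaling share a single sizing rule.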

Conclusion

The asynchronous API solution addresses the critical business need for processing large-scale transactions while maintaining the accuracy and reliability customers expect from the platform. By leveraging Hub’s publish-subscribe architecture combined with AI/ML intelligence, the solution eliminates the complexity of status polling databases and provides a modern, event-driven integration pattern with self-optimizing capabilities.

This design ensures: 

  • Scalability across multi-cloud environments (AWS and OCI) 
  • Intelligence through ML-driven job prioritization, anomaly detection, and predictive scaling 
  • Reliability through queue-based processing and event-driven notifications 
  • Performance optimized for 100k+ line invoice processing with continuous ML optimization 
  • Simplicity for customers through webhook-based result delivery 
  • Proactive Operations with predictive scaling and automated performance tuning

The ML-enhanced implementation positions the Processing Engine to serve enterprise customers’ most demanding transaction processing requirements while continuously improving through learned insights and automated optimization.

Appendix: Customer Integration Example

Customers integrate by:

  1. Registering Webhook with Hub
  2. Submitting Request to /api/defer
  3. Receiving Job ID immediately
  4. Getting Notified via webhook when complete

Sample Customer Code (Python):

import requests
from flask import Flask, request

app = Flask(__name__)

# Submit async request
response = requests.post(
    'https://api.engine.com/api/defer',
    headers={'Authorization': 'Bearer token'},
    json={'invoice': invoice_data, 'callbackUrl': 'https://myapp.com/webhook'}
)

job_id = response.json()['jobId']
print(f"Job submitted: {job_id}")

# Webhook endpoint receives result
@app.route('/webhook', methods=['POST'])
def receive_results():
    event = request.json
    if event['status'] == 'SUCCESS':
        results = event['data']
        # Process results
    return '', 200

Snippet 1.7 Customer Code
