Add throughput testing infrastructure and fix Python telemetry performance#2742

Merged
jmthomas merged 15 commits into main from cmd_tlm_test
Jan 29, 2026

Conversation

@jmthomas
Member

@jmthomas jmthomas commented Jan 18, 2026

Summary

This PR adds comprehensive throughput testing infrastructure and fixes a critical Python telemetry performance bottleneck, improving Python throughput by 10x.

Key Changes

  • Fix Python telemetry throughput bottleneck - JSONPath caching in JsonAccessor improves throughput from ~320 Hz to 3,545 Hz
  • Add throughput testing server - Standalone TCP/IP server for measuring COSMOS command/telemetry throughput
  • Add fire-and-forget command mode - Skip ACK waiting when timeout <= 0 for high-throughput scenarios
  • Add packet caching in TargetModel - Thread-safe caching with 10-second timeout reduces Redis lookups
  • Add UPDATE_INTERVAL tests - Verify queued writes functionality in both Ruby and Python

Performance Results

| Metric           | Before    | After     | Improvement |
|------------------|-----------|-----------|-------------|
| Python telemetry | ~320 Hz   | 3,545 Hz  | 10x         |
| Ruby telemetry   | ~2,700 Hz | ~2,700 Hz | baseline    |

Python now outperforms Ruby at high telemetry rates while maintaining zero packet loss.

Root Cause Analysis

The Python performance issue was caused by jsonpath_ng.parse() recompiling JSONPath expressions on every call (~2.7ms per call). When identifying packets in unique_id_mode, this caused ~5.7ms overhead per packet. Adding lru_cache to cache parsed expressions reduced this to ~0.12µs (23,000x speedup per call).
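
The fix is essentially memoizing the parse step. Here is a minimal sketch of the pattern, using a stand-in for `jsonpath_ng.parse` (the helper names are illustrative, not the actual `JsonAccessor` code):

```python
from functools import lru_cache

# Stand-in for jsonpath_ng.parse, which recompiles the expression on
# every call (~2.7 ms each in the measured case). Any deterministic,
# hashable-argument function can be cached the same way.
def expensive_parse(path: str):
    return tuple(path.split("."))  # placeholder for the compiled expression

@lru_cache(maxsize=128)
def parse_jsonpath(path: str):
    # First call per unique path pays the full parse cost; every repeat
    # is served from the cache at dictionary-lookup speed.
    return expensive_parse(path)

for _ in range(1000):
    parse_jsonpath("$.packet.id_value")

info = parse_jsonpath.cache_info()
print(info.hits, info.misses)  # 999 hits, 1 miss
```

Because packet identification re-parses the same small set of JSONPath expressions on every packet, even a small `maxsize` gives a near-100% hit rate.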

Files Changed

Performance Fixes:

  • openc3/python/openc3/accessors/json_accessor.py - Add JSONPath caching with lru_cache
  • openc3/python/openc3/microservices/interface_microservice.py - Fix missing self.queued initialization
  • openc3/python/pyproject.toml - Add orjson as optional dependency

Throughput Testing Infrastructure:

  • examples/throughput_server/ - New standalone throughput testing server
  • openc3-cosmos-demo/targets/INST/procedures/throughput_test.rb - Ruby throughput test
  • openc3-cosmos-demo/targets/INST2/procedures/throughput_test.py - Python throughput test
  • openc3-cosmos-demo/targets/*/screens/throughput.txt - Throughput monitoring screens

Command/Telemetry Optimizations:

  • openc3/lib/openc3/topics/command_topic.rb - Fire-and-forget mode
  • openc3/python/openc3/topics/command_topic.py - Fire-and-forget mode
  • openc3/lib/openc3/models/target_model.rb - Packet caching
  • openc3/python/openc3/models/target_model.py - Packet caching
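
The fire-and-forget convention (timeout <= 0 skips ACK waiting) can be sketched roughly as follows; `write_fn` and `ack_queue` are hypothetical stand-ins, not the actual `CommandTopic` API:

```python
import queue

ACK_TIMEOUT_DEFAULT = 5.0

def send_command(write_fn, ack_queue, command, timeout=ACK_TIMEOUT_DEFAULT):
    """Write the command, then wait for an ACK only when timeout is
    positive. timeout <= 0 means fire-and-forget (no ACK wait)."""
    write_fn(command)
    if timeout is not None and timeout <= 0:
        return None  # fire-and-forget: skip the round-trip entirely
    try:
        return ack_queue.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"No ACK within {timeout}s for {command!r}")
```

In burst scenarios the ACK round-trip dominates per-command latency, which is why skipping it raises command throughput.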

Test Coverage:

  • openc3/spec/microservices/interface_microservice_spec.rb - UPDATE_INTERVAL test
  • openc3/python/test/microservices/test_interface_microservice.py - UPDATE_INTERVAL tests
  • openc3/spec/models/target_model_spec.rb - Packet caching tests
  • openc3/python/test/models/test_target_model.py - Packet caching tests

Test plan

  • All Python protocol tests pass (211 tests)
  • Ruby interface_microservice tests pass (15 tests)
  • Python interface_microservice tests pass (10 tests)
  • Throughput tests verified with throughput_server
  • Manual testing with DEMO plugin

🤖 Generated with Claude Code

jmthomas and others added 6 commits January 16, 2026 12:00
Throughput Server (examples/throughput_server/):
- Standalone TCP/IP server for measuring COSMOS command/telemetry throughput
- Dual-port operation for INST (7778) and INST2 (7780) targets
- CCSDS packet encoding/decoding with configurable streaming rates
- Time-compensated streaming to maintain accurate rates up to 100kHz
- Pre-allocated buffers for minimal allocation in hot paths
- Raw TCP rate test scripts (Ruby/Python) achieving ~300-500k cmd/s

DEMO Plugin Changes:
- Add THROUGHPUT_STATUS telemetry packet with rate/count metrics
- Add throughput commands: START_STREAM, STOP_STREAM, GET_STATS, RESET_STATS
- Add throughput_test procedures for INST (Ruby) and INST2 (Python)
- Add throughput screen for real-time monitoring
- Add plugin variables to toggle between simulator and throughput server
- Configure LengthProtocol for CCSDS packet framing when using throughput server

Co-Authored-By: Claude Opus 4.5 <[email protected]>
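The time-compensated streaming mentioned above can be sketched like this: sleep until an absolute deadline rather than for a fixed interval, so per-iteration overhead does not accumulate as drift (a hypothetical helper, not the actual server code):

```python
import time

def stream(send, rate_hz: float, count: int):
    """Send `count` packets at `rate_hz`, scheduling each send against
    an absolute deadline derived from the start time. A naive
    sleep(period) loop drifts because send/loop overhead adds to every
    period; deadline-based sleeping self-corrects."""
    period = 1.0 / rate_hz
    start = time.perf_counter()
    for i in range(count):
        send(i)
        deadline = start + (i + 1) * period
        delay = deadline - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
```

At very high rates (toward 100 kHz) `time.sleep` granularity becomes the limit and a busy-wait on the deadline would be needed instead.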
- Add fire-and-forget mode to CommandTopic.send_command when timeout <= 0
  to skip ACK waiting for high-throughput command scenarios
- Add thread-safe packet caching in TargetModel with 10-second timeout
  to reduce Redis lookups for repeated packet access
- Cache is automatically invalidated when set_packet is called
- Add unit tests for packet caching in both Ruby and Python

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Root cause: jsonpath_ng.parse() was recompiling JSONPath expressions on
every call, taking ~2.7ms per call. When identifying packets in
unique_id_mode (required for INST2 due to mixed CCSDS/JSON packet types),
this caused ~5.7ms overhead per packet, limiting throughput to ~320 Hz.

Changes:
- Add lru_cache to JSONPath parsing in JsonAccessor (103x speedup)
- Add orjson as optional dependency for faster JSON parsing
- Fix missing self.queued initialization in Python interface_microservice
- Update throughput test scripts for both Ruby and Python

Results:
- Python telemetry: 320 Hz → 3,545 Hz (10x improvement)
- Python now outperforms Ruby at high rates (3,545 Hz vs 2,627 Hz)
- Zero packet loss maintained at all tested rates

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Revert the bytearray optimization in Python protocols to maintain
backward compatibility for custom protocol implementations. The change
from bytes to bytearray could break user code that:
- Type checks self.data expecting bytes
- Relies on immutability of self.data
- Uses bytes-specific operations

The JSONPath caching fix (the real 10x performance improvement) remains
intact.

Also adds tests for UPDATE_INTERVAL option in both Ruby and Python
interface_microservice to verify the queued writes functionality.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@codecov

codecov bot commented Jan 18, 2026

Codecov Report

❌ Patch coverage is 88.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.69%. Comparing base (0647fab) to head (7f83f8e).
⚠️ Report is 17 commits behind head on main.

Files with missing lines Patch % Lines
openc3/lib/openc3/topics/command_topic.rb 50.00% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2742   +/-   ##
=======================================
  Coverage   78.68%   78.69%           
=======================================
  Files         671      671           
  Lines       54738    54796   +58     
  Branches      731      731           
=======================================
+ Hits        43072    43122   +50     
- Misses      11586    11594    +8     
  Partials       80       80           
Flag Coverage Δ
python 80.32% <ø> (+<0.01%) ⬆️
ruby-api 82.68% <ø> (ø)
ruby-backend 81.80% <88.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

```diff
 @interface.options.each do |option_name, option_values|
-  if option_name.upcase == 'OPTIMIZE_THROUGHPUT'
+  # OPTIMIZE_THROUGHPUT was changed to UPDATE_INTERVAL to better represent the setting
+  if option_name.upcase == 'UPDATE_INTERVAL' or option_name.upcase == 'OPTIMIZE_THROUGHPUT'
```
Member Author


This was just a bug ... missing keyword that we already changed in Python and in the docs

```diff
 while True:  # Loop until we get some data
     try:
-        data = self.read_socket.recv(4096, socket.MSG_DONTWAIT)
+        data = self.read_socket.recv(65535, socket.MSG_DONTWAIT)
```
Member Author


Not sure how much of an optimization this is, but it matches Ruby.

@jmthomas
Member Author

jmthomas commented Jan 18, 2026

Results calculated on my MacBook Pro M3 Max with 36GB RAM, with a Docker CPU limit of 10 and a memory limit of 16GB.

Ruby results:

2026/01/18 02:35:28.545 (throughput_test.rb:183): Command Throughput:
2026/01/18 02:35:28.545 (throughput_test.rb:184):   100 cmd burst:  442.5 cmd/s
2026/01/18 02:35:28.546 (throughput_test.rb:185):   500 cmd burst:  437.4 cmd/s
2026/01/18 02:35:28.546 (throughput_test.rb:186):   1000 cmd burst: 421.2 cmd/s
2026/01/18 02:35:28.547 (throughput_test.rb:188): 
2026/01/18 02:35:28.547 (throughput_test.rb:188): Telemetry Throughput:
2026/01/18 02:35:28.547 (throughput_test.rb:189):   100 Hz target:   99.0 Hz (0.0% loss)
2026/01/18 02:35:28.547 (throughput_test.rb:190):   1000 Hz target:  977.8 Hz (0.0% loss)
2026/01/18 02:35:28.547 (throughput_test.rb:191):   2000 Hz target:  1979.6 Hz (0.0% loss)
2026/01/18 02:35:28.547 (throughput_test.rb:192):   3000 Hz target:  2762.2 Hz (0% loss)
2026/01/18 02:35:28.548 (throughput_test.rb:193):   4000 Hz target:  2626.8 Hz (0% loss)

Python results:

2026-01-18T02:44:32.696048Z (throughput_test.py:191): Command Throughput:
2026-01-18T02:44:32.696222Z (throughput_test.py:192):   100 cmd burst:  586.0 cmd/s
2026-01-18T02:44:32.696430Z (throughput_test.py:193):   500 cmd burst:  623.3 cmd/s
2026-01-18T02:44:32.696663Z (throughput_test.py:194):   1000 cmd burst: 541.6 cmd/s
2026-01-18T02:44:32.696972Z (throughput_test.py:196): 
2026-01-18T02:44:32.696972Z (throughput_test.py:196): Telemetry Throughput:
2026-01-18T02:44:32.697271Z (throughput_test.py:197):   100 Hz target:   98.8 Hz (0% loss)
2026-01-18T02:44:32.697677Z (throughput_test.py:200):   1000 Hz target:  984.4 Hz (0% loss)
2026-01-18T02:44:32.697970Z (throughput_test.py:203):   2000 Hz target:  1978.2 Hz (0% loss)
2026-01-18T02:44:32.698207Z (throughput_test.py:206):   3000 Hz target:  2955.0 Hz (0% loss)
2026-01-18T02:44:32.699203Z (throughput_test.py:209):   4000 Hz target:  3460.0 Hz (0% loss)

@jmthomas jmthomas requested review from clayandgen and ryanmelt and removed request for ryanmelt January 20, 2026 15:32
```python
# This provides a ~1000x speedup for repeated accesses (2.7ms -> 2.7µs)
@lru_cache(maxsize=256)
def _parse_jsonpath(path):
    return parse(path)
```
Member Author


This was the huge win for Python telemetry performance ... it only affects JSON and CBOR but it affects everything if you have at least 1 JSON or CBOR packet defined.

@jmthomas
Member Author

The packet caching I added to target_model has a measurable impact on commanding. Python sees a 30-80% improvement and Ruby shows a 15-20% improvement. The caching optimization primarily benefits command operations where repeated packet lookups occur during burst sending.
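
A minimal sketch of a thread-safe TTL cache of the kind described (the class and method names here are hypothetical, not the actual TargetModel implementation):

```python
import threading
import time

class PacketCache:
    """Cache values for up to TIMEOUT seconds; stale entries fall
    through to the fetch function (e.g. the Redis lookup)."""
    TIMEOUT = 10.0

    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}  # key -> (value, stored_at)

    def get(self, key, fetch):
        now = time.monotonic()
        with self._lock:
            entry = self._cache.get(key)
            if entry is not None and now - entry[1] < self.TIMEOUT:
                return entry[0]
        # Fetch outside the lock; a concurrent miss may double-fetch,
        # which is harmless for read-mostly data.
        value = fetch(key)
        with self._lock:
            self._cache[key] = (value, time.monotonic())
        return value

    def invalidate(self, key=None):
        """Drop one entry, or everything (e.g. when set_packet runs)."""
        with self._lock:
            if key is None:
                self._cache.clear()
            else:
                self._cache.pop(key, None)
```

Invalidating on write keeps the cache coherent, and the 10-second TTL bounds staleness for updates made outside the caching process.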

@jmthomas
Member Author

Comparing these changes (plus other previous changes) from 6.10.4 to now:

With only binary CCSDS telemetry (no JSON, CBOR, XML, HTML):

  • 2.2x - 2.6x improvement in Python command throughput
  • 1.2x - 1.3x improvement in Ruby command throughput
  • Slight regressions at the highest telemetry rates (most likely other factors)

With full telemetry (JSON, CBOR, XML, HTML):

  • 5x improvement in Python telemetry due to the JSONPath caching
  • Slight regressions at the highest telemetry rates (most likely other factors)

Contributor

@clayandgen clayandgen left a comment


Going to run it myself shortly, in the meantime, reminder to remove the cache statistics data!

Contributor

@clayandgen clayandgen left a comment


Converting the Throughput scripts to a suite could be nice for the "Test Results" formatting!

Results on an Apple M4 Max with 36GB RAM. Python commanding seems consistently higher performance than on the M3 Max; Ruby results are comparable 👍

============================================================
SUMMARY (Ruby)
============================================================
Command Throughput:
	100 cmd burst:  421.4 cmd/s
	500 cmd burst:  455.5 cmd/s
	1000 cmd burst: 486.5 cmd/s
 
Telemetry Throughput:
	100 Hz target:   100.0 Hz (0.0% loss)
	1000 Hz target:  1000.8 Hz (0.0% loss)
	2000 Hz target:  1959.8 Hz (0.0% loss)
	3000 Hz target:  2958.2 Hz (9.99% loss)
	4000 Hz target:  2940.6 Hz (0% loss)
	5000 Hz target:  2840.4 Hz (0% loss)


============================================================
SUMMARY (Python)
============================================================

Command Throughput:
	100 cmd burst:  705.4 cmd/s
	500 cmd burst:  664.8 cmd/s
	1000 cmd burst: 724.0 cmd/s

Telemetry Throughput:
	100 Hz target:   99.0 Hz (0% loss)
	1000 Hz target:  976.4 Hz (0% loss)
	2000 Hz target:  1963.6 Hz (0% loss)
	3000 Hz target:  2949.6 Hz (0% loss)
	4000 Hz target:  3895.6 Hz (0% loss)
	5000 Hz target:  3788.0 Hz (0% loss)


1. Start the throughput server:
```bash
python throughput_server.py
```
Contributor


nit: python examples/throughput_server/throughput_server.py

```ruby
packets_received = final_cosmos_count - initial_cosmos_count

# Calculate actual rate from test data (more accurate than server's TLM_SENT_RATE which is stale)
actual_rate = packets_sent.to_f / duration
```
Contributor


One observation: the Command test uses the actual system clock time (Time.now) as part of the calculation, whereas the Telemetry test uses the measured duration. The results are probably arbitrarily close, but I thought I'd note the discrepancy.
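
Either way, the derived metrics come down to the same arithmetic; a tiny sketch of the rate and loss calculation (hypothetical helper, not the actual test script):

```python
def throughput_stats(packets_sent: int, packets_received: int, duration_s: float):
    """Actual send rate (packets/s) and percentage packet loss
    for one test run."""
    actual_rate = packets_sent / duration_s
    loss_pct = 100.0 * (packets_sent - packets_received) / packets_sent
    return actual_rate, loss_pct

print(throughput_stats(1000, 990, 2.0))  # (500.0, 1.0)
```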

@sonarqubecloud

Quality Gate failed

Failed conditions
12 New issues

See analysis details on SonarQube Cloud


@jmthomas jmthomas merged commit a4ba7aa into main Jan 29, 2026
47 of 49 checks passed
@jmthomas jmthomas deleted the cmd_tlm_test branch January 29, 2026 01:42
jmthomas added a commit that referenced this pull request Mar 21, 2026
Add throughput testing infrastructure and fix Python telemetry performance
