Logging large values to rerun.set_time_seconds() causes hang/crash #2582

@Nicholas-Autio-Mitchell

Describe the bug

Using the Python SDK, setting a very large time value with rr.set_time_seconds() before logging an entity causes the GUI to hang and then crash. The process itself never exits.

I am logging various sensor data to the same stream via rerun.log_points(), rerun.log_image(), etc.

For each call, I first set the time of that recorded sensor data.

  • Using a "frame ID" works fine e.g. rr.set_time_sequence("frame_id", step).
  • I tried logging my dataset's timestamps as nanoseconds: rr.set_time_nanos("timestamp", timestamp). This wasn't correct for my case.
  • I instead logged timestamps as seconds: rr.set_time_seconds("timestamp", timestamp). This causes the hang/crash.

[sidenote: it turned out my dataset uses a custom time reference: nanoseconds since something domain-specific]
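For a dataset like this, one option is to convert the custom-reference nanoseconds to seconds before passing them to rr.set_time_seconds(). A minimal sketch, assuming a hypothetical helper name `custom_nanos_to_seconds` and a hypothetical `reference_offset_ns` parameter (neither is part of the Rerun SDK):

```python
NANOS_PER_SECOND = 1_000_000_000


def custom_nanos_to_seconds(timestamp_ns: int, reference_offset_ns: int = 0) -> float:
    """Convert nanoseconds measured from a custom reference into seconds.

    `reference_offset_ns` is a hypothetical parameter: the nanosecond value
    that should map to t = 0 on the timeline.
    """
    # Split into whole seconds and leftover nanoseconds using integer math,
    # so large timestamps do not lose precision before the final addition.
    whole, frac = divmod(int(timestamp_ns) - int(reference_offset_ns), NANOS_PER_SECOND)
    return whole + frac / NANOS_PER_SECOND


# Intended use (requires rerun-sdk):
#   rr.set_time_seconds("timestamp", custom_nanos_to_seconds(timestamp))
```

This keeps the values passed as seconds at a sensible magnitude, which avoids the failure mode reported here.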

To Reproduce
Reproduced on the clock example: examples/python/clock/main.py. Apply the diff below, which scales the step up by 1e9 (i.e. to nanosecond-scale values) while still logging it as seconds:

@ examples/python/clock/main.py:46 @ def log_clock(steps: int) -> None:
    for step in range(steps):
        t_secs = step

-        rr.set_time_seconds("sim_time", t_secs)
+        rr.set_time_seconds("seconds", t_secs * 1e9)        # causes the app to hang
+        # rr.set_time_nanos("nano seconds", t_secs * 1e9)   # this works fine
  1. The full process:
pip install -U rerun-sdk  # get 0.7.0 at time of writing
git clone git@github.com:rerun-io/rerun.git  # Clone the repository
cd rerun/
git checkout 0.7.0
# code -n examples/python/clock/main.py  # open file to apply the diff above
python examples/python/clock/main.py  # opens the GUI

Interestingly, the logging itself takes a really long time, suggesting some type checks/conversions are slowing things down?

  2. In the GUI, switch the timeline to "seconds". This causes the app to hang.

  3. Wait ~7 minutes. The GUI app eventually crashes and closes, but the process does not exit: python -m rerun --port 9876 is still visible in htop, chugging along doing something (for me CPU% floats around ~25%). Watching htop as I switch to "seconds" in the GUI to trigger the failure, I notice that memory usage spikes until it fills my memory, clears, spikes again, then clears once more and becomes stable.

Playing with datatypes on the Python side had no effect. E.g. rr.set_time_seconds("seconds", float(t_secs * 1e9)) ends with the same result. I haven't started poking around in the bindings/Rust code.

Expected behavior

An error, or perhaps a sane conversion? The docs already speak of some assumptions on timestamps based on their magnitude.
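As an illustration of the "error" option, a caller-side guard could reject implausibly large values before they reach rr.set_time_seconds(). This is only a sketch: `checked_seconds` and the threshold `MAX_PLAUSIBLE_SECONDS` are hypothetical names and values chosen for the example, not part of the Rerun SDK or its documented heuristics.

```python
import math

# Hypothetical threshold: 1e11 seconds is roughly 3,000 years, far beyond
# any realistic "seconds" timestamp, so larger values are probably
# milliseconds/microseconds/nanoseconds passed by mistake.
MAX_PLAUSIBLE_SECONDS = 1e11


def checked_seconds(t_secs: float, limit: float = MAX_PLAUSIBLE_SECONDS) -> float:
    """Return t_secs as a float, or raise ValueError if it looks mis-scaled."""
    t = float(t_secs)
    if not math.isfinite(t):
        raise ValueError(f"time value must be finite, got {t!r}")
    if abs(t) > limit:
        raise ValueError(
            f"{t!r} s exceeds {limit!r} s; is this value really in seconds?"
        )
    return t


# Intended use (requires rerun-sdk):
#   rr.set_time_seconds("seconds", checked_seconds(t_secs))
```

With the diff above, `checked_seconds(t_secs * 1e9)` would raise as soon as `step` exceeds 100, instead of letting the viewer hang.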

Backtrace

No backtrace is given.

Desktop (please complete the following information):

  • OS: macOS Ventura 13.4.1

Rerun version

rerun_py 0.7.0 [rustc 1.69.0 (84c898d65 2023-04-16), LLVM 15.0.7] aarch64-apple-darwin prepare-0.7 9cf3033, built 2023-06-16T15:39:37Z

Labels

💣 crash (crash, deadlock/freeze, do-no-start), 🪳 bug (Something isn't working)