forked from TRI-ML/prismatic-vlms
Closed
Description
Hi, guys. Thanks for your great open-source work. When I use your RLDS dataset built on tf.data.Dataset, I found a memory leak when train=True. Specifically, when train=True and shuffle is called without cache, memory gradually increases as iteration proceeds.
`dataset = dataset.shuffle(shuffle_buffer_size)`
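For reference, a minimal sketch of how the growth can be observed, assuming an eager-mode tf.data pipeline; the function name, `steps`, and `every` are placeholders I chose for illustration, not part of the repo's API:

```python
import psutil
import tensorflow as tf

def log_rss_during_iteration(dataset: tf.data.Dataset, steps: int = 1000, every: int = 100) -> None:
    # Iterate the dataset and print the process's resident memory so the
    # gradual growth (if any) is visible step by step.
    proc = psutil.Process()
    for i, _ in enumerate(dataset.take(steps)):
        if i % every == 0:
            print(f"step {i}: RSS = {proc.memory_info().rss / 1e9:.2f} GB")
```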
However, I don't understand why this happens. In my understanding, shuffle preloads a fixed number of elements before iteration and then replaces each consumed element with a new one on the fly, so memory should stay constant once the buffer is filled.
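Under that mental model, the steady-state footprint of the buffer should be roughly `shuffle_buffer_size × average element size`. A rough sketch of that estimate, assuming each element is a nested structure of tensors (string payloads counted by length); `estimate_shuffle_buffer_bytes` and `sample_n` are hypothetical names, not from the repo:

```python
import numpy as np
import tensorflow as tf

def estimate_shuffle_buffer_bytes(dataset: tf.data.Dataset, shuffle_buffer_size: int, sample_n: int = 100) -> float:
    # Sample a few elements, measure their in-memory size, and multiply by the
    # buffer size to get a rough upper bound on what shuffle() holds resident.
    per_element = []
    for elem in dataset.take(sample_n):
        nbytes = 0
        for t in tf.nest.flatten(elem):
            arr = np.asarray(t.numpy())
            if arr.dtype == object:  # string tensors: count the payload bytes
                nbytes += sum(len(s) for s in arr.reshape(-1))
            else:
                nbytes += arr.nbytes
        per_element.append(nbytes)
    return shuffle_buffer_size * float(np.mean(per_element))
```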
Could you suggest a fix for the memory leak? If the leak is hard to solve, could you give an approximate figure for the memory needed to run a full iteration over the dataset during training?