CacheDataset gets stuck when initializing cache content #3962

@Nic-Ma

Is your feature request related to a problem? Please describe.
Feedback from Andriy and Daguang when using NVIDIA Clara MMARs:
One peculiar issue: if we mount the dataset from NGC, training gets stuck about 90% of the time (with both train.sh and train_multi_gpu.sh).
I believe it has something to do with SmartCache (since this is the only MMAR that uses it) and NGC dataset interoperability.
It prints “Loading Data: 0%” and just gets stuck; sometimes it reaches “Loading Data: 6%” and then gets stuck.
Is this something we should be worried about?

PS: If I copy the data to a local DGX folder first, training runs fine.
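
For anyone who wants to script the workaround from the PS, here is a minimal sketch that stages the mounted dataset onto local disk before any caching starts. It uses only the Python standard library. The function name `stage_dataset_locally` and both directory arguments are hypothetical, not part of MONAI or Clara, and it assumes the whole dataset fits on the local disk.

```python
import shutil
from pathlib import Path


def stage_dataset_locally(mounted_dir: str, local_dir: str) -> Path:
    """Copy a dataset tree from a mounted directory to local storage.

    Hypothetical helper: the name and arguments are illustrative only.
    Returns the local directory, which the training config should then
    point at instead of the mounted path.
    """
    src = Path(mounted_dir)
    dst = Path(local_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"source dataset directory not found: {src}")
    # dirs_exist_ok=True (Python 3.8+) lets a rerun reuse the same destination;
    # files that already exist there are overwritten.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

The idea is to call this once before building the dataset and to construct the file list from the returned path, so that cache initialization reads from local disk rather than from the mount. This only automates the workaround; it does not explain why loading stalls on the mounted dataset.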

Labels

enhancement (New feature or request)
