Avoid duplicated cache content in CacheDataset #3734

@Nic-Ma

Description

Is your feature request related to a problem? Please describe.
Feature request from @dongyang0122 during NVIDIA Clara development:
"If I use some method (I think it's the resample_dataset API of MONAI) to augment the original dataset and then use CacheDataset, will it cache duplicated content for the same samples, or will it cache only the unique ones?"

Our PersistentDataset already caches only unique items, because it uses a hash key as the ID instead of the index. I think we can enhance CacheDataset to optionally support the same behavior by caching items in a dict keyed by hash.
@ericspod @wyli @rijobro Do you have any concerns?
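
To make the proposal concrete, here is a minimal sketch of hash-keyed caching in plain Python. It is not MONAI code: `hash_item` and `DedupCache` are hypothetical names, the SHA-256-over-pickle hash is an assumption, and the real CacheDataset has more to handle (worker threads, cache rate, random transforms).

```python
import hashlib
import pickle
from typing import Any, Callable, Dict, List, Sequence


def hash_item(item: Any) -> str:
    """Hypothetical helper: content hash of a picklable item.

    Assumption: equal items pickle to equal bytes (dicts built with the
    same key order do), so duplicates map to the same key.
    """
    return hashlib.sha256(pickle.dumps(item)).hexdigest()


class DedupCache:
    """Hypothetical sketch: cache transformed items by content hash, not index."""

    def __init__(self, data: Sequence[Any], transform: Callable[[Any], Any]) -> None:
        self.data: List[Any] = list(data)
        # One hash key per index; duplicated samples share a key.
        self._keys: List[str] = [hash_item(d) for d in self.data]
        self._cache: Dict[str, Any] = {}
        for key, item in zip(self._keys, self.data):
            if key not in self._cache:  # run the transform once per unique item
                self._cache[key] = transform(item)

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, index: int) -> Any:
        return self._cache[self._keys[index]]

    @property
    def num_cached(self) -> int:
        return len(self._cache)


# Usage: a data list in which "a.nii" appears twice, as after resampling.
calls: List[str] = []


def load(item: Dict[str, str]) -> Dict[str, str]:
    calls.append(item["image"])  # record how often the transform runs
    return {"image": item["image"].upper()}


datalist = [{"image": "a.nii"}, {"image": "b.nii"}, {"image": "a.nii"}]
ds = DedupCache(datalist, load)
```

In this sketch the dataset keeps its full length of three, but only two entries are cached and the transform runs twice. Duplicated indices return the same cached object, so a real implementation would probably need to copy items before applying any random transforms downstream.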

Metadata

Labels

enhancement: New feature or request
