4135 loadimage and saveimage using network sources #4136
karwojan wants to merge 7 commits into Project-MONAI:dev from karwojan:4135-loadimage-and-saveimage-using-network-sources
Conversation
Hi @karwojan, Thanks for your feature request. Thanks.
Hi @Nic-Ma
Hi @karwojan, Thanks for your analysis. Thanks.
I think I understand the issues @karwojan is addressing with this PR. The alternative solutions are to change what the readers/writers do, or to provide a wrapper around the readers/writers that you would supply, but both require changing a lot of how LoadImage/SaveImage are created. This solution is much cleaner and provides an optional argument that's easy to fill with some existing function. Another choice is to provide wrappers for … One problem I see, however, is that the location the temporary files are stored in isn't controllable and varies by platform. Linux platforms may be configured to mount … So this doesn't really provide enough control over what is done with files when loaded from remote sources, and it is perhaps less efficient because it downloads files and keeps them rather than streaming data directly to readers. The alternative is to mount the remote storage as a local file system and read from it with a cached dataset type to minimise how much loading is done; this would just use the existing code without modification. We can adopt the change proposed here and hope this doesn't become a problem, but if we provide wrappers for …
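The point about the temporary location can be illustrated with the standard library alone. This is a minimal sketch, assuming the temporary files are created with Python's `tempfile` module: passing `dir` pins where the file is created, while omitting it falls back to whatever `tempfile.gettempdir()` resolves to on the platform. The helper name `download_to_dir` and the `download(key, local_path)` callable are hypothetical and are not taken from the PR.

```python
import os
import tempfile
from typing import Callable


def download_to_dir(key: str, download: Callable[[str, str], None], tmp_root: str) -> str:
    """Fetch `key` into a temporary file created inside `tmp_root` and return its path.

    `download(key, local_path)` is a hypothetical user-supplied callable that
    writes the remote object to `local_path`.
    """
    os.makedirs(tmp_root, exist_ok=True)
    # Keep the original file name as a suffix so readers that rely on the
    # extension (for example ".nii.gz") can still recognise the format.
    fd, path = tempfile.mkstemp(dir=tmp_root, suffix="_" + os.path.basename(key))
    os.close(fd)
    download(key, path)
    # The caller decides when to delete the file.
    return path
```

With this shape the caller chooses the scratch location (a fast local disk, for instance) instead of depending on the platform default.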
Sorry for the long response time. I've changed my implementation according to @ericspod's suggestion. I've created separate DownloadImage and UploadImage transforms, which wrap the LoadImage and SaveImage transforms respectively. I agree that it is better to separate this responsibility.
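The upload side of that separation might look roughly like the following. This is only a sketch of the wrapping idea, not the PR's code: the class name `SaveThenUpload`, the `saver(data, local_path)` callable standing in for a SaveImage-like transform, and the `upload(local_path, key)` callable are all assumptions made for illustration.

```python
import os
import tempfile
from typing import Any, Callable


class SaveThenUpload:
    """Hypothetical wrapper: delegate saving to an unchanged saver, then upload the result."""

    def __init__(
        self,
        saver: Callable[[Any, str], None],
        upload: Callable[[str, str], None],
    ) -> None:
        self.saver = saver    # writes `data` to a local path (stand-in for a SaveImage-like transform)
        self.upload = upload  # copies a local file to the remote object `key`

    def __call__(self, data: Any, key: str) -> None:
        # The directory and everything in it are removed when the block exits.
        with tempfile.TemporaryDirectory() as tmp_dir:
            local_path = os.path.join(tmp_dir, os.path.basename(key))
            self.saver(data, local_path)
            self.upload(local_path, key)
```

The wrapped saver knows nothing about the network, so the transfer logic can be swapped without touching it.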
Hi @karwojan, Thanks for your update. Please feel free to correct me if I misunderstood the proposal. Thanks.
Hi @Nic-Ma,
Hi @karwojan, Thanks for the detailed explanation. What do you think? Thanks.
Hi @Nic-Ma,
I think the issue @Nic-Ma is getting at is the use of temporary files. What your implementation currently does is download the file to a temporary location, where it is read by a reader; the file is then deleted at some point in the future. This can cause files to be downloaded multiple times, which has a cost.
Thanks for the discussion.
What do you think? Thanks in advance.
Providing a storage directory path to keep images is the way to go, if this is …
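One way to read the storage-directory suggestion is as a small download cache. The sketch below is an assumption about what that could look like, not something proposed verbatim in the thread: `cached_download`, `cache_dir`, and the `download(key, local_path)` callable are hypothetical names. Each key is downloaded at most once, and the file is moved into place only after the download finishes, so a partial download is never mistaken for a cached file.

```python
import hashlib
import os
import tempfile
from typing import Callable


def cached_download(key: str, cache_dir: str, download: Callable[[str, str], None]) -> str:
    """Return a local path for `key`, downloading it only if it is not cached yet."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the full key so objects with the same file name in different
    # "folders" do not collide; keep the file name for extension detection.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    local_path = os.path.join(cache_dir, digest + "_" + os.path.basename(key))
    if not os.path.exists(local_path):
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".part")
        os.close(fd)
        try:
            download(key, tmp_path)
            os.replace(tmp_path, local_path)  # atomic rename within the same directory
        finally:
            # Only true if the download or rename failed part-way.
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
    return local_path
```

The trade-off is the one debated in the thread: the cache avoids repeated downloads but needs local disk space, which is exactly what the temporary-file approach tries to avoid.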
Okay, maybe the problem is that I didn't explain my idea clearly at the beginning. First of all, I think it could be very useful to load data dynamically from network storage during training rather than from a local drive. In fact, I need this functionality in my project, because I don't have enough space on my local disk to store the whole dataset. I also have very fast network storage and a very fast connection to it. This means I have to download the data in every training loop, and that is not a problem. This does not seem to be an uncommon scenario given the VERY BIG datasets we have to deal with today. Secondly, if I had found a way to load data with nibabel or itk directly from a byte array, I wouldn't use temporary files. The temporary files are just a workaround for loading data from network storage with nibabel and itk. I don't want to store any data on the local hard drive. Finally, regarding the implementation, temporary files are removed once the method's scope exits, because the reference to the NamedTemporaryFile variable is dropped and all its resources are freed automatically. I think this behaviour is deterministic.
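The clean-up behaviour described here can be made explicit. Dropping the last reference deletes the file promptly under CPython's reference counting, but other Python implementations do not promise when that happens; a `with` block fixes the deletion point regardless. The following is a minimal sketch under that assumption, with `read_bytes_via_tempfile` and the `reader(path)` callable as hypothetical names rather than the PR's implementation.

```python
import os
import tempfile
from typing import Any, Callable, Tuple


def read_bytes_via_tempfile(
    payload: bytes, reader: Callable[[str], Any], suffix: str = ""
) -> Tuple[Any, str]:
    """Write `payload` to a temporary file, read it with `reader`, then delete the file.

    Returns the reader's result and the (now deleted) path, the latter only
    so that the clean-up can be checked.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix) as tmp:
        tmp.write(payload)
        tmp.flush()  # make sure the whole payload is on disk before reading
        path = tmp.name
        # Note: reopening the file by name while it is still open does not
        # work on Windows with the default settings.
        result = reader(path)
    # Leaving the with-block closed the file, which also deleted it.
    return result, path
```

This only works if `reader` loads the data eagerly; a reader that keeps a lazy handle to the file would fail once the file is gone.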
Hi @karwojan, Thanks for the great discussion. Thanks in advance.
Description
It would be very useful to load/save images directly from/to network storage (like S3 or Google Cloud Storage) using pure MONAI transforms. I have a pretty simple idea for implementing such a feature by adding a single extra argument to the LoadImage and SaveImage transforms.
We could treat input and output paths as pointing to network sources instead of local files, because classic storage services like S3 use a filesystem-like hierarchical structure of object keys. The LoadImage transform could accept a Callable that downloads an object from network storage and saves it to a temporary file, which is then used by the readers and the rest of the regular LoadImage logic. Similarly, the SaveImage writer could optionally save data to a temporary file, which is then uploaded to network storage.
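The loading half of this idea can be sketched without MONAI at all. The function below is illustrative only: `load_image`, the `reader(local_path)` callable standing in for the regular reader logic, and the optional `download(key, local_path)` callable are assumed names and signatures, not the actual LoadImage API. When `download` is omitted, the path is treated as an ordinary local file, which mirrors the "single extra argument" design.

```python
import os
import tempfile
from typing import Any, Callable, Optional


def load_image(
    path: str,
    reader: Callable[[str], Any],
    download: Optional[Callable[[str, str], None]] = None,
) -> Any:
    """Read `path` with `reader`, fetching it first when `download` is given."""
    if download is None:
        # Unchanged behaviour: `path` is a local file.
        return reader(path)
    # `path` is an object key; keep its file name so the reader can still
    # pick a format from the extension.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, os.path.basename(path))
        download(path, local_path)
        return reader(local_path)
```

For S3, for example, `download` could be a thin function around a client's download call with the bucket name bound in advance; that binding is left to the user.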
Status
Work in progress
Types of changes
./runtests.sh -f -u --net --coverage
./runtests.sh --quick --unittests --disttests
make html command in the docs/ folder.