ImageLoadingEstimator for TensorFlow scoring should allow in-memory image streams as input in addition to images from files on drive  #2121

@CESARDELATORRE

Right now the only way for ML.NET to load images is via ImageLoadingEstimator, which can load them only from disk files (as confirmed by @yaeldekel and Pete a few weeks ago).

However, a very common scenario, for example in a web app, is that users submit images over HTTP, and the DataView/pipeline should then load them as in-memory images (Bitmap, byte[], or Image) instead of loading them from files in a folder on disk.

That is the right way to do it for many scenarios in web apps and services (Web APIs).
For instance, you can do that when using TensorFlowSharp in C#, but as of today you cannot in ML.NET.

When implementing this feature improvement in ML.NET, there could be the following two approaches:

  • Modify schema comprehension to be able to map Bitmap fields/properties to Image columns of a data view.

  • Add another version of ImageLoading transformer that loads/decodes the image from a byte vector, rather than from a disk file identified by path.

In any case, this is an important scenario to implement, because being able to load images only from files, and not from in-memory streams, can be a big performance handicap for online scenarios like the ones mentioned.

With the current implementation in ML.NET, the only workaround is to save the incoming in-memory image from the HTTP request into a temporary file on disk and load it from there. That is a very coarse workaround and not performant enough for a real application in production.

The following is a sample app I created for this online scenario where the user uploads an image from the browser into a service (Web API) and ultimately you get it as an in-memory image stream.

SEE CODE HERE:

https://github.com/CESARDELATORRE/TensorFlowImageClassificationWebAPI

[Image: screenshot of the sample app's web form]

  • That web form uploads the image over HTTP to a service (Web API) on the server side. At that point, the image is an in-memory image stream.

  • In this implementation the sample app works because I implemented a workaround: the submitted image is temporarily stored as a file and then loaded from that file into the DataView through the pipeline.

Ideally, when the C# method in the Web API receives the image as an in-memory stream, it should be able to load it directly into the DataView. The following code shows the controller method with the current temporary-file workaround:

        // Controller method from the Web API
        [HttpPost]
        [ProducesResponseType(200)]
        [ProducesResponseType(400)]
        [Route("classifyimage")]
        public async Task<IActionResult> ClassifyImage(IFormFile imageFile)
        {
            if (imageFile.Length == 0)
                return BadRequest();

            // WORKAROUND: save the uploaded image as a temporary file in the temp folder
            string fileName = await _imageWriter.UploadImageAsync(imageFile, _imagesTmpFolder);
            string imageFilePath = Path.Combine(_imagesTmpFolder, fileName);

            // Use the image file path as the workaround...
            // Rest of the implementation with ML.NET API for scoring TensorFlow model...
            // ...
        }
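
For contrast with the workaround above, here is a minimal sketch of how the uploaded image could be kept in memory as a byte vector, which is the input the second proposed approach would decode. The names `InMemoryImageData` and `ImageStreamHelper` are hypothetical and are not ML.NET types. Feeding those bytes into the DataView is the part this issue asks ML.NET to support, so it is not shown.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical input type for the "byte vector" approach: the encoded image
// (for example PNG or JPEG bytes) travels in memory instead of as a file path.
// Illustrative only; this is not an ML.NET type.
public class InMemoryImageData
{
    public byte[] ImageBytes { get; set; }
}

public static class ImageStreamHelper
{
    // Copies any readable stream into a byte array without touching the disk.
    public static async Task<InMemoryImageData> ReadImageAsync(Stream imageStream)
    {
        if (imageStream == null)
            throw new ArgumentNullException(nameof(imageStream));

        using (var buffer = new MemoryStream())
        {
            await imageStream.CopyToAsync(buffer);
            return new InMemoryImageData { ImageBytes = buffer.ToArray() };
        }
    }
}
```

In the controller method, the stream could come from `imageFile.OpenReadStream()`, for example `var data = await ImageStreamHelper.ReadImageAsync(imageFile.OpenReadStream());`, which would replace the call that writes the temporary file.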

To sum up:

I believe it is a must for ML.NET to be able to load in-memory image streams into the DataView, in addition to loading images from files, so that those images can be used when scoring TensorFlow models. The online, in-memory scenarios described above are pretty common.

Labels: enhancement (New feature or request), up-for-grabs (A good issue to fix if you are trying to contribute to the project)
