Skip to content

ExtractMetadata becomes non-responsive after individual error #52

@jnathanh

Description

@jnathanh

Hi, thanks for sharing this package. I ran into an error case trying to do a filepath.Walk over my photos directory. Manually skipping directories does avoid this issue, but it seems like the wrong behavior for ExtractMetadata to silently start giving incorrect results as a result of a previous call. I'd like to incorporate this package into some of my workflows, so I'm raising the issue to help improve the package's usability/reliability. Here is a test that reproduces the behavior:

func TestNonResponsiveExiftool(t *testing.T) {
	t.Parallel()

	// create test directory with about 200MB worth of photos
	// 10,000 of the included jpg or 6 30MB CR2's both repro
	dirPath := "./testdata/extractkiller"

	err := os.Mkdir(dirPath, 0744)
	require.NoError(t, err)
	defer os.RemoveAll(dirPath)

	for i:=0;i < 10_000;i++ {
		err := copyFile("./testdata/20190404_131804.jpg", path.Join(dirPath, fmt.Sprintf("%d.jpg", i)))
		require.NoError(t, err)
	}


	// initialize exiftool
	e, err := NewExiftool(Debug())
	require.NoError(t, err)
	defer e.Close()


	// control case, everything working normally
	f := e.ExtractMetadata("./testdata/20190404_131804.jpg")
	assert.Equal(t, 1, len(f))
	assert.NoError(t, f[0].Err)

	// case that breaks the session, reading a directory does not seem to be supported by this package, which is okay, and I would expect it to throw an error in that case, but it should not make future reads invalid.
	f = e.ExtractMetadata(dirPath)
	assert.Error(t, f[0].Err)

	// control case no longer works
	f = e.ExtractMetadata("./testdata/20190404_131804.jpg")
	assert.Equal(t, 1, len(f))
	assert.NoError(t, f[0].Err)

}

I suspect the issue is either a race (scanning output before it's finished), or improper buffer usage (overflow maybe?). I ran the same test with exiftool directly and got output right away without issue.

One other note, it looks like exiftool supports paths to a directory as an input, or even a list of multiple files. It might be worth reworking the design to let exiftool handle the work of managing multiple paths (directories or files), and add support for parsing the multi-file output.

I'll try to work on a PR for this, but posting the issue now as a heads up, and to see if you have any suggestions on the right design to fix the issue.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions