Data isn't prepared correctly for invasive-species-monitoring #122

@rubzo

Description

The prepare function for the invasive-species-monitoring competition appears to incorrectly create .7z files using py7zr:

with py7zr.SevenZipFile(public / "train.7z", "w") as z:

This was brought to my attention because an agent was unable to find any images in the train.7z or test.7z datasets.

You can reproduce the issue with a test case like:

import pathlib
import py7zr

def main():
    p = pathlib.Path("project")  # directory containing some files
    with py7zr.SevenZipFile(pathlib.Path("project.7z"), "w") as z:
        z.write(p)  # write() does not recurse into the directory

if __name__ == "__main__":
    main()

Put some data in a directory called project next to the script and run it. The resulting project.7z file is invalid.

This is fixed by changing 'write' to 'writeall', i.e.:

z.writeall(p)

Then the .7z file is correctly created.
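
A rough way to confirm the fix is to list the archive members after writing and compare them against the files on disk. The sketch below is only an illustration: the helper names (expected_members, missing_members, check_writeall) are hypothetical and not part of the repo, and it assumes py7zr stores member names with forward slashes, prefixed by the arcname passed to writeall.

```python
import pathlib


def expected_members(source_dir):
    """Archive names we expect for every regular file under source_dir.

    Assumption: names are POSIX-style and prefixed with the directory name.
    """
    source_dir = pathlib.Path(source_dir)
    return {
        (pathlib.Path(source_dir.name) / f.relative_to(source_dir)).as_posix()
        for f in source_dir.rglob("*")
        if f.is_file()
    }


def missing_members(source_dir, archive_names):
    """Sorted list of files on disk that are absent from archive_names."""
    return sorted(expected_members(source_dir) - set(archive_names))


def check_writeall(source_dir, archive_path):
    """Write source_dir with writeall(), then report any files not archived."""
    import py7zr  # third-party; imported here so the helpers above stay stdlib-only

    source_dir = pathlib.Path(source_dir)
    with py7zr.SevenZipFile(archive_path, "w") as z:
        # writeall() walks the directory recursively, unlike write()
        z.writeall(source_dir, arcname=source_dir.name)
    with py7zr.SevenZipFile(archive_path, "r") as z:
        names = z.getnames()
    return missing_members(source_dir, names)
```

Calling check_writeall("project", "project.7z") should return an empty list if every file made it into the archive. The same missing_members check could be pointed at the train.7z and test.7z produced by the prepare function.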

I also see that 'statoil-iceberg-classifier-challenge' is the only other competition that uses this package, but I haven't tested if it's also affected.
