Incorrect decoding of file names with some zip files #432

@KenobiTom

Since the changes introduced with #403, zip files created on Windows and extracted with zip4j no longer yield correct file names. The fixed use of UTF-8 in this context corrupts umlauts (ö, ä, ü, ...).
For example, täglich.txt becomes t�glich.txt.

This behavior suggests that the original approach, falling back to the zip specification's encoding (Cp437) rather than UTF-8 when the packing tool supplies no encoding, was the better one in this case.
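To illustrate the effect outside of zip4j, here is a minimal sketch that encodes the example name as Cp437 and decodes it both ways. The class and method names are hypothetical, and it assumes the running JDK ships the IBM437 charset (it is part of the extended charsets, so a trimmed-down runtime may lack it).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of zip4j.
class FileNameDecodingDemo {
    // Assumption: the JDK provides IBM437 (Cp437); Charset.forName throws otherwise.
    static final Charset CP437 = Charset.forName("IBM437");

    // Simulates the raw name bytes a packing tool would write using Cp437.
    static byte[] encodeAsCp437(String name) {
        return name.getBytes(CP437);
    }

    // Decodes raw name bytes with the given charset; malformed input
    // is replaced with U+FFFD by the String constructor.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "t\u00e4glich.txt" is "täglich.txt"; ä is the single byte 0x84 in Cp437.
        byte[] raw = encodeAsCp437("t\u00e4glich.txt");

        // 0x84 on its own is not valid UTF-8, so the umlaut is lost here.
        System.out.println(decode(raw, StandardCharsets.UTF_8));

        // Decoding with the charset used for writing restores the name.
        System.out.println(decode(raw, CP437));
    }
}
```

The byte 0x84 is a lone UTF-8 continuation byte, which is why the UTF-8 decoder substitutes the replacement character instead of ä.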

A possible solution would be to let the user choose zip4j's behavior: fall back to the zip spec encoding or to UTF-8 when no encoding is supplied.
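One way such a choice could look is sketched below. This is not zip4j's actual API: the class, the method `decodeFileName`, and the `fallback` parameter are hypothetical names for illustration. The only part taken from the zip specification is that bit 11 of the general purpose bit flag marks a UTF-8 encoded name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of zip4j.
class FileNameCharsetChooser {
    // Bit 11 (0x0800) of the general purpose bit flag: name is UTF-8.
    static final int UTF8_FLAG = 1 << 11;

    // Uses UTF-8 when the entry declares it; otherwise uses the
    // fallback charset chosen by the caller (e.g. Cp437 or UTF-8).
    static String decodeFileName(byte[] data, int generalPurposeFlags, Charset fallback) {
        if ((generalPurposeFlags & UTF8_FLAG) != 0) {
            return new String(data, StandardCharsets.UTF_8);
        }
        return new String(data, fallback);
    }

    public static void main(String[] args) {
        // Assumption: the JDK provides the IBM437 (Cp437) charset.
        Charset cp437 = Charset.forName("IBM437");
        byte[] raw = "t\u00e4glich.txt".getBytes(cp437);

        // No UTF-8 flag set, so the user-chosen fallback decides the result.
        System.out.println(decodeFileName(raw, 0, cp437));
        System.out.println(decodeFileName(raw, 0, StandardCharsets.UTF_8));
    }
}
```

With this shape, archives that set the UTF-8 flag are unaffected, and only entries without an encoding hint depend on the user's choice.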

The corresponding change is here:
d80df16

return new String(data, ZIP_STANDARD_CHARSET_NAME); // Cp437
vs.
return new String(data, ZIP4J_DEFAULT_CHARSET); // UTF-8

Test zip containing 1 file:
tester.zip

Code snippet:

ZipFile zipFile = new ZipFile(source);

// Get the list of FileHeaders.
List<FileHeader> fileHeaderList = zipFile.getFileHeaders();
// Loop through all the file headers.
for (FileHeader fileHeader : fileHeaderList) {
    if (fileHeader != null) {
        // Build the output file; getFileName() returns the decoded name.
        File outFile = new File(destination, fileHeader.getFileName());
        //...
    }
}

Metadata

Labels

bug (Something isn't working), resolved
