WPRESS vs ZIP: Why WordPress Deserves a Purpose-Built Archive Format

Yani I Apr 7, 2026 · 14 min read

Back in 2013, when we started building All-in-One WP Migration, we had a choice. Use ZIP, the format everyone knows, or build something purpose-built for the problem we were solving.

We chose to build WPRESS.

Thirteen years and 60 million sites later, that decision turned out to be one of the most important architectural choices we made. This article explains why.

The Problem With ZIP for WordPress

ZIP is a general-purpose archive format created in 1989 by Phil Katz. It was designed for an era of floppy disks, desktop applications, and files measured in kilobytes. It is good at what it was designed for.

WordPress migration on shared hosting is not what it was designed for.

A WordPress migration plugin needs to:

ZIP fails at most of these. Here is where, and why.

Compression: Three Modes vs. One

WPRESS supports three compression modes:

| Mode | Algorithm | Use Case |
| --- | --- | --- |
| None | Raw storage | Maximum speed, no CPU overhead |
| GZip | zlib (level 9) | Best balance of speed and compression |
| BZip2 | bzip2 (level 9) | Maximum compression for text-heavy sites |

Both GZip and BZip2 run at maximum compression level (9), producing the smallest archives each algorithm can achieve.
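The three modes can be sketched in a few lines. This is an illustration in Python rather than the plugin's PHP, and the function names and mode strings are hypothetical. It also assumes plain zlib and bzip2 streams per chunk; the actual on-disk framing WPRESS uses is not described here.

```python
import bz2
import zlib


def compress_chunk(data: bytes, mode: str) -> bytes:
    """Compress one chunk using one of the three modes in the table above."""
    if mode == "none":
        return data                    # raw storage, no CPU overhead
    if mode == "gzip":
        return zlib.compress(data, 9)  # zlib at maximum level
    if mode == "bzip2":
        return bz2.compress(data, 9)   # bzip2 at maximum level
    raise ValueError(f"unknown mode: {mode}")


def decompress_chunk(data: bytes, mode: str) -> bytes:
    """Reverse compress_chunk for the same mode."""
    if mode == "none":
        return data
    if mode == "gzip":
        return zlib.decompress(data)
    if mode == "bzip2":
        return bz2.decompress(data)
    raise ValueError(f"unknown mode: {mode}")
```

Because the mode is chosen once for the whole archive, the reader only needs to know which of the three branches to take when extracting.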

ZIP technically supports multiple compression methods (DEFLATE, BZIP2, LZMA, Zstandard, XZ). In practice, PHP’s ZipArchive class only guarantees two: DEFLATE and STORE (no compression). BZIP2, LZMA, and Zstandard all require specific versions of libzip that most shared hosting environments do not provide.

ZipArchive does allow setting compression level via setCompressionIndex() with a compflags parameter (0 through 9). But this is per-entry, not per-archive, and the default is level 6. Most backup plugins using ZIP never change it.

WPRESS advantage: Three universally available compression modes at maximum compression. ZIP gives you DEFLATE at level 6, and everything else is a maybe.

File Size: No Limits vs. Multiple Limits

This is where the formats diverge most dramatically.

ZIP32 Limits

The original ZIP format (ZIP32) has hard limits baked into its specification: 4 GB per file (a 4-byte size field), 4 GB per archive, and 65,535 entries (a 2-byte entry count).

A WordPress site with a 5 GB media library? Cannot be archived in ZIP32. A WooCommerce store with 100,000 product images? Exceeds the entry limit.

ZIP64 Limits

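ZIP64 extends these limits to theoretical maximums of 16 exabytes. But theory and practice diverge: PHP's ZipArchive has shipped with large-archive bugs such as #55383 (cannot open archives near 2 GB) and #62539 (archives of 4 GB or more silently corrupted).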

These bugs were tied to specific PHP and libzip versions. They have been fixed in recent releases. But the fact that they existed at all means any WordPress site that relied on ZIP64 during those PHP versions produced corrupt backups without knowing it.

WPRESS Limits

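WPRESS has effectively no size limits: a single entry can be roughly 91 TB (the ceiling of the 14-byte decimal size field), the archive can grow to the filesystem limit, and the number of entries is unlimited.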

On a 64-bit PHP installation (which is standard on any modern server), the practical limit is whatever your disk can hold.
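The limits on both sides follow directly from field widths, and the arithmetic is easy to check. The sketch below is Python with hypothetical helper names. It assumes ZIP32's 4-byte size and 2-byte entry-count fields and the 14-byte decimal ASCII size field listed in the WPRESS header table.

```python
def max_binary_field(num_bytes: int) -> int:
    """Largest value an unsigned binary field of num_bytes can hold."""
    return 2 ** (8 * num_bytes) - 1


def max_decimal_field(num_digits: int) -> int:
    """Largest value a decimal ASCII field of num_digits can hold."""
    return 10 ** num_digits - 1


ZIP32_MAX_FILE_BYTES = max_binary_field(4)      # 4-byte size field
ZIP32_MAX_ENTRIES = max_binary_field(2)         # 2-byte entry count
WPRESS_MAX_FILE_BYTES = max_decimal_field(14)   # 14 decimal digits

# Express the per-entry limits in binary units (GiB and TiB).
ZIP32_MAX_FILE_GIB = ZIP32_MAX_FILE_BYTES / 2 ** 30
WPRESS_MAX_FILE_TIB = WPRESS_MAX_FILE_BYTES / 2 ** 40
```

Fourteen decimal digits top out just under 10^14 bytes, which is where the roughly 91 TB per-entry figure comes from when measured in binary terabytes.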

Resumability: The Feature ZIP Cannot Have

This is the single most important difference between the two formats, and it comes down to architecture.

How ZIP Works

PHP’s ZipArchive::addFile() does not actually read, compress, or write anything. It stores a reference to the source file. All real work happens when you call ZipArchive::close().

This means:

  1. You call addFile() for every file on the site
  2. Every source file stays open (one file descriptor per file) until close() is called
  3. close() reads every file, computes CRC32, compresses the data, writes the archive, and builds the central directory
  4. If close() exceeds PHP’s max_execution_time, the process is killed and the archive is lost
  5. There is no way to resume. You start over.

For a 500MB WordPress site, close() might take 30 seconds. For a 5GB site, it might take 5 minutes. On shared hosting with a 30-second execution limit, anything beyond a small site fails.

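There is no streaming API and no chunked writing. PHP 8.0 added ZipArchive::registerProgressCallback() (which needs libzip 1.3.0 or later), but it only reports progress; it cannot pause close() and resume it in a later request. In practice, addFile() and close() are all you get.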

How WPRESS Works

WPRESS is a sequential streaming format. Files are written one at a time, in 512 KB chunks:

  1. Write a 4,377-byte header for the current file
  2. Read 512 KB from the source file
  3. Compress the chunk (if compression is enabled)
  4. Write the compressed chunk to the archive
  5. Check if we are approaching the execution time limit
  6. If yes: save the current position (byte offset, CRC state, file index) and return
  7. On the next HTTP request: resume from exactly where we left off
  8. When the file is complete: seek back to the header and write the final size and CRC32

Every 512 KB chunk is a self-contained operation. The plugin tracks its position across HTTP requests using a state array that records the current byte offset, the CRC state, and the file index.

If the server kills the process after writing 3.2 GB of a 5 GB site, the next request picks up at byte 3,200,000,001 and keeps going. Nothing is lost. Nothing needs to be recomputed.
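The loop described above can be sketched as follows. This is a simplified Python illustration, not the plugin's PHP code: the function names and the layout of the state dictionary are hypothetical, the header write and the final seek-back are left out, and chunks are stored uncompressed to keep the resume logic visible.

```python
import time
import zlib

CHUNK_SIZE = 512 * 1024  # 512 KB, matching the chunk size described above


def new_state() -> dict:
    # Hypothetical state layout: just enough to resume a single file.
    return {"offset": 0, "crc": 0, "done": False}


def copy_chunks(src_path, archive_path, state, budget_seconds):
    """Append src_path to the archive until it is finished or time runs out."""
    deadline = time.monotonic() + budget_seconds
    with open(src_path, "rb") as src, open(archive_path, "ab") as archive:
        src.seek(state["offset"])  # resume where the previous call stopped
        while not state["done"]:
            chunk = src.read(CHUNK_SIZE)
            if not chunk:
                state["done"] = True  # reached the end of the source file
                break
            archive.write(chunk)
            state["offset"] += len(chunk)
            state["crc"] = zlib.crc32(chunk, state["crc"])  # running CRC32
            if time.monotonic() >= deadline:
                break  # out of time: caller persists `state` and returns
    return state
```

A caller would store `state` between requests (for example in a file or a database row) and keep calling `copy_chunks` until `state["done"]` is true. Each call does a bounded amount of work, which is the property a batched close() cannot offer.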

This is why All-in-One WP Migration can export a 50 GB site on a shared host with a 30-second execution limit. ZIP cannot.

CRC32: Integrity Verification at Any Scale

Both WPRESS and ZIP use CRC32 checksums to verify data integrity. The difference is in how they compute them.

ZIP computes CRC32 during close(), reading the entire uncompressed file in a single pass. For a 2 GB database dump, that means reading 2 GB in one execution cycle. If the server kills the process before completion, no CRC is written.

WPRESS computes CRC32 incrementally. Each 512 KB chunk updates a running hash context using PHP’s hash_update(). When a file spans multiple HTTP requests, the CRC values from each request are mathematically combined using a GF(2) polynomial algorithm (the same approach used by zlib’s crc32_combine function). The final CRC is the same value you would get from hashing the entire file at once, but computed across arbitrarily many requests.
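Both ideas, a running CRC and the combining of two independently computed CRCs, fit in a short sketch. The code below is Python, not the plugin's PHP. `crc32_combine` here is a port of the classic matrix-based routine from zlib, written out because Python's `zlib` module does not expose that function, and `crc32_running` is a hypothetical helper name.

```python
import zlib


def _gf2_matrix_times(mat, vec):
    """Multiply a 32x32 GF(2) matrix (list of 32 ints) by a 32-bit vector."""
    total = 0
    i = 0
    while vec:
        if vec & 1:
            total ^= mat[i]
        vec >>= 1
        i += 1
    return total


def _gf2_matrix_square(mat):
    """Square a GF(2) matrix: applying the result equals applying mat twice."""
    return [_gf2_matrix_times(mat, row) for row in mat]


def crc32_combine(crc1, crc2, len2):
    """CRC32 of A+B, given crc32(A), crc32(B), and len(B) in bytes."""
    if len2 <= 0:
        return crc1
    # Operator that advances a CRC past one zero bit.
    odd = [0xEDB88320] + [1 << n for n in range(31)]
    even = _gf2_matrix_square(odd)   # two zero bits
    odd = _gf2_matrix_square(even)   # four zero bits
    while True:
        even = _gf2_matrix_square(odd)  # first pass: one zero byte
        if len2 & 1:
            crc1 = _gf2_matrix_times(even, crc1)
        len2 >>= 1
        if len2 == 0:
            break
        odd = _gf2_matrix_square(even)
        if len2 & 1:
            crc1 = _gf2_matrix_times(odd, crc1)
        len2 >>= 1
        if len2 == 0:
            break
    return crc1 ^ crc2


def crc32_running(chunks):
    """CRC32 over an iterable of chunks, one chunk at a time."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    return crc
```

The running form is enough while a single process stays alive. The combine step matters when two parts of a file were hashed separately, for instance in two different requests: it needs only the two CRC values and the length of the second part, not the data itself.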

WPRESS also calculates an archive-wide CRC32 stored in the EOF block. This is a checksum of the entire archive content (every header and every byte of file data), computed in a separate resumable pass after all files are written. On import, this archive CRC is recalculated and compared, catching any corruption that occurred during download or transfer.

WPRESS advantage: CRC32 verification works on files of any size, computed across any number of HTTP requests. ZIP needs to read the entire file in one shot.

Memory: Constant vs. Unbounded

WPRESS uses approximately 512 KB of memory regardless of whether you are archiving a 10 MB blog or a 50 GB store. It reads one chunk, processes it, writes it, and moves on. Only one source file is open at a time.

ZIP’s memory behavior depends on which method you use:

PHP Bug #40494 (“Memory problem with ZipArchive::addFile()”) and Bug #44070 (“Zip is unusable for large numbers of files or large data volume”) document these issues in detail.

WPRESS advantage: Constant ~512 KB memory footprint. ZIP memory usage scales with the number of files and their sizes.

Speed: Streaming vs. Batched

Adding a file to a WPRESS archive is a streaming write operation: read chunk, compress, write, repeat. There is no buffering, no deferred processing, no central directory to maintain.

Adding a file to a ZIP archive via addFile() is nearly instant because it does nothing. The cost is deferred to close(), which must then:

  1. Open every source file
  2. Read each file completely
  3. Compute CRC32 for each file
  4. Compress each file
  5. Write all compressed data
  6. Build and write the central directory (which duplicates every file’s metadata)

For small sites, the difference is negligible. For sites with thousands of files or large individual files, the batched approach creates a single massive I/O spike during close() that is far more likely to trigger execution time limits, memory limits, or I/O throttling on shared hosting.

Header Overhead: Minimal vs. Redundant

Every file in an archive has metadata overhead. Here is what each format stores:

WPRESS Header (4,377 bytes per file)

| Field | Size | Purpose |
| --- | --- | --- |
| Filename | 255 bytes | File name without path |
| Size | 14 bytes | Content size (decimal ASCII) |
| Modification time | 12 bytes | Unix timestamp |
| Path prefix | 4,088 bytes | Directory path |
| CRC32 | 8 bytes | Hex checksum |

Five fields. Everything needed to reconstruct the file. Nothing else.

Each field is sized to match operating system maximums. The filename field (255 bytes) holds the maximum filename length on Linux, macOS, and Windows. The path prefix field (4,088 bytes) accommodates the maximum path length on Linux (PATH_MAX is 4,096). Any file the OS can store, WPRESS can archive without truncation. The fixed-width design also means headers can be read and written with simple fread and fwrite calls at known offsets, no variable-length parsing required.
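A fixed-width header like this maps naturally onto a single pack and unpack call. The sketch below is Python and makes several assumptions that are not confirmed here: the fields appear in the order of the table above, unused bytes are NUL-padded, and text is UTF-8. The function names are hypothetical.

```python
import struct

# Field widths from the table above: name, size, mtime, path prefix, CRC32.
FIELD_WIDTHS = (255, 14, 12, 4088, 8)
HEADER = struct.Struct("<" + "".join(f"{w}s" for w in FIELD_WIDTHS))
HEADER_SIZE = HEADER.size  # sum of the five widths


def pack_header(name, size, mtime, prefix, crc32):
    """Build one fixed-width header; struct pads short fields with NUL bytes."""
    fields = [
        name.encode("utf-8"),
        str(size).encode("ascii"),             # decimal ASCII size
        str(mtime).encode("ascii"),            # Unix timestamp as text
        prefix.encode("utf-8"),
        format(crc32, "08x").encode("ascii"),  # 8 hex digits
    ]
    for value, width in zip(fields, FIELD_WIDTHS):
        if len(value) > width:
            raise ValueError("field does not fit its fixed width")
    return HEADER.pack(*fields)


def unpack_header(raw):
    """Parse a header back into a dict, stripping the NUL padding."""
    name, size, mtime, prefix, crc = (
        field.rstrip(b"\0").decode("utf-8") for field in HEADER.unpack(raw)
    )
    return {
        "name": name,
        "size": int(size),
        "mtime": int(mtime),
        "prefix": prefix,
        "crc32": int(crc, 16),
    }
```

Since every header has the same length, a writer can return to a known offset later and overwrite the size and CRC32 fields in place without shifting any data.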

ZIP Headers (two per file)

ZIP stores metadata twice for every file: once in a local file header before the data, and again in the central directory at the end of the archive.

Local file header (30 bytes fixed + variable):

| Field | Size |
| --- | --- |
| Signature | 4 bytes |
| Version needed to extract | 2 bytes |
| General purpose bit flag | 2 bytes |
| Compression method | 2 bytes |
| Last mod time (MS-DOS format, 2-second resolution) | 2 bytes |
| Last mod date (MS-DOS format) | 2 bytes |
| CRC-32 | 4 bytes |
| Compressed size | 4 bytes |
| Uncompressed size | 4 bytes |
| Filename length | 2 bytes |
| Extra field length | 2 bytes |
| Filename | variable |
| Extra field | variable |

Central directory entry (46 bytes fixed + variable):

All fields from the local header, plus: version made by, file comment length, disk number start, internal file attributes, external file attributes, local header offset, file comment.

End of central directory (22 bytes minimum).
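The fixed sizes quoted above can be checked by writing the three ZIP32 records as struct layouts, which also makes it easy to estimate metadata overhead for a given file list. This Python sketch ignores extra fields and comments, assumes each path is stored in full in both ZIP records, and uses hypothetical function names.

```python
import struct

# ZIP32 fixed-size records (little-endian), without variable-length parts.
LOCAL_HEADER = struct.Struct("<4s5H3I2H")      # signature .. extra field length
CENTRAL_ENTRY = struct.Struct("<4s6H3I5H2I")   # adds attributes, offset, etc.
END_OF_CENTRAL_DIR = struct.Struct("<4s4H2IH")

WPRESS_HEADER_SIZE = 4377  # fixed header size per file, as stated above


def zip_metadata_bytes(paths):
    """Rough ZIP32 metadata size: the name is stored in both records."""
    total = END_OF_CENTRAL_DIR.size
    for path in paths:
        name_len = len(path.encode("utf-8"))
        total += LOCAL_HEADER.size + CENTRAL_ENTRY.size + 2 * name_len
    return total


def central_directory_bytes(paths):
    """Portion of the metadata that is written at the end of the archive."""
    return sum(CENTRAL_ENTRY.size + len(p.encode("utf-8")) for p in paths)


def wpress_metadata_bytes(paths):
    """WPRESS metadata size: one fixed-width header per file."""
    return WPRESS_HEADER_SIZE * len(paths)
```

The comparison makes the trade-off concrete: ZIP spends fewer bytes per file, while WPRESS spends a fixed amount per file and needs no end-of-archive directory.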

For a WordPress migration, fields like “version made by,” “disk number start,” “internal file attributes,” and “general purpose bit flag” serve no purpose. The MS-DOS timestamp format has only 2-second resolution and no timezone, so most implementations add extra fields with Unix timestamps (36+ additional bytes per entry).

ZIP headers are smaller per file in absolute bytes, but that compactness is the source of ZIP’s limitations. A 4-byte size field caps files at 4 GB. A 2-byte entry count caps archives at 65,535 files. WPRESS uses larger fields because they are sized to OS maximums, not arbitrary format constraints.

ZIP also has no equivalent of “write once.” Its central directory duplicates every file’s metadata and must be written at the end of the archive. For an archive with 15,000 files, the central directory alone can exceed 3 MB, and it must be built in memory during close().

Encryption

WPRESS supports per-chunk AES encryption via PHP’s OpenSSL extension. Each 512 KB chunk is encrypted independently, which means:

ZIP’s traditional encryption (ZipCrypto) is cryptographically broken: a known-plaintext attack can recover the keys in minutes. AES-256 encryption for ZIP requires libzip 1.2.0+, which many shared hosts do not provide. PHP did not add ZipArchive::setEncryptionIndex() until PHP 7.2.

PHP ZipArchive: A Track Record of Bugs

PHP’s ZipArchive extension has accumulated a significant list of bugs that directly affect WordPress backup and migration plugins:

| Bug | Description |
| --- | --- |
| #40494 | Memory exhaustion when adding many files |
| #44070 | “Zip is unusable for large numbers of files or large data volume” |
| #55383 | Cannot open archives near 2 GB |
| #62539 | Archives >= 4 GB silently corrupted |
| #66786 | Archive fails if source file is modified before close() |
| #76524 | Memory leak with OVERWRITE flag |
| #81490 | Memory leak during extractTo() |

WordPress core itself has documented issues: Trac ticket #60398 reports that ZIP archives created on one platform produce “Incompatible Archive” errors on another, traced to libzip version differences.

When ZipArchive is unavailable (which happens on many shared hosts), WordPress falls back to PclZip, a pure-PHP ZIP implementation bundled with WordPress core. PclZip has its own limitations: 4 GB maximum archive size, ~65,000 maximum entries, and significantly slower performance than the native extension.

Why Competitors Are Abandoning ZIP

We are not the only ones who recognized ZIP’s limitations for WordPress.

Duplicator, the other major WordPress migration plugin, ships with a three-tier archive engine: shell zip command (when available), ZipArchive (when the extension is loaded), and their own custom format called DupArchive. DupArchive was built specifically to work around ZIP’s inability to handle chunked processing and execution time limits.

The fact that our primary competitor independently arrived at the same conclusion (ZIP does not work, build a custom format) validates the architectural decision we made in 2013.

The Summary

| Feature | WPRESS | ZIP (via PHP ZipArchive) |
| --- | --- | --- |
| Compression modes | None, GZip, BZip2 (all universally available) | DEFLATE, STORE universally; BZIP2/LZMA/ZSTD require specific libzip |
| Max file size | ~91 TB per entry | 4 GB (ZIP32) or OS-limited (ZIP64, but historically buggy) |
| Max archive size | OS filesystem limit | 4 GB (ZIP32) or OS-limited (ZIP64) |
| Max entries | Unlimited | 65,535 (ZIP32) or ~4 billion (ZIP64) |
| Resumable | Yes, across unlimited HTTP requests | No |
| Chunked processing | 512 KB chunks | All-or-nothing via close() |
| Memory usage | Constant ~512 KB | Scales with file count and size |
| CRC32 on large files | Incremental with mathematical combining | Single-pass, fails if execution times out |
| Archive-wide CRC | Yes (EOF block) | No |
| Encryption | AES per chunk (OpenSSL) | ZipCrypto (broken) or AES (requires libzip 1.2+) |
| Metadata per file | 5 fields (name, size, mtime, path prefix, crc32) | 15+ fields, stored twice (local + central directory) |
| File descriptors | 1 at a time | 1 per file, all held until close() |
| Central directory | None (sequential format) | Required, built in memory |

When ZIP Is Fine

ZIP is a perfectly good format for many use cases: distributing software, sharing documents, archiving projects on your local machine. If you are working with files under 4 GB, have generous execution time and memory limits, and do not need to resume interrupted operations, ZIP works well.

It was not designed for the constraints of WordPress migration on shared hosting. WPRESS was.

FAQ

Can I convert a .wpress file to .zip?

Yes. A .wpress file is a plain sequential archive. Once you have extracted its contents, you can repackage them as a ZIP using any archiving tool. However, you would lose the resumability and integrity verification features that make WPRESS useful for migration.

Does WPRESS work on 32-bit PHP?

WPRESS works on 32-bit PHP for files under 2 GB. The 2 GB limit comes from PHP’s fseek() function, which cannot handle offsets of 2^31 bytes or more on 32-bit systems. On 64-bit PHP (standard on any server installed in the last decade), there is no practical limit.

Is WPRESS an open format?

The WPRESS format is fully documented in the All-in-One WP Migration source code. The header structure, compression modes, CRC implementation, and EOF block format are all readable in the plugin’s archiver classes.

Why not use TAR instead of building WPRESS?

TAR is a sequential format like WPRESS, but it lacks built-in compression (requiring external gzip/bzip2 wrapping), has no per-file CRC32 verification, and its header format carries Unix-specific fields (owner, group, device numbers) that are meaningless in a WordPress migration context. WPRESS was designed with exactly the metadata WordPress needs and nothing more.

Why is the WPRESS header 4,377 bytes when ZIP headers are smaller?

Each WPRESS header field is sized to hold whatever the operating system supports. The filename field is 255 bytes because that is the maximum filename length on Linux, macOS, and Windows (NAME_MAX). The path prefix field is 4,088 bytes because that accommodates the maximum path length on Linux (PATH_MAX is 4,096) and Windows extended paths. Every file the OS can store, WPRESS can archive without truncation.

ZIP headers are smaller precisely because the fields are smaller, which means limits. ZIP32 uses 4 bytes for file size (maximum 4 GB), 4 bytes for compressed size (maximum 4 GB), 2 bytes for the filename length field, and 2 bytes for the entry count (maximum 65,535 files). The compact headers are not an advantage. They are the reason ZIP has size and entry limits.

WPRESS allocates enough space to handle anything the operating system supports. ZIP saves a few bytes per header and hits walls because of it.