Back in 2013, when we started building All-in-One WP Migration, we had a choice. Use ZIP, the format everyone knows, or build something purpose-built for the problem we were solving.
We chose to build WPRESS.
Thirteen years and 60 million sites later, that decision turned out to be one of the most important architectural choices we made. This article explains why.
The Problem With ZIP for WordPress
ZIP is a general-purpose archive format created in 1989 by Phil Katz. It was designed for an era of floppy disks, desktop applications, and files measured in kilobytes. It is good at what it was designed for.
WordPress migration on shared hosting is not what it was designed for.
A WordPress migration plugin needs to:
- Package an entire website (database, plugins, themes, uploads, configuration) into a single portable file
- Run inside PHP on servers with strict execution time limits (often 30 seconds)
- Handle sites of any size, from a 50MB blog to a 20GB WooCommerce store
- Resume after interruptions, because shared hosting will kill long-running processes
- Verify data integrity so nothing gets corrupted in transit
- Use minimal memory, because shared hosting allocates as little as 128MB to PHP
ZIP fails at most of these. Here is where, and why.
Compression: Three Modes vs. One
WPRESS supports three compression modes:
| Mode | Algorithm | Use Case |
|---|---|---|
| None | Raw storage | Maximum speed, no CPU overhead |
| GZip | zlib (level 9) | Best balance of speed and compression |
| BZip2 | bzip2 (level 9) | Maximum compression for text-heavy sites |
Both GZip and BZip2 run at maximum compression level (9), producing the smallest possible archives.
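The two compressed modes correspond to standard zlib and bzip2 streams at level 9. A minimal Python sketch of the three modes (the plugin itself is written in PHP; this only illustrates the codecs and levels, nothing plugin-specific):

```python
import bz2
import zlib

chunk = b"<p>WordPress post content</p>\n" * 1000  # ~30 KB of repetitive HTML

raw = chunk                                # Mode: None (raw storage, no CPU spent)
gz = zlib.compress(chunk, level=9)         # Mode: GZip (zlib at maximum level 9)
bz = bz2.compress(chunk, compresslevel=9)  # Mode: BZip2 (bzip2 at maximum level 9)

# Both compressed modes round-trip losslessly
assert zlib.decompress(gz) == chunk
assert bz2.decompress(bz) == chunk
```

For repetitive text like HTML and SQL dumps, both compressed modes shrink the data dramatically; the trade-off between them is CPU time versus final size.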
ZIP technically supports multiple compression methods (DEFLATE, BZIP2, LZMA, Zstandard, XZ). In practice, PHP’s ZipArchive class only guarantees two: DEFLATE and STORE (no compression). BZIP2, LZMA, and Zstandard all require specific versions of libzip that most shared hosting environments do not provide.
ZipArchive does allow setting compression level via setCompressionIndex() with a compflags parameter (0 through 9). But this is per-entry, not per-archive, and the default is level 6. Most backup plugins using ZIP never change it.
WPRESS advantage: Three universally available compression modes at maximum compression. ZIP gives you DEFLATE at level 6, and everything else is a maybe.
File Size: No Limits vs. Multiple Limits
This is where the formats diverge most dramatically.
ZIP32 Limits
The original ZIP format (ZIP32) has hard limits baked into its specification:
- Maximum file size: 4 GB per entry
- Maximum archive size: 4 GB total
- Maximum number of entries: 65,535 files
A WordPress site with a 5GB media library? Cannot be archived in ZIP32. A WooCommerce store with 100,000 product images? Exceeds the entry limit.
ZIP64 Limits
ZIP64 extends these limits to theoretical maximums of 16 exabytes. But theory and practice diverge:
- PHP Bug #62539: Archives exceeding 4GB were silently corrupted
- PHP Bug #55383: Archives around 2GB could not be reopened
- PHP Bug #44974: Large archives failed to open entirely
These bugs were tied to specific PHP and libzip versions, and they have been fixed in recent releases. But the fact that they existed at all means any site that relied on ZIP64 under those PHP versions could have produced corrupt backups without knowing it.
WPRESS Limits
WPRESS has effectively no size limits:
- Maximum file size per entry: ~91 TB (14-character decimal ASCII field)
- Maximum archive size: Limited only by the operating system’s filesystem
- Maximum number of entries: Unlimited (sequential format, no index structure to overflow)
On a 64-bit PHP installation (which is standard on any modern server), the practical limit is whatever your disk can hold.
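The 14-character size field is why the per-entry ceiling is so high: fourteen decimal digits cover just under 10^14 bytes. A Python sketch of fixed-width decimal ASCII encoding (the function names and the exact padding byte are assumptions for illustration, not taken from the format):

```python
SIZE_FIELD_WIDTH = 14  # decimal ASCII characters, per the WPRESS header layout

def pack_size(n: int) -> bytes:
    """Encode a file size as fixed-width decimal ASCII (padding byte assumed)."""
    digits = str(n).encode("ascii")
    if len(digits) > SIZE_FIELD_WIDTH:
        raise ValueError("file larger than the format can represent")
    return digits.ljust(SIZE_FIELD_WIDTH, b"\x00")

def unpack_size(field: bytes) -> int:
    return int(field.rstrip(b"\x00"))

# Largest representable value: fourteen nines, just under 10**14 bytes
MAX_SIZE = 10 ** SIZE_FIELD_WIDTH - 1
assert unpack_size(pack_size(MAX_SIZE)) == MAX_SIZE
```

Decimal ASCII also sidesteps endianness and 32-bit integer overflow entirely: the field is just text, parsed into whatever integer width the runtime supports.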
Resumability: The Feature ZIP Cannot Have
This is the single most important difference between the two formats, and it comes down to architecture.
How ZIP Works
PHP’s ZipArchive::addFile() does not actually read, compress, or write anything. It stores a reference to the source file. All real work happens when you call ZipArchive::close().
This means:
- You call addFile() for every file on the site
- Every source file stays open (one file descriptor per file) until close() is called
- close() reads every file, computes CRC32, compresses the data, writes the archive, and builds the central directory
- If close() exceeds PHP’s max_execution_time, the process is killed and the archive is lost
- There is no way to resume. You start over.
For a 500MB WordPress site, close() might take 30 seconds. For a 5GB site, it might take 5 minutes. On shared hosting with a 30-second execution limit, anything beyond a small site fails.
There is no streaming API. There is no chunked writing. There is no progress callback. addFile() and close() is all you get.
How WPRESS Works
WPRESS is a sequential streaming format. Files are written one at a time, in 512 KB chunks:
- Write a 4,377-byte header for the current file
- Read 512 KB from the source file
- Compress the chunk (if compression is enabled)
- Write the compressed chunk to the archive
- Check if we are approaching the execution time limit
- If yes: save the current position (byte offset, CRC state, file index) and return
- On the next HTTP request: resume from exactly where we left off
- When the file is complete: seek back to the header and write the final size and CRC32
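The loop above can be sketched as follows. This is an illustrative Python model, not the plugin’s actual PHP code: the function name, state keys, and time budget are all invented for the example, and compression is omitted.

```python
import time
import zlib

CHUNK = 512 * 1024   # 512 KB chunks
TIME_BUDGET = 10.0   # seconds to spend before yielding (illustrative value)

def archive_step(src_path, archive_path, state, deadline=TIME_BUDGET):
    """Process one source file in resumable chunks.

    Returns True when the file is fully written, False when the time budget
    ran out and the caller should persist `state` and retry next request.
    """
    start = time.monotonic()
    with open(src_path, "rb") as src, open(archive_path, "r+b") as out:
        src.seek(state["bytes_read"])        # resume mid-file if needed
        out.seek(state["archive_offset"])    # resume mid-archive if needed
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                # File complete: the caller seeks back to the header
                # and writes the final size and CRC32
                return True
            out.write(chunk)                 # compression omitted in this sketch
            state["bytes_read"] += len(chunk)
            state["archive_offset"] = out.tell()
            state["crc"] = zlib.crc32(chunk, state["crc"])
            if time.monotonic() - start > deadline:
                return False                 # save state; next request resumes here
```

In the plugin, the archive file is created up front and the state survives between HTTP requests; each request calls the step function until it reports completion.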
Every 512 KB chunk is a self-contained operation. The plugin tracks its position across HTTP requests using a state array that records:
- The current position in the archive file
- How many bytes of the current source file have been read
- How many bytes have been written (which differs from bytes read when compression is active)
- The accumulated CRC32 value for the current file
- Which file in the content list we are processing
If the server kills the process after writing 3.2 GB of a 5 GB site, the next request picks up at byte 3,200,000,001 and keeps going. Nothing is lost. Nothing needs to be recomputed.
This is why All-in-One WP Migration can export a 50 GB site on a shared host with a 30-second execution limit. ZIP cannot.
CRC32: Integrity Verification at Any Scale
Both WPRESS and ZIP use CRC32 checksums to verify data integrity. The difference is in how they compute them.
ZIP computes CRC32 during close(), reading the entire uncompressed file in a single pass. For a 2 GB database dump, that means reading 2 GB in one execution cycle. If the server kills the process before completion, no CRC is written.
WPRESS computes CRC32 incrementally. Each 512 KB chunk updates a running hash context using PHP’s hash_update(). When a file spans multiple HTTP requests, the CRC values from each request are mathematically combined using a GF(2) polynomial algorithm (the same approach used by zlib’s crc32_combine function). The final CRC is the same value you would get from hashing the entire file at once, but computed across arbitrarily many requests.
WPRESS also calculates an archive-wide CRC32 stored in the EOF block. This is a checksum of the entire archive content (every header and every byte of file data), computed in a separate resumable pass after all files are written. On import, this archive CRC is recalculated and compared, catching any corruption that occurred during download or transfer.
WPRESS advantage: CRC32 verification works on files of any size, computed across any number of HTTP requests. ZIP needs to read the entire file in one shot.
Memory: Constant vs. Unbounded
WPRESS uses approximately 512 KB of memory regardless of whether you are archiving a 10 MB blog or a 50 GB store. It reads one chunk, processes it, writes it, and moves on. Only one source file is open at a time.
ZIP’s memory behavior depends on which method you use:
- addFile() opens a file descriptor for each file and holds it until close(). The OS file descriptor limit (ulimit -n) is typically 1,024 on Linux and 256 on macOS. A WordPress site with 10,000 files in wp-content/uploads can exhaust this limit before close() is even called.
- addFromString() avoids the file descriptor problem but requires loading the entire file content into PHP memory. A 200 MB database dump would need 200 MB of PHP memory just for that one entry.
- During close(), libzip allocates internal compression buffers. The size depends on the compression method and cannot be controlled from PHP.
PHP Bug #40494 (“Memory problem with ZipArchive::addFile()”) and Bug #44070 (“Zip is unusable for large numbers of files or large data volume”) document these issues in detail.
WPRESS advantage: Constant ~512 KB memory footprint. ZIP memory usage scales with the number of files and their sizes.
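The contrast between the two memory profiles can be sketched with two ways of reading the same file. Both produce identical output; only the peak memory differs (illustrative Python, not plugin code):

```python
import hashlib

CHUNK = 512 * 1024

def digest_whole(path):
    """ZIP's addFromString() pattern: the whole file lands in memory at once."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def digest_streaming(path):
    """WPRESS pattern: at most one 512 KB chunk in memory, however big the file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            h.update(chunk)
    return h.hexdigest()
```

With the streaming version, peak memory stays flat no matter how large the input grows; with the whole-file version, it scales linearly with file size.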
Speed: Streaming vs. Batched
Adding a file to a WPRESS archive is a streaming write operation: read chunk, compress, write, repeat. There is no buffering, no deferred processing, no central directory to maintain.
Adding a file to a ZIP archive via addFile() is nearly instant because it does nothing. The cost is deferred to close(), which must then:
- Open every source file
- Read each file completely
- Compute CRC32 for each file
- Compress each file
- Write all compressed data
- Build and write the central directory (which duplicates every file’s metadata)
For small sites, the difference is negligible. For sites with thousands of files or large individual files, the batched approach creates a single massive I/O spike during close() that is far more likely to trigger execution time limits, memory limits, or I/O throttling on shared hosting.
Header Overhead: Minimal vs. Redundant
Every file in an archive has metadata overhead. Here is what each format stores:
WPRESS Header (4,377 bytes per file)
| Field | Size | Purpose |
|---|---|---|
| Filename | 255 bytes | File name without path |
| Size | 14 bytes | Content size (decimal ASCII) |
| Modification time | 12 bytes | Unix timestamp |
| Path prefix | 4,088 bytes | Directory path |
| CRC32 | 8 bytes | Hex checksum |
Five fields. Everything needed to reconstruct the file. Nothing else.
Each field is sized to match operating system maximums. The filename field (255 bytes) holds the maximum filename length on Linux, macOS, and Windows. The path prefix field (4,088 bytes) accommodates the maximum path length on Linux (PATH_MAX is 4,096). Any file the OS can store, WPRESS can archive without truncation. The fixed-width design also means headers can be read and written with simple fread and fwrite calls at known offsets, no variable-length parsing required.
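Because every field has a fixed width, a header can be parsed with plain slicing at known offsets. A Python sketch using the layout from the table above (field order and the null padding byte are assumptions for illustration):

```python
# Field layout taken from the table above; sizes in bytes
FIELDS = (("filename", 255), ("size", 14), ("mtime", 12), ("path", 4088), ("crc32", 8))
HEADER_SIZE = sum(width for _, width in FIELDS)  # 4,377 bytes

def parse_header(raw: bytes) -> dict:
    """Slice a 4,377-byte header into its five fields at fixed offsets."""
    if len(raw) != HEADER_SIZE:
        raise ValueError("short read: not a full header")
    fields, offset = {}, 0
    for name, width in FIELDS:
        fields[name] = raw[offset:offset + width].rstrip(b"\x00").decode("ascii")
        offset += width
    return fields
```

Writing a header is the mirror image: pad each field to its fixed width and write the 4,377 bytes at a known offset, which is why a plain fread/fwrite pair is all the PHP implementation needs.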
ZIP Headers (two per file)
ZIP stores metadata twice for every file: once in a local file header before the data, and again in the central directory at the end of the archive.
Local file header (30 bytes fixed + variable):
| Field | Size |
|---|---|
| Signature | 4 bytes |
| Version needed to extract | 2 bytes |
| General purpose bit flag | 2 bytes |
| Compression method | 2 bytes |
| Last mod time (MS-DOS format, 2-second resolution) | 2 bytes |
| Last mod date (MS-DOS format) | 2 bytes |
| CRC-32 | 4 bytes |
| Compressed size | 4 bytes |
| Uncompressed size | 4 bytes |
| Filename length | 2 bytes |
| Extra field length | 2 bytes |
| Filename | variable |
| Extra field | variable |
Central directory entry (46 bytes fixed + variable):
All fields from the local header, plus: version made by, file comment length, disk number start, internal file attributes, external file attributes, local header offset, file comment.
End of central directory (22 bytes minimum).
For a WordPress migration, fields like “version made by,” “disk number start,” “internal file attributes,” and “general purpose bit flag” serve no purpose. The MS-DOS timestamp format has only 2-second resolution and no timezone, so most implementations add extra fields with Unix timestamps (36+ additional bytes per entry).
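The 2-second resolution is a direct consequence of how MS-DOS time packs a time of day into 16 bits: 5 bits for the hour, 6 for the minute, and only 5 for the second, stored in 2-second units. A sketch in Python:

```python
def dos_time(hour: int, minute: int, second: int) -> int:
    """Pack a time of day into ZIP's 16-bit MS-DOS time format.

    5 bits hour | 6 bits minute | 5 bits second/2. Odd seconds cannot be
    represented, hence the 2-second resolution.
    """
    return (hour << 11) | (minute << 5) | (second // 2)

# 14:30:07 and 14:30:06 encode identically: the odd second is lost
assert dos_time(14, 30, 7) == dos_time(14, 30, 6)
```

WPRESS avoids the problem by storing a plain Unix timestamp (12 ASCII characters) with 1-second resolution and no timezone ambiguity.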
ZIP headers are smaller per file in absolute bytes, but that compactness is the source of ZIP’s limitations. A 4-byte size field caps files at 4 GB. A 2-byte entry count caps archives at 65,535 files. WPRESS uses larger fields because they are sized to OS maximums, not arbitrary format constraints.
ZIP also has no equivalent of writing each file’s metadata once. Its central directory duplicates every file’s metadata and must be written again at the end of the archive. For an archive with 15,000 files, the central directory alone can exceed 3 MB, and it must be built in memory during close().
Encryption
WPRESS supports per-chunk AES encryption via PHP’s OpenSSL extension. Each 512 KB chunk is encrypted independently, which means:
- Encryption works within the same resumable, chunked architecture
- A corrupted chunk does not destroy the rest of the archive
- The encryption key never touches disk (passed through the state array)
ZIP’s traditional encryption (ZipCrypto) is cryptographically broken and can be cracked in minutes. AES-256 encryption for ZIP requires libzip 1.2.0+, which many shared hosts do not provide. PHP did not add ZipArchive::setEncryptionIndex() until PHP 7.2.
PHP ZipArchive: A Track Record of Bugs
PHP’s ZipArchive extension has accumulated a significant list of bugs that directly affect WordPress backup and migration plugins:
| Bug | Description |
|---|---|
| #40494 | Memory exhaustion when adding many files |
| #44070 | “Zip is unusable for large numbers of files or large data volume” |
| #55383 | Cannot open archives near 2 GB |
| #62539 | Archives >= 4 GB silently corrupted |
| #66786 | Archive fails if source file is modified before close() |
| #76524 | Memory leak with OVERWRITE flag |
| #81490 | Memory leak during extractTo() |
WordPress core itself has documented issues: Trac ticket #60398 reports that ZIP archives created on one platform produce “Incompatible Archive” errors on another, traced to libzip version differences.
When ZipArchive is unavailable (which happens on many shared hosts), WordPress falls back to PclZip, a pure-PHP ZIP implementation bundled with WordPress core. PclZip has its own limitations: 4 GB maximum archive size, ~65,000 maximum entries, and significantly slower performance than the native extension.
Why Competitors Are Abandoning ZIP
We are not the only ones who recognized ZIP’s limitations for WordPress.
Duplicator, the other major WordPress migration plugin, ships with a three-tier archive engine: shell zip command (when available), ZipArchive (when the extension is loaded), and their own custom format called DupArchive. DupArchive was built specifically to work around ZIP’s inability to handle chunked processing and execution time limits.
The fact that our primary competitor independently arrived at the same conclusion (ZIP does not work, build a custom format) validates the architectural decision we made in 2013.
The Summary
| Feature | WPRESS | ZIP (via PHP ZipArchive) |
|---|---|---|
| Compression modes | None, GZip, BZip2 (all universally available) | DEFLATE, STORE universally; BZIP2/LZMA/ZSTD require specific libzip |
| Max file size | ~91 TB per entry | 4 GB (ZIP32) or OS-limited (ZIP64, but historically buggy) |
| Max archive size | OS filesystem limit | 4 GB (ZIP32) or OS-limited (ZIP64) |
| Max entries | Unlimited | 65,535 (ZIP32) or ~4 billion (ZIP64) |
| Resumable | Yes, across unlimited HTTP requests | No |
| Chunked processing | 512 KB chunks | All-or-nothing via close() |
| Memory usage | Constant ~512 KB | Scales with file count and size |
| CRC32 on large files | Incremental with mathematical combining | Single-pass, fails if execution times out |
| Archive-wide CRC | Yes (EOF block) | No |
| Encryption | AES per chunk (OpenSSL) | ZipCrypto (broken) or AES (requires libzip 1.2+) |
| Metadata per file | 5 fields (name, size, mtime, path, crc32) | 15+ fields, stored twice (local + central directory) |
| File descriptors | 1 at a time | 1 per file, all held until close() |
| Central directory | None (sequential format) | Required, built in memory |
When ZIP Is Fine
ZIP is a perfectly good format for many use cases: distributing software, sharing documents, archiving projects on your local machine. If you are working with files under 4 GB, have generous execution time and memory limits, and do not need to resume interrupted operations, ZIP works well.
It was not designed for the constraints of WordPress migration on shared hosting. WPRESS was.
FAQ
Can I convert a .wpress file to .zip?
Yes. A .wpress file is a standard archive. You can extract its contents and repackage them as a ZIP using any archiving tool. However, you would lose the resumability and integrity verification features that make WPRESS useful for migration.
Does WPRESS work on 32-bit PHP?
WPRESS works on 32-bit PHP for files under 2 GB. The 2 GB limit comes from PHP’s fseek() function, which cannot handle offsets of 2^31 bytes or more on 32-bit systems. On 64-bit PHP (standard on any server installed in the last decade), there is no practical limit.
Is WPRESS an open format?
The WPRESS format is fully documented in the All-in-One WP Migration source code. The header structure, compression modes, CRC implementation, and EOF block format are all readable in the plugin’s archiver classes.
Why not use TAR instead of building WPRESS?
TAR is a sequential format like WPRESS, but it lacks built-in compression (requiring external gzip/bzip2 wrapping), has no per-file CRC32 verification, and its header format carries Unix-specific fields (owner, group, device numbers) that are meaningless in a WordPress migration context. WPRESS was designed with exactly the metadata WordPress needs and nothing more.
Why is the WPRESS header 4,377 bytes when ZIP headers are smaller?
Each WPRESS header field is sized to hold whatever the operating system supports. The filename field is 255 bytes because that is the maximum filename length on Linux, macOS, and Windows (NAME_MAX). The path prefix field is 4,088 bytes because that accommodates the maximum path length on Linux (PATH_MAX is 4,096) and far exceeds the classic Windows MAX_PATH of 260 characters. Every file the OS can store, WPRESS can archive without truncation.
ZIP headers are smaller precisely because the fields are smaller, which means limits. ZIP32 uses 4 bytes for file size (maximum 4 GB), 4 bytes for compressed size (maximum 4 GB), 2 bytes for the filename length field, and 2 bytes for the entry count (maximum 65,535 files). The compact headers are not an advantage. They are the reason ZIP has size and entry limits.
WPRESS allocates enough space to handle anything the operating system supports. ZIP saves a few bytes per header and hits walls because of it.
