I am filing this not as an expectation but as something I might implement a PR for if it becomes obvious that I really need it. Right now I only have four (4) PDFs out of over 7000 with this issue.
Summary
gxpdf fails to parse PDFs that use Standard Security Handler encryption, even when the user password is empty (no password required to view). These files open normally in Preview, Adobe Reader, and other PDF viewers.
Note: ROADMAP.md shows "Encryption (RC4, AES) | Done | v0.1.0" — this appears to be for creating encrypted PDFs. This request is for the inverse: decrypting encrypted PDFs when reading them.
Error
failed to decode ObjStm 2802: FlateDecode failed: failed to create zlib reader: zlib: invalid header
The "invalid zlib header" occurs because gxpdf attempts to decompress encrypted stream data without first decrypting it.
Technical Details
Affected PDFs have an /Encrypt dictionary like:
/Filter/Standard
/V 4
/R 4
/CF<</StdCF<</AuthEvent/DocOpen/CFM/AESV2/Length 16>>>>
/StmF/StdCF
/StrF/StdCF
/U(...) % empty user password (mostly null bytes)
/O(...) % owner password hash
/P -1340 % permissions flags
Key characteristics:
- /V 4 /R 4: encryption version/revision 4
- /CFM/AESV2: AES-128 encryption for streams
- Empty user password: allows viewing without password prompt
- Owner password: restricts editing/printing
Use Case
Institutions like banks, insurance companies, and financial services firms distribute "permissions-only" encrypted PDFs. These files:
- Open without password in most PDF viewers
- Restrict printing, copying, or editing
- Are common in personal document archives
Proposed Solution
Implement PDF Standard Security Handler for /V 4 /R 4 encryption:
- Detection: Check trailer for /Encrypt reference
- Key derivation: Compute decryption key using:
  - Empty string as user password
  - /O and /P values from the /Encrypt dictionary, /ID from the trailer
  - MD5 for /R 2 through /R 4 (SHA-256 only applies to /R 5 and later)
- Stream decryption: Decrypt streams with AES-128-CBC before decompression
- String decryption: Decrypt string objects as needed
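The key derivation step could look roughly like the sketch below, which follows my reading of Algorithm 2 in ISO 32000-1 (section 7.6.3.3) for /R 3 and /R 4. The function name `fileKey` and its parameters are hypothetical, and the /O and /ID values in `main` are placeholders rather than data from a real file. Validating the result against /U is left out.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
)

// passwordPad is the fixed 32-byte padding string from the PDF specification.
var passwordPad = []byte{
	0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
	0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
	0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
	0x2F, 0x0C, 0xA9, 0xFE, 0x64, 0x53, 0x69, 0x7A,
}

// fileKey derives the file encryption key for /R 3 and /R 4.
// o is the /O string, p the /P value, id0 the first element of /ID,
// keyLen the key length in bytes (16 for AESV2).
func fileKey(password, o []byte, p int32, id0 []byte, keyLen int, encryptMetadata bool) []byte {
	// Pad or truncate the password to exactly 32 bytes.
	padded := append(append([]byte{}, password...), passwordPad...)[:32]

	h := md5.New()
	h.Write(padded)
	h.Write(o)
	var pBytes [4]byte
	binary.LittleEndian.PutUint32(pBytes[:], uint32(p)) // /P as unsigned, low byte first
	h.Write(pBytes[:])
	h.Write(id0)
	if !encryptMetadata {
		h.Write([]byte{0xFF, 0xFF, 0xFF, 0xFF}) // only when /EncryptMetadata is false
	}
	key := h.Sum(nil)

	// Re-hash the first keyLen bytes 50 times.
	for i := 0; i < 50; i++ {
		sum := md5.Sum(key[:keyLen])
		key = sum[:]
	}
	return key[:keyLen]
}

func main() {
	o := make([]byte, 32)             // placeholder for the 32-byte /O string
	id0 := []byte("placeholder-id-0") // placeholder for the first /ID element
	key := fileKey(nil, o, -1340, id0, 16, true)
	fmt.Printf("derived %d-byte key: %x\n", len(key), key)
}
```

Passing `nil` as the password corresponds to the empty user password case described above. Accepting a user-supplied password would only change that first argument.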
Optional enhancements:
- Support /V 2 /R 3 (RC4-128) for older PDFs
- Accept user-provided password for password-protected PDFs
- Expose permission flags via API
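For the permission flags, one possible API shape is sketched below. The `Permissions` type and its constants are hypothetical names, not an existing gxpdf API. The bit positions follow the user access permissions table in ISO 32000-1, where bit 1 is the least significant bit, and the example decodes the /P -1340 value shown earlier.

```go
package main

import "fmt"

// Permissions holds the /P value reinterpreted as an unsigned 32-bit bit field.
type Permissions uint32

const (
	PermPrint         Permissions = 1 << 2  // bit 3: print the document
	PermModify        Permissions = 1 << 3  // bit 4: modify contents
	PermCopy          Permissions = 1 << 4  // bit 5: copy or extract text and graphics
	PermAnnotate      Permissions = 1 << 5  // bit 6: add or modify annotations
	PermFillForms     Permissions = 1 << 8  // bit 9: fill in form fields
	PermAccessibility Permissions = 1 << 9  // bit 10: extract for accessibility
	PermAssemble      Permissions = 1 << 10 // bit 11: assemble the document
	PermPrintHighRes  Permissions = 1 << 11 // bit 12: print at full quality
)

// FromP converts the signed /P integer into a Permissions bit field.
func FromP(p int32) Permissions {
	return Permissions(uint32(p))
}

// Allows reports whether the given flag is set.
func (p Permissions) Allows(flag Permissions) bool {
	return p&flag != 0
}

func main() {
	perms := FromP(-1340)
	fmt.Println("print:", perms.Allows(PermPrint))
	fmt.Println("modify:", perms.Allows(PermModify))
	fmt.Println("copy:", perms.Allows(PermCopy))
}
```

For -1340 this reports printing as allowed and modifying and copying as denied. Whether a reading library should enforce these flags or only report them is a separate design question.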
Impact
This would enable gxpdf to handle a class of real-world PDFs that currently fail with an opaque "zlib: invalid header" error.