Description
Please ensure the following:
- Your issue is based on the latest commit
✅ Tested with a build based on the current master
- State your OS and OS version
✅ Debian 12
- When reporting a problem with a specific PDF input file please avoid stating the organization responsible for the PDFWriter - just refer to the PDFWriter
The PDFWriter in question generates PDF files that contain object streams, which pdfcpu reports as type ObjectStreamDict.
For example (excerpt from the output of pdfcpu validate -vv 3.pdf):
object stream count:100 size of objectarray:100
585: offset= 237340 generation=0 types.ObjectStreamDict
<<
<Filter, FlateDecode>
<First, 999>
<Length, 9411>
<N, 100>
<Type, ObjStm>
>>
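To check which of the input files contain such entries, the verbose validate output can be filtered for the type name. The following is only a sketch: `count_objstm` is a hypothetical helper name, and it simply counts lines containing `types.ObjectStreamDict`, so it is a rough indicator rather than an exact object stream count.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that mention types.ObjectStreamDict.
# It assumes the verbose validate output prints one such line per entry,
# as in the excerpt above; that is an assumption, not documented behavior.
count_objstm() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'types\.ObjectStreamDict'
}
```

Possible usage, with stderr merged in because it is unclear which stream carries the verbose log: `pdfcpu validate -vv 3.pdf 2>&1 | count_objstm`.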
When this PDF is merged with other PDFs, the resulting PDF sometimes cannot be optimized. Any operation that performs optimization (such as pdfcpu bookmark import or, of course, pdfcpu optimize) fails with an error:
Fatal: writeIndirectObject: undefined PDF object #988 types.ObjectStreamDict
The object number varies slightly between runs, even for the same files, when the combination of merge and optimize is run multiple times. Unfortunately, the issue is hard to reproduce: it only occurs with certain files merged in a specific order.
user@pc:/tmp/test$ pdfcpu merge out.pdf 1.pdf 2.pdf 3.pdf 4.pdf && pdfcpu optimize out.pdf
writing out.pdf...
1.pdf
2.pdf
3.pdf
4.pdf
optimizing...
writing out.pdf...
optimizing...
writeIndirectObject: undefined PDF object #988 types.ObjectStreamDict
user@pc:/tmp/test$ pdfcpu merge out.pdf 1.pdf 2.pdf 3.pdf 4.pdf && pdfcpu optimize out.pdf
writing out.pdf...
1.pdf
2.pdf
3.pdf
4.pdf
optimizing...
writing out.pdf...
optimizing...
writeIndirectObject: undefined PDF object #989 types.ObjectStreamDict
user@pc:/tmp/test$ pdfcpu merge out.pdf 1.pdf 2.pdf 3.pdf 4.pdf && pdfcpu optimize out.pdf
writing out.pdf...
1.pdf
2.pdf
3.pdf
4.pdf
optimizing...
writing out.pdf...
optimizing...
writeIndirectObject: undefined PDF object #988 types.ObjectStreamDict
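Since the failing object number changes between runs, it may help to repeat the merge and optimize steps in a loop and tally which numbers show up. This is a minimal sketch using only the two commands shown above; `extract_obj_num` and `repro_loop` are hypothetical helper names, and the parsing assumes the error line keeps the wording seen in this report.

```shell
#!/bin/sh
# Pull the object number out of an error line such as
# "writeIndirectObject: undefined PDF object #988 types.ObjectStreamDict".
# Prints nothing if the text contains no such line.
extract_obj_num() {
    printf '%s\n' "$1" | sed -n 's/.*undefined PDF object #\([0-9][0-9]*\).*/\1/p'
}

# Run "merge then optimize" N times and tally the failing object numbers.
# Usage: repro_loop N in1.pdf in2.pdf ...
repro_loop() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout and stderr of both steps; optimize only runs if merge succeeds.
        out=$( { pdfcpu merge out.pdf "$@" && pdfcpu optimize out.pdf; } 2>&1 )
        extract_obj_num "$out"
        i=$((i + 1))
    done | sort -n | uniq -c
}
```

For the files in this report the call would be `repro_loop 10 1.pdf 2.pdf 3.pdf 4.pdf`. Runs that succeed contribute no line, so the counts add up to the number of failed runs.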
Stack trace:
[...]/FontFile3 617 0 R/FontName/ArialMT/ItalicAngle 0/MaxWidth 2000/StemV 89/Type/FontDescriptor>><</BaseFont/Arial/DescendantFonts[988 1 R]/Encoding/Identity-H/Subtype/Type0/ToUnicode 989 1 R/Type/Font>>>
WRITE: 2024/06/24 14:11:25 writeObject end, obj#725 written to objectStream #996
WRITE: 2024/06/24 14:11:25 addToObjectStream end, obj#:725 gen#:0
WRITE: 2024/06/24 14:11:25 writeDeepObject: begin offset=585777
Arial
WRITE: 2024/06/24 14:11:25 writeDirectObject: end, direct obj - nothing written: offset=585777
Arial
WRITE: 2024/06/24 14:11:25 writeDeepObject: begin offset=585777
[(988 1 R)]
WRITE: 2024/06/24 14:11:25 writeDeepObject: begin offset=585777
(988 1 R)
TRACE: 2024/06/24 14:11:25 FindTableEntry: obj#:988 gen:1
WRITE: 2024/06/24 14:11:25 writeIndirectObject: object #988 gets writeoffset: 585777
Fatal: writeIndirectObject: undefined PDF object #988 types.ObjectStreamDict
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeObjectGeneric
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:689
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeIndirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:728
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:745
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:552
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:742
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepDict
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:600
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeObjectGeneric
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:659
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeIndirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:728
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:745
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:538
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:742
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:538
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:742
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepDict
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:600
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeObjectGeneric
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:659
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeIndirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:728
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:745
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepArray
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:638
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeObjectGeneric
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:665
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeIndirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:728
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:745
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepDict
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:600
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeObjectGeneric
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:659
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeIndirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:728
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:745
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:552
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:742
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepDict
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:600
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeObjectGeneric
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:659
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeIndirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:728
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDeepObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:745
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.writeDirectObject
/root/pdfcpu/pkg/pdfcpu/writeObjects.go:552
The object causing the problem in this case appears to be the embedded Arial font. Unfortunately, I cannot provide an example file that triggers the issue, because I don't know how to artificially produce a PDF whose embedded font is stored as an ObjectStreamDict.
If I run optimize on the individual files before the merge, the issue does not occur:
user@pc:/tmp/test$ pdfcpu optimize 1.pdf
writing 1.pdf...
optimizing...
user@pc:/tmp/test$ pdfcpu optimize 2.pdf
writing 2.pdf...
optimizing...
user@pc:/tmp/test$ pdfcpu optimize 3.pdf
writing 3.pdf...
optimizing...
user@pc:/tmp/test$ pdfcpu optimize 4.pdf
writing 4.pdf...
optimizing...
user@pc:/tmp/test$ pdfcpu merge out.pdf 1.pdf 2.pdf 3.pdf 4.pdf
writing out.pdf...
1.pdf
2.pdf
3.pdf
4.pdf
optimizing...
user@pc:/tmp/test$ pdfcpu optimize out.pdf
writing out.pdf...
optimizing...
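The sequence above (optimize every input, merge, then optimize the result) can be wrapped in a small script. This is only a sketch of the workaround as observed here, not a fix; `optimize_then_merge` is a hypothetical helper name. Note that `pdfcpu optimize FILE`, as used in this report, rewrites the input file in place, so it may be safer to run this on copies.

```shell
#!/bin/sh
# Hypothetical helper: optimize each input, merge them, then optimize the result.
# Usage: optimize_then_merge out.pdf in1.pdf in2.pdf ...
optimize_then_merge() {
    out=$1
    shift
    # Optimize every input first; stop at the first failure.
    for f in "$@"; do
        pdfcpu optimize "$f" || return 1
    done
    # Merge in the given order, then optimize the merged file.
    pdfcpu merge "$out" "$@" || return 1
    pdfcpu optimize "$out"
}
```

For the files in this report the call would be `optimize_then_merge out.pdf 1.pdf 2.pdf 3.pdf 4.pdf`.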