Parsing file with lots of dictionaries is extremely slow #775

I experienced this with a PDF that is a 9-page export of CAD drawings containing lots of small lines and numbers (created using "PDFTron PDFNet, V6.40292"). Unfortunately I can't share the document.

Tested with latest master (b89d7b1ab108d54a983c9653d47df07bbad5336e) and the test script below.

<details>

```golang
package main

import (
	"log"
	"os"

	"github.com/pdfcpu/pdfcpu/pkg/pdfcpu"
	"github.com/pdfcpu/pdfcpu/pkg/pdfcpu/model"
)

func main() {
	log.SetFlags(log.Flags() | log.Lmicroseconds)
	fp, err := os.Open("slow-parse.pdf")
	if err != nil {
		log.Fatal(err)
	}
	defer fp.Close()

	conf := model.NewDefaultConfiguration()
	log.Printf("Parsing ...")
	pdf, err := pdfcpu.Read(fp, conf)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Done")

	if err := pdf.EnsurePageCount(); err != nil {
		log.Fatal(err)
	}

	log.Printf("Parsed %d pages", pdf.PageCount)
}
```

</details>

Parsing never finishes (I killed the process after 5 minutes at 100% CPU). I have a patch ready that brings the parse time for this file down to below 4 seconds.
