Display PDFs in servoshell using pdf.js viewer #43833

Open
Messi002 wants to merge 3 commits into servo:main from Messi002:issue-38812

Messi002 (Contributor) commented Apr 1, 2026

This adds PDF viewing to servoshell using pdf.js.

When Servo sees that a file is a PDF, it opens the bundled pdf.js viewer and passes the PDF URL to it.

What changed:

  • I added the pdf.js v5.5.207 files under resources/resource_protocol/pdfjs/
  • When the parser sees application/pdf, it redirects to the pdf.js viewer page
  • I made the resource:// protocol work with fetch() so the viewer's scripts can load
  • I also fixed path handling so resource://pdfjs/... URLs resolve to the right files
  • I added support for pages served from resource:// to fetch files from other origins (needed for the viewer to load the actual PDF)
  • I didn't touch the referrer for resource:// pages, so that module scripts don't get blocked

Fixes: #38812
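
For illustration, the redirect step described above could be sketched as follows. This is a minimal sketch, not the code in this PR: the helper names and the `VIEWER_PAGE` location are hypothetical, and it assumes the bundled viewer reads the document URL from a `file` query parameter, as the stock pdf.js viewer does.

```rust
/// Hypothetical location of the bundled viewer page (an assumption, not taken from the PR).
const VIEWER_PAGE: &str = "resource://pdfjs/web/viewer.html";

/// Returns true when a Content-Type header value denotes a PDF,
/// ignoring parameters such as `; charset=...` and ASCII case.
fn is_pdf_content_type(content_type: &str) -> bool {
    let essence = content_type.split(';').next().unwrap_or("").trim();
    essence.eq_ignore_ascii_case("application/pdf")
}

/// Percent-encodes every byte except RFC 3986 unreserved characters,
/// so the PDF URL can be embedded safely as a query parameter value.
fn percent_encode_component(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds the viewer URL that carries the original PDF URL in `file`.
fn viewer_url_for(pdf_url: &str) -> String {
    format!("{}?file={}", VIEWER_PAGE, percent_encode_component(pdf_url))
}
```

With these helpers, `viewer_url_for("https://example.com/a b.pdf")` yields `resource://pdfjs/web/viewer.html?file=https%3A%2F%2Fexample.com%2Fa%20b.pdf`.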

servo-highfive added the S-awaiting-review label ("There is new code that needs to be reviewed.") on Apr 1, 2026

webbeef (Contributor) left a comment

If you made changes to pdf.js itself, can you create a fork with a "servo" branch? It's hard to review otherwise.

"<html><head>\
<meta http-equiv=\"refresh\" content=\"0;url={viewer_url}\">\
</head><body></body></html>",
);

Contributor

That will cause the URL of the pdf document to be replaced, which is not what we want.

Contributor Author

Hmm... ok!
I will check how Firefox handles that, but I read that Firefox loads the HTML viewer directly.

context.protocols.is_fetchable(url.scheme())
},
Referrer::NoReferrer => false,
};

Contributor

that looks dangerous in general...

Contributor Author

Ok! I suppose I have to find another way to give the viewer access to the PDF without opening it up to all the fetchable protocols.

Messi002 (Contributor, Author) commented Apr 2, 2026

I looked again at how Firefox handles the PDF logic. It changes the content type to text/html and then serves the HTML viewer. It then passes the PDF data to the viewer through postMessage instead of refetching it.

So I think we can do the same when we detect application/pdf, and pass the PDF data to the viewer via JavaScript.

Link: https://gitlab.iode.tech/os/apps/iodebrowser/-/blob/63b9a260a88a880f080fd875f3fe55dd23089625/browser/extensions/pdfjs/content/PdfStreamConverter.jsm

See line 1061 and line 950.

@webbeef what do you think?
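
To make the proposal concrete, here is a minimal sketch of the content-type swap, under stated assumptions: `SimpleResponse` and `Rewritten` are hypothetical stand-ins rather than Servo types, and the viewer HTML is passed in as a plain string. The PDF bytes are kept alongside the rewritten document so they can be handed to the viewer later without a second fetch.

```rust
/// Hypothetical, simplified stand-in for a fetched response (not a Servo type).
#[derive(Debug, Clone, PartialEq)]
struct SimpleResponse {
    content_type: String,
    body: Vec<u8>,
}

/// What the parser would be handed after the rewrite step.
#[derive(Debug, PartialEq)]
enum Rewritten {
    /// Not a PDF: the response passes through unchanged.
    Unchanged(SimpleResponse),
    /// PDF: the viewer is served as text/html and the original bytes are
    /// retained so they can be delivered to the viewer without refetching.
    Viewer {
        document: SimpleResponse,
        pdf_bytes: Vec<u8>,
    },
}

/// Swaps a PDF response for the viewer document, keeping the PDF bytes.
fn rewrite_pdf_response(response: SimpleResponse, viewer_html: &str) -> Rewritten {
    // Compare only the MIME essence, ignoring parameters and ASCII case.
    let essence = response
        .content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    if essence != "application/pdf" {
        return Rewritten::Unchanged(response);
    }
    Rewritten::Viewer {
        document: SimpleResponse {
            content_type: "text/html".to_string(),
            body: viewer_html.as_bytes().to_vec(),
        },
        pdf_bytes: response.body,
    }
}
```

Because the document is rewritten in place rather than redirected, the URL of the PDF document stays the same, which addresses the concern raised about the meta refresh approach.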

webbeef (Contributor) commented Apr 2, 2026

The principle looks fine, but I'm not sure how you will "pass the pdf data to the viewer via javascript".

Messi002 (Contributor, Author) commented Apr 3, 2026

Here is how I would pass it via JavaScript:

Firefox doesn't call PDFViewerApplication.open() directly; it uses postMessage instead. The PdfStreamConverter collects the PDF bytes as they come in and, once the viewer page has loaded, sends them via domWindow.postMessage({pdfjsLoadAction: 'complete', data: pdfBytes}). The viewer has a listener that picks up the message and loads the PDF from the raw bytes.

For Servo, since we already have the PDF response bytes in the parser, I'm thinking we could load the viewer HTML as the document content and then inject a small script that sends the same postMessage with the PDF data. That way we're using the same mechanism pdf.js already expects, and we don't need to worry about calling open() before the viewer is ready.

@webbeef
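
As a rough sketch of the injected script, the parser side could generate something like the following. The function name is hypothetical. It assumes the bundled viewer build registers a listener for `pdfjsLoadAction` messages (the Firefox integration does, but this has not been checked for the viewer in this PR), that the window `load` event fires after that listener is registered, and that posting with a `"*"` target origin is acceptable for a resource:// page. Inlining the bytes as a JavaScript array literal is only for illustration and would be costly for large PDFs.

```rust
/// Builds a <script> element that posts the PDF bytes to the viewer window.
/// The message shape mirrors the one quoted above:
/// { pdfjsLoadAction: "complete", data: pdfBytes }.
fn build_injected_script(pdf_bytes: &[u8]) -> String {
    // Render the bytes as a comma-separated list, e.g. "37,80,68,70".
    let literal = pdf_bytes
        .iter()
        .map(|byte| byte.to_string())
        .collect::<Vec<_>>()
        .join(",");

    // Doubled braces are literal braces in the generated JavaScript;
    // the single `{}` is replaced with the byte list.
    format!(
        r#"<script>
window.addEventListener("load", function () {{
  var pdfBytes = new Uint8Array([{}]);
  window.postMessage({{ pdfjsLoadAction: "complete", data: pdfBytes }}, "*");
}});
</script>"#,
        literal
    )
}
```

The returned string would be appended to the viewer HTML before it is handed to the parser. A real implementation would likely want a more compact transfer of the bytes than an inline array literal.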

Labels

S-awaiting-review There is new code that needs to be reviewed.

Development

Successfully merging this pull request may close these issues.

Display pdfs on servoshell with legacy pdf.js viewer

3 participants