<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>python programming on File Format Blog</title>
    <link>https://blog.fileformat.com/tag/python-programming/</link>
    <description>Recent content in python programming on File Format Blog</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Wed, 29 Jan 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.fileformat.com/tag/python-programming/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Working with PDF files in Python</title>
      <link>https://blog.fileformat.com/programming/working-with-pdf-files-in-python/</link>
      <pubDate>Wed, 29 Jan 2025 00:00:00 +0000</pubDate>
      
      <guid>https://blog.fileformat.com/programming/working-with-pdf-files-in-python/</guid>
      <description>Learn how to extract text from a PDF in Python, rotate PDF pages, merge multiple PDFs, split PDFs, and add watermarks to your PDFs using Python libraries and simple code examples.</description>
      <content:encoded><![CDATA[<p><strong>Last Updated</strong>: 29 Jan, 2025</p>
<figure class="align-center ">
    <img loading="lazy" src="images/working-with-pdf-files-in-python.png#center"
         alt="Title - Working with PDF files in Python"/> 
</figure>

<p>This article shows <strong>how to work with PDF files using Python</strong> with the <a href="https://pypi.org/project/pypdf/"><strong>pypdf</strong></a> library.</p>
<p>We&rsquo;ll use <strong>pypdf</strong> to perform the following operations:</p>
<ul>
<li>Extracting text from PDFs</li>
<li>Rotating PDF pages</li>
<li>Merging multiple PDFs</li>
<li>Splitting PDFs into separate files</li>
<li>Adding watermarks to PDF pages</li>
</ul>
<p><em><strong>Note</strong>: Each section covers one operation, so feel free to skip straight to the ones that interest you most.</em></p>
<figure class="align-center ">
    <img loading="lazy" src="images/pdf-manipulation-with-pypdf.webp#center"
         alt="Illustration - Working with PDF files in Python"/> 
</figure>

<h2 id="sample-codes">Sample Codes</h2>
<p>You can download all the sample code used in this article from the following link. It includes the code, input files, and output files.</p>
<ul>
<li><a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python">Code Examples and Input Files for Working with PDF Files in Python</a></li>
</ul>
<h2 id="install-pypdf">Install pypdf</h2>
<p>To install pypdf, simply run the following command in your terminal or command prompt:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>pip install pypdf
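</span></span></code></pre></div><p><strong>Note:</strong> Enter the command in lowercase, as shown. The library is published on PyPI as <code>pypdf</code>, the successor to the older <code>PyPDF2</code> package.</p>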
<h2 id="1-extracting-text-from-a-pdf-file-using-python">1. Extracting Text from a PDF File Using Python</h2>
<script type="application/javascript" src="https://gist.github.com/fileformat-blog-gists/e2b43a49dbad9e89745f8f9777817acb.js?file=extract-text-from-pdf-using-pypdf-in-python.py"></script>

<h3 id="code-explanation">Code Explanation</h3>
<p><strong>1. Creating a PDF Reader Object</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>reader <span style="color:#f92672">=</span> PdfReader(pdf_file)
</span></span></code></pre></div><ul>
<li><code>PdfReader(pdf_file)</code> loads the PDF file into a <strong>reader object</strong>.</li>
<li>This object allows access to the pages and their content.</li>
</ul>
<p><strong>2. Looping Through Pages</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> page_number, page <span style="color:#f92672">in</span> enumerate(reader<span style="color:#f92672">.</span>pages, start<span style="color:#f92672">=</span><span style="color:#ae81ff">1</span>):
</span></span></code></pre></div><ul>
<li><code>reader.pages</code> is a list-like sequence of the pages in the PDF.</li>
<li><code>enumerate(..., start=1)</code> assigns a <strong>page number starting from 1</strong>.</li>
</ul>
<p><strong>3. Printing Extracted Text</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>    print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Page </span><span style="color:#e6db74">{</span>page_number<span style="color:#e6db74">}</span><span style="color:#e6db74">:&#34;</span>)
</span></span><span style="display:flex;"><span>    print(page<span style="color:#f92672">.</span>extract_text())
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">&#34;-&#34;</span> <span style="color:#f92672">*</span> <span style="color:#ae81ff">50</span>)  <span style="color:#75715e"># Separator for readability</span>
</span></span></code></pre></div><ul>
<li><code>page.extract_text()</code> extracts text content from the current page.</li>
<li>The script prints the extracted text along with the <strong>page number</strong>.</li>
<li><code>&quot;-&quot; * 50</code> prints a separator line (<code>--------------------------------------------------</code>) for better readability.</li>
</ul>
<h3 id="input-pdf-file-used-in-the-code">Input PDF File Used in the Code</h3>
<ul>
<li><strong>Input File:</strong> <a href="https://github.com/fileformat-blog-gists/code/blob/main/working-with-pdf-files-in-python/pdf-to-extract-text/">Download Link</a></li>
</ul>
<h3 id="output-of-the-code">Output of the Code</h3>
<script type="application/javascript" src="https://gist.github.com/fileformat-blog-gists/ab6976aa3a0fc2999093f5f9320a9e20.js?file=Output%20-%20extract-text-from-pdf-using-pypdf-in-python.txt"></script>

<h2 id="2-rotating-pdf-pages-using-python">2. Rotating PDF Pages Using Python</h2>
<script type="application/javascript" src="https://gist.github.com/fileformat-blog-gists/760d480cfede4178296c353d60662e1a.js?file=rotate-pdf-page-using-pypdf-in-python.py"></script>

<h3 id="code-explanation-1">Code Explanation</h3>
<p>The code rotates the <strong>first page</strong> by <strong>90° clockwise</strong> and saves the modified PDF, leaving the other pages unchanged.</p>
<p><strong>1. Import Required Classes</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pypdf <span style="color:#f92672">import</span> PdfReader, PdfWriter
</span></span></code></pre></div><ul>
<li><code>PdfReader</code>: Reads the input PDF.</li>
<li><code>PdfWriter</code>: Creates a new PDF with modifications.</li>
</ul>
<p><strong>2. Define Input and Output File Paths</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>input_pdf <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;pdf-to-rotate/input.pdf&#34;</span>
</span></span><span style="display:flex;"><span>output_pdf <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;pdf-to-rotate/rotated_output.pdf&#34;</span>
</span></span></code></pre></div><ul>
<li>The script reads from <code>input.pdf</code> and saves the modified file as <code>rotated_output.pdf</code>.</li>
</ul>
<p><strong>3. Read the PDF and Create a Writer Object</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>reader <span style="color:#f92672">=</span> PdfReader(input_pdf)
</span></span><span style="display:flex;"><span>writer <span style="color:#f92672">=</span> PdfWriter()
</span></span></code></pre></div><ul>
<li><code>reader</code> loads the existing PDF.</li>
<li><code>writer</code> is used to store the modified pages.</li>
</ul>
<p><strong>4. Rotate the First Page by 90 Degrees</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>page <span style="color:#f92672">=</span> reader<span style="color:#f92672">.</span>pages[<span style="color:#ae81ff">0</span>]
</span></span><span style="display:flex;"><span>page<span style="color:#f92672">.</span>rotate(<span style="color:#ae81ff">90</span>)  <span style="color:#75715e"># Rotate 90 degrees clockwise</span>
</span></span><span style="display:flex;"><span>writer<span style="color:#f92672">.</span>add_page(page)
</span></span></code></pre></div><ul>
<li>Gets <strong>page 1</strong> (index <code>0</code>), rotates it <strong>90 degrees clockwise</strong>, and adds it to the new PDF.</li>
</ul>
<p><strong>5. Add Remaining Pages Without Changes</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(<span style="color:#ae81ff">1</span>, len(reader<span style="color:#f92672">.</span>pages)):
</span></span><span style="display:flex;"><span>    writer<span style="color:#f92672">.</span>add_page(reader<span style="color:#f92672">.</span>pages[i])
</span></span></code></pre></div><ul>
<li>Loops through the remaining pages and adds them as they are.</li>
</ul>
<p><strong>6. Save the New PDF</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">with</span> open(output_pdf, <span style="color:#e6db74">&#34;wb&#34;</span>) <span style="color:#66d9ef">as</span> file:
</span></span><span style="display:flex;"><span>    writer<span style="color:#f92672">.</span>write(file)
</span></span></code></pre></div><ul>
<li>Opens <code>rotated_output.pdf</code> in write-binary mode and saves the new PDF.</li>
</ul>
<p><strong>7. Print Confirmation</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Rotated page saved to </span><span style="color:#e6db74">{</span>output_pdf<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;</span>)
</span></span></code></pre></div><ul>
<li>Displays a success message.</li>
</ul>
<h3 id="input-pdf-used-in-the-code-and-its-rotated-output">Input PDF Used in the Code and Its Rotated Output</h3>
<ul>
<li><strong>Input PDF File:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-rotate/">Download Link</a></li>
<li><strong>Output Rotated PDF File:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-rotate/rotated_output.pdf">Download Link</a></li>
</ul>
<p><strong>Screenshot</strong>
<img loading="lazy" src="https://raw.githubusercontent.com/fileformat-blog-gists/content/main/working-with-pdf-files-in-python/rotated-pdf.png" alt="Screenshot of Rotated Page in PDF Using Python"  />
</p>
<h2 id="3-merging-pdf-files-using-python">3. Merging PDF Files Using Python</h2>
<p>This Python script demonstrates how to <strong>merge multiple PDF files</strong> from a directory into a single PDF using the <strong>pypdf</strong> library.</p>
<script type="application/javascript" src="https://gist.github.com/fileformat-blog-gists/a1a571783e0f5e699678d1094bf1afa5.js?file=merge_pdf_files_using_pypdf_in_python.py"></script>

<h3 id="code-explanation-2">Code Explanation</h3>
<ul>
<li>This script automatically merges all PDF files found in the specified directory (<code>pdfs-to-merge</code>) into a single output file (<code>merged_output.pdf</code>).</li>
<li>It ensures the output directory exists and adds each PDF&rsquo;s pages in alphabetical order by file name.</li>
<li>It writes the final merged file to the <code>output-dir</code> subdirectory inside <code>pdfs-to-merge</code>.</li>
</ul>
<p><strong>Code Breakdown</strong></p>
<p><strong>1. Import Libraries</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> os
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pypdf <span style="color:#f92672">import</span> PdfReader, PdfWriter
</span></span></code></pre></div><ul>
<li><code>os</code>: Used to interact with the file system, such as reading directories and managing file paths.</li>
<li><code>PdfReader</code>: Reads the content of a PDF file.</li>
<li><code>PdfWriter</code>: Creates and writes a new PDF file.</li>
</ul>
<p><strong>2. Define Directory and Output File</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>directory <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;pdfs-to-merge&#34;</span>
</span></span><span style="display:flex;"><span>output_file <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;output-dir/merged_output.pdf&#34;</span>
</span></span></code></pre></div><ul>
<li><code>directory</code>: Specifies the folder where the PDF files are stored.</li>
<li><code>output_file</code>: Defines the output path and name of the merged PDF.</li>
</ul>
<p><strong>3. Create Output Directory if It Doesn&rsquo;t Exist</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>os<span style="color:#f92672">.</span>makedirs(os<span style="color:#f92672">.</span>path<span style="color:#f92672">.</span>join(directory, <span style="color:#e6db74">&#34;output-dir&#34;</span>), exist_ok<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span></code></pre></div><ul>
<li>This ensures the <strong>output directory</strong> exists, and if it doesn&rsquo;t, it creates it.</li>
</ul>
<p><strong>4. Create a PdfWriter Object</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>writer <span style="color:#f92672">=</span> PdfWriter()
</span></span></code></pre></div><ul>
<li><code>writer</code> is used to collect and combine all the pages from the PDFs.</li>
</ul>
<p><strong>5. Iterate Over All PDF Files in the Directory</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> file_name <span style="color:#f92672">in</span> sorted(os<span style="color:#f92672">.</span>listdir(directory)):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> file_name<span style="color:#f92672">.</span>endswith(<span style="color:#e6db74">&#34;.pdf&#34;</span>):
</span></span><span style="display:flex;"><span>        file_path <span style="color:#f92672">=</span> os<span style="color:#f92672">.</span>path<span style="color:#f92672">.</span>join(directory, file_name)
</span></span><span style="display:flex;"><span>        print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Adding: </span><span style="color:#e6db74">{</span>file_name<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;</span>)
</span></span></code></pre></div><ul>
<li>This loop goes through all files in the specified directory, checking for files with the <code>.pdf</code> extension. It uses <code>sorted()</code> to process them in alphabetical order.</li>
</ul>
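<p>Two details of this loop are worth knowing: <code>endswith(".pdf")</code> is case-sensitive, so a file named <code>REPORT.PDF</code> would be skipped, and <code>sorted()</code> compares names character by character, so <code>page10.pdf</code> sorts before <code>page2.pdf</code>. The sketch below is one possible variant using <code>pathlib</code> that matches the extension case-insensitively and skips subdirectories. The helper name <code>list_pdf_files</code> is an assumption, not part of the original script.</p>

```python
from pathlib import Path


def list_pdf_files(directory):
    """Return the PDF files directly inside directory, sorted by file name."""
    return sorted(
        (
            path
            for path in Path(directory).iterdir()
            # Skip subdirectories and match the extension case-insensitively.
            if path.is_file() and path.suffix.lower() == ".pdf"
        ),
        key=lambda path: path.name,
    )
```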
<p><strong>6. Read Each PDF and Append Pages to the Writer</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>reader <span style="color:#f92672">=</span> PdfReader(file_path)
</span></span><span style="display:flex;"><span>writer<span style="color:#f92672">.</span>append(reader)
</span></span></code></pre></div><ul>
<li>Inside the loop, <code>PdfReader</code> reads each PDF, and <code>writer.append(reader)</code> appends all of its pages to <code>writer</code>.</li>
</ul>
<p><strong>7. Write the Merged PDF to an Output File</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>output_path <span style="color:#f92672">=</span> os<span style="color:#f92672">.</span>path<span style="color:#f92672">.</span>join(directory, output_file)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">with</span> open(output_path, <span style="color:#e6db74">&#34;wb&#34;</span>) <span style="color:#66d9ef">as</span> output_pdf:
</span></span><span style="display:flex;"><span>    writer<span style="color:#f92672">.</span>write(output_pdf)
</span></span></code></pre></div><ul>
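<li>After collecting all pages, <code>os.path.join()</code> builds the output path inside the input directory (<code>pdfs-to-merge/output-dir/merged_output.pdf</code>), and <code>writer.write()</code> writes the merged PDF there.</li>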
</ul>
<p><strong>8. Print Confirmation</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Merged PDF saved as: </span><span style="color:#e6db74">{</span>output_path<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;</span>)
</span></span></code></pre></div><ul>
<li>Prints a success message confirming the location of the saved merged PDF.</li>
</ul>
<h3 id="input-pdf-files-used-in-the-code-and-the-merged-output-pdf">Input PDF Files Used in the Code and the Merged Output PDF</h3>
<ul>
<li><strong>Input PDF Files:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdfs-to-merge">Download Link</a></li>
<li><strong>Merged Output PDF:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdfs-to-merge/output-dir">Download Link</a></li>
</ul>
<h2 id="4-splitting-a-pdf-using-python">4. Splitting a PDF Using Python</h2>
<script type="application/javascript" src="https://gist.github.com/fileformat-blog-gists/0dee64422ac0dcf44cf027d90567bbf8.js?file=split-pdf-using-pypdf-in-python.py"></script>

<h3 id="code-explanation-3">Code Explanation</h3>
<p>The above Python script splits a PDF into separate pages using the <strong>pypdf</strong> library. It first ensures that the output directory exists, then reads the input PDF file. The script loops through each page, creates a new <strong>PdfWriter</strong> object, and saves each page as an individual PDF file. The output files are named sequentially (e.g., <strong>page_1.pdf, page_2.pdf</strong>) and stored in the <strong><code>output-dir</code></strong> folder. Finally, it prints a confirmation message for each created file and a final message when the process is complete.</p>
<h3 id="input-pdf-and-split-output-files">Input PDF and Split Output Files</h3>
<ul>
<li><strong>Input PDF File:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-split">Download Link</a></li>
<li><strong>Split Output Files:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-split/output-dir">Download Link</a></li>
</ul>
<h2 id="5-adding-a-watermark-to-a-pdf-using-python">5. Adding a Watermark to a PDF Using Python</h2>
<p>You can add a watermark to a PDF using the pypdf library by overlaying a watermark PDF onto an existing PDF. Make sure the watermark PDF has only one page so it applies correctly to each page of the main PDF.</p>
<script type="application/javascript" src="https://gist.github.com/fileformat-blog-gists/af057943580e2fcde6a635df34d7e39a.js?file=watermark-pdf-using-pypdf-in-python.py"></script>

<h3 id="code-explanation-4">Code Explanation</h3>
<p>The above Python script reads an input PDF and a one-page watermark PDF, overlays the watermark on each page of the input PDF, and saves the final watermarked PDF.</p>
<p><strong>Code Breakdown</strong></p>
<p>Here&rsquo;s a brief explanation of each part:</p>
<p><strong>1. Import Required Classes</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pypdf <span style="color:#f92672">import</span> PdfReader, PdfWriter
</span></span></code></pre></div><ul>
<li><strong><code>PdfReader</code></strong> is used to read existing PDFs.</li>
<li><strong><code>PdfWriter</code></strong> is used to create and write a new PDF.</li>
</ul>
<p><strong>2. Define File Paths</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>input_pdf <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;pdf-to-watermark/input.pdf&#34;</span>
</span></span><span style="display:flex;"><span>watermark_pdf <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;pdf-to-watermark/watermark.pdf&#34;</span>
</span></span><span style="display:flex;"><span>output_pdf <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;pdf-to-watermark/output_with_watermark.pdf&#34;</span>
</span></span></code></pre></div><ul>
<li><code>input_pdf</code>: The original PDF to which the watermark will be added.</li>
<li><code>watermark_pdf</code>: A separate <strong>one-page</strong> PDF that serves as the watermark.</li>
<li><code>output_pdf</code>: The output file that will contain the watermarked pages.</li>
</ul>
<p><strong>3. Read PDFs</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>reader <span style="color:#f92672">=</span> PdfReader(input_pdf)
</span></span><span style="display:flex;"><span>watermark <span style="color:#f92672">=</span> PdfReader(watermark_pdf)
</span></span></code></pre></div><ul>
<li><code>reader</code>: Reads the input PDF.</li>
<li><code>watermark</code>: Reads the watermark PDF.</li>
</ul>
<p><strong>4. Create a Writer Object</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>writer <span style="color:#f92672">=</span> PdfWriter()
</span></span></code></pre></div><ul>
<li>This will be used to create the final watermarked PDF.</li>
</ul>
<p><strong>5. Extract Watermark Page</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>watermark_page <span style="color:#f92672">=</span> watermark<span style="color:#f92672">.</span>pages[<span style="color:#ae81ff">0</span>]
</span></span></code></pre></div><ul>
<li>Assumes that the watermark PDF has only <strong>one page</strong>, which is used to overlay on all pages.</li>
</ul>
<p><strong>6. Loop Through Input PDF Pages &amp; Merge Watermark</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> page <span style="color:#f92672">in</span> reader<span style="color:#f92672">.</span>pages:
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Merge the watermark with the current page</span>
</span></span><span style="display:flex;"><span>    page<span style="color:#f92672">.</span>merge_page(watermark_page)
</span></span><span style="display:flex;"><span>    
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Add the merged page to the writer</span>
</span></span><span style="display:flex;"><span>    writer<span style="color:#f92672">.</span>add_page(page)
</span></span></code></pre></div><ul>
<li>Iterates through each page of <code>input_pdf</code>.</li>
<li><strong><code>merge_page(watermark_page)</code></strong> overlays the watermark on top of the current page.</li>
<li>Adds the modified page to the <code>writer</code>.</li>
</ul>
<p><strong>7. Save the Watermarked PDF</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">with</span> open(output_pdf, <span style="color:#e6db74">&#34;wb&#34;</span>) <span style="color:#66d9ef">as</span> output_file:
</span></span><span style="display:flex;"><span>    writer<span style="color:#f92672">.</span>write(output_file)
</span></span></code></pre></div><ul>
<li>Writes the modified pages into a new PDF file.</li>
</ul>
<p><strong>8. Print Confirmation</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Watermarked PDF saved as: </span><span style="color:#e6db74">{</span>output_pdf<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;</span>)
</span></span></code></pre></div><ul>
<li>Prints the output file path for confirmation.</li>
</ul>
<h3 id="input-pdf-watermark-pdf-and-output-watermarked-pdf">Input PDF, Watermark PDF, and Output Watermarked PDF</h3>
<ul>
<li><strong>Input PDF File:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-watermark">Download Link</a></li>
<li><strong>Watermark PDF File:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-watermark">Download Link</a></li>
<li><strong>Output Watermarked PDF File:</strong> <a href="https://github.com/fileformat-blog-gists/code/tree/main/working-with-pdf-files-in-python/pdf-to-watermark">Download Link</a></li>
</ul>
<p><strong>Screenshot</strong>
<img loading="lazy" src="https://raw.githubusercontent.com/fileformat-blog-gists/content/main/working-with-pdf-files-in-python/watermark-pdf.png" alt="Screenshot of Watermarked PDF Using Python"  />
</p>
<h2 id="conclusion">Conclusion</h2>
<p>In this guide, we explored essential PDF operations in Python, including extracting text, rotating pages, merging, splitting, and adding watermarks. With these skills, you can now build your own PDF manager and automate various PDF tasks efficiently.</p>
]]></content:encoded>
    </item>
    
  </channel>
</rss>
