Step-by-Step Guide: Generate High-Quality PDFs from HTML Using Pyppeteer
Introduction to Pyppeteer for PDF Generation
Pyppeteer, a Python port of Puppeteer, renders HTML/CSS just like a real browser—making it ideal for modern web apps. Instead of manually coding PDFs (as with ReportLab) or manipulating existing PDFs (as with PyPDF2), you can design reports with standard web technologies and generate pixel-perfect PDFs automatically.
For SaaS developers looking to create PDF reports directly from web applications, Pyppeteer provides a streamlined solution to generate high-quality, dynamic PDFs with minimal friction.
For the complete API, see the official Pyppeteer documentation.
Comparing Pyppeteer with Other PDF Libraries and Tools
Comparing Pyppeteer to Other Popular Python PDF Libraries:
• Pyppeteer vs. Playwright: Both Pyppeteer and Playwright can generate PDFs from HTML, but Playwright is generally more robust for broader web automation use cases, supporting multiple browsers like Firefox and WebKit, whereas Pyppeteer focuses solely on Chromium. If your primary goal is HTML-to-PDF conversion and you don’t need multi-browser support, Pyppeteer may offer simpler usage. Playwright, on the other hand, excels when broader testing or automation beyond Chromium is needed.
• Pyppeteer vs. ReportLab: ReportLab is a powerful Python library for creating PDFs programmatically. However, it doesn’t support HTML/CSS rendering directly. ReportLab is more suited for constructing PDFs from scratch using Python, making it ideal for static reports or invoices that don’t rely on existing HTML content. In contrast, Pyppeteer allows you to leverage your existing HTML/CSS designs, which is more efficient for modern web applications.
• Pyppeteer vs. PyPDF2: PyPDF2 focuses on manipulating existing PDFs—merging, splitting, rotating, etc. While useful for handling PDFs once they’re created, PyPDF2 doesn’t offer HTML-to-PDF conversion. This makes Pyppeteer the superior option for generating PDFs dynamically from HTML content, especially when working with web-based layouts.
If you want a deeper comparison of the best PDF libraries for Python in 2025, you can check out this guide.
Setting Up Your Environment for Pyppeteer PDF Generation
Prerequisites: What You Need to Get Started with Pyppeteer
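To begin, you’ll need:
• Python 3.6 or later (recent Pyppeteer releases may require a newer version; check the package’s PyPI page)
• A Chromium build, which Pyppeteer downloads on first launch (Node.js is not required, because Pyppeteer is a pure-Python port of Puppeteer)
• Basic knowledge of HTML, CSS, and Python
Check your Python installation by running python --version in a terminal.
If Python is missing or too old, download a current release from python.org.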
Installing Pyppeteer in Your Python Project
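Install Pyppeteer with pip by running pip install pyppeteer.
This command installs the Pyppeteer package itself. A compatible Chromium build is downloaded the first time you launch a browser, or ahead of time if you run the pyppeteer-install command.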
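As a quick sanity check after installation, the sketch below confirms the interpreter version and whether the package can be found. The helper names are illustrative, and the (3, 6) minimum is simply the figure quoted in the prerequisites; adjust it to whatever your Pyppeteer release requires.

```python
import importlib.util
import sys

# Minimum version quoted in this guide; newer Pyppeteer releases may need more.
MIN_PYTHON = (3, 6)


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True if the given (or running) interpreter meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


def package_is_installed(name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


print("Python OK:", python_is_supported())
print("pyppeteer installed:", package_is_installed("pyppeteer"))
```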
Integrating Pyppeteer with Your Existing HTML Rendering Setup
With Pyppeteer installed, you can now integrate it with your existing HTML rendering pipeline. If you’re using a template engine like Jinja2, you can dynamically populate the HTML content and pass it to Pyppeteer.
Here’s a basic Jinja2 template:
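A minimal sketch of such a template, rendered from Python, might look like the following. The template contents, the variable names (title, items), and the render_report helper are assumptions made for illustration, not part of any existing project.

```python
from jinja2 import Template

# Hypothetical report template; the variable names are illustrative.
REPORT_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>{{ title }}</title>
</head>
<body>
  <h1>{{ title }}</h1>
  <ul>
  {% for item in items %}
    <li>{{ item.name }}: {{ item.value }}</li>
  {% endfor %}
  </ul>
</body>
</html>""")


def render_report(title, items):
    """Fill the template and return the finished HTML string."""
    return REPORT_TEMPLATE.render(title=title, items=items)


html = render_report("Monthly Report", [{"name": "Signups", "value": 128}])
```

The resulting html string is what you would hand to Pyppeteer. Note that a bare Template does not escape HTML by default; for user-supplied values, consider a jinja2.Environment with autoescape enabled.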
How to Generate a PDF from HTML Using Pyppeteer
Converting HTML to PDF Using Pyppeteer
To generate a PDF from HTML, use the following Python script:
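One way to write it is sketched below, assuming Pyppeteer is installed and can download or find Chromium. The wrap_html and html_to_pdf names, the A4 format, and the output path are illustrative choices.

```python
import asyncio


def wrap_html(body, title="Report"):
    """Wrap an HTML fragment in a minimal, complete document."""
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{title}</title></head><body>{body}</body></html>"
    )


async def html_to_pdf(html, output_path):
    """Render an HTML string in headless Chromium and save it as a PDF."""
    # Imported lazily so the helper above stays usable without Pyppeteer.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.setContent(html)
        await page.pdf({"path": output_path, "format": "A4"})
    finally:
        # Always close the browser, even if rendering fails.
        await browser.close()


# Usage (Chromium is downloaded on the first run):
# asyncio.run(html_to_pdf(wrap_html("<h1>Hello, PDF</h1>"), "output.pdf"))
```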
This script renders your HTML and generates a PDF file.
Generating a PDF from a Website URL Using Pyppeteer
Beyond rendering local HTML strings, Pyppeteer also allows you to capture full websites by navigating to a given URL. This approach is useful for archiving live content or automating the creation of print-ready snapshots of any dynamic webpage. Here’s a quick example showing how to generate a PDF from a URL:
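A hedged sketch of that flow follows. The url_to_pdf and is_http_url helpers are hypothetical names, and the URL check is an extra safeguard added here rather than something Pyppeteer requires.

```python
import asyncio
from urllib.parse import urlparse


def is_http_url(url):
    """Return True for absolute http(s) URLs only."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


async def url_to_pdf(url, output_path):
    """Open a live page in headless Chromium and save it as a PDF."""
    if not is_http_url(url):
        raise ValueError(f"Expected an http(s) URL, got: {url!r}")

    # Imported lazily so the URL check works without Pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        # Wait for network activity to settle before printing.
        await page.goto(url, {"waitUntil": "networkidle2"})
        await page.pdf(
            {"path": output_path, "format": "A4", "printBackground": True}
        )
    finally:
        await browser.close()


# Usage:
# asyncio.run(url_to_pdf("https://example.com", "example.pdf"))
```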
In this snippet, goto() fetches the live webpage, which can include JavaScript-driven elements and dynamic data. Setting {'waitUntil': 'networkidle2'} waits until there have been no more than two open network connections for at least 500 ms, which reduces the risk of partially loaded images or missing elements. Setting printBackground to True retains background images and colors in the generated PDF.
Customizing PDF Output: Headers, Footers, and Page Formats
You can customize the PDF output through the options passed to page.pdf(): add headers and footers, set margins, or choose the page size.
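The sketch below shows one possible set of options, assuming A4 pages and a centered page-number footer. The function names, margins, and footer markup are illustrative; pageNumber and totalPages are the placeholder classes that Chromium fills in inside header and footer templates.

```python
def build_pdf_options(output_path):
    """Return page.pdf() options for A4 pages with a numbered footer."""
    footer = (
        # Font size is set explicitly; without it the footer text may be
        # too small to read.
        '<div style="font-size:10px; width:100%; text-align:center;">'
        'Page <span class="pageNumber"></span> of '
        '<span class="totalPages"></span>'
        "</div>"
    )
    return {
        "path": output_path,
        "format": "A4",
        "printBackground": True,
        "displayHeaderFooter": True,
        "headerTemplate": "<span></span>",  # intentionally empty header
        "footerTemplate": footer,
        # Margins leave room for the header and footer areas.
        "margin": {"top": "20mm", "bottom": "20mm",
                   "left": "15mm", "right": "15mm"},
    }


async def html_to_pdf_with_footer(html, output_path):
    """Render HTML to a PDF that carries the footer defined above."""
    # Imported lazily so the options helper works without Pyppeteer.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.setContent(html)
        await page.pdf(build_pdf_options(output_path))
    finally:
        await browser.close()
```

To try it, run asyncio.run(html_to_pdf_with_footer("<h1>Report</h1>", "report.pdf")) after importing asyncio.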
This code adds page numbers to the footer, customizing your PDF output.
Common Pyppeteer Issues & How to Fix Them
There are several common issues that developers face when generating PDFs from HTML using Pyppeteer:
Fonts Not Displaying Correctly
• Cause: Custom fonts or images aren’t fully loaded before PDF creation.
• Solution: Use await page.waitForSelector() or await page.waitForFunction() to ensure assets finish loading.
Missing Print Styles
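• Cause: page.pdf() renders with print CSS media, so the PDF can differ from what you see on screen; an earlier page.emulateMedia('screen') call has the opposite effect and suppresses your @media print rules.
• Solution: Call await page.emulateMedia('print') before page.pdf() to enforce print styling, or await page.emulateMedia('screen') if the PDF should match the on-screen layout.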
Background Colors Not Appearing
• Cause: Background printing is off by default.
• Solution: Enable it with 'printBackground': True in page.pdf().
JavaScript Not Fully Executing
• Cause: The script might close the browser before JS finalizes.
• Solution: Wait for network idle or a specific DOM event.
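Several of these fixes can be combined into one helper that runs just before page.pdf(). This is a sketch under assumptions: the "#report-ready" selector is a hypothetical marker your page would add once rendering is done, the font check relies on the browser's document.fonts API, and emulateMedia is the method name used in Pyppeteer's Page API (newer Puppeteer versions call it emulateMediaType).

```python
async def wait_until_ready(page, ready_selector="#report-ready",
                           timeout_ms=30000):
    """Wait for content, fonts, and media type before printing a page.

    `page` is expected to be a Pyppeteer Page (or an object exposing the
    same three methods).
    """
    # 1. Wait until the page signals that its dynamic content is rendered.
    await page.waitForSelector(ready_selector, {"timeout": timeout_ms})

    # 2. Wait until the browser reports that web fonts have finished loading.
    await page.waitForFunction(
        '() => document.fonts.status === "loaded"',
        {"timeout": timeout_ms},
    )

    # 3. Make the print stylesheet explicit before generating the PDF.
    await page.emulateMedia("print")
```

Call await wait_until_ready(page) after page.setContent() or page.goto(), then call page.pdf() with 'printBackground': True if backgrounds should appear.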
Scaling PDF Generation with a Third-Party API
For larger SaaS platforms requiring automated PDF generation at scale, integrating a PDF Generation API like pdforge can offload the heavy lifting. This approach is ideal for SaaS platforms with high volumes of PDF requests.
With pdforge, you can create beautiful reports with flexible layouts and complex components with an easy-to-use opinionated no-code builder. Let the AI do the heavy lifting by generating your templates, creating custom components or even filling all the variables for you.
You can handle high-volume PDF generation from a single backend call.
Here’s an example of how to generate a PDF with pdforge via an API call:
You can create an account, try our no-code builder, and build your first layout template without any upfront payment by clicking here.
Conclusion
With Pyppeteer’s browser-like rendering, you can automate complex HTML-to-PDF tasks, ensuring high fidelity for web designs, charts, and interactive elements. Whether you’re a SaaS developer creating on-demand reports or need a robust API-based approach for massive PDF generation, the tools and techniques covered here will help you craft professional PDFs.
If you don't want to spend time maintaining PDF layouts and their infrastructure, or keeping up with best practices for generating PDFs at scale, third-party PDF APIs like pdforge can save you hours of work and deliver high-quality PDF layouts.