pdf libraries

Java

How to Generate PDF from HTML with Playwright and Java

Marcelo Abreu, founder of pdforge

Marcelo | Founder

Jan 30, 2025

Why Choose Playwright Java for HTML to PDF Conversion

Playwright is a browser automation library from Microsoft that lets developers control Chromium, Firefox, and WebKit through a single API. Note that PDF generation with page.pdf() is supported in Chromium only.

  • Modern Web Compatibility: Because it uses real browser engines, it supports advanced CSS and JavaScript features out of the box.

  • Java Integration: Easily added via Maven or Gradle, making it ideal for existing Java-centric projects or SaaS platforms.

  • High-Fidelity Rendering: With Playwright, you can be sure your generated PDFs reflect the exact styles, fonts, and layouts of your application.

  • Tip: If you’ve ever struggled with older libraries that can’t handle complex layouts or JavaScript-driven UIs, Playwright solves these issues by leveraging a headless browser environment.

Playwright's documentation is thorough and easy to follow; you can check it out here.

Playwright vs. Other Java PDF Libraries: A Comparative Insight

When considering HTML to PDF conversion in Java, several libraries come to mind:

  • Flying Saucer: A Java library that renders XML/XHTML and CSS 2.1 content. It handles basic HTML and CSS but struggles with modern web features.

  • iText: A powerful PDF library capable of creating and manipulating PDFs. However, it has a steep learning curve, and its AGPL license means closed-source commercial use requires a paid license.

  • Apache PDFBox: An open-source library for working with PDF documents. It’s excellent for manipulating existing PDFs but isn’t optimized for HTML to PDF conversion.

  • OpenPDF: A derivative of iText, offering similar functionalities with an open-source license. It shares the complexities of iText without full support for advanced HTML content.

Playwright stands out by using actual browser engines to render HTML and CSS, ensuring that the generated PDFs accurately reflect modern web designs, including advanced JavaScript and CSS features.

For a full comparison of Java PDF libraries in 2025, check out this guide.

Guide to Generating PDFs from HTML Using Java Playwright
  1. Setting Up Playwright in Your Java Project

Quick Start: Installing Playwright Java

Add the following dependency to your pom.xml file if you’re using Maven:

<dependencies>
    <dependency>
        <groupId>com.microsoft.playwright</groupId>
        <artifactId>playwright</artifactId>
        <version>1.34.0</version>
    </dependency>
</dependencies>

For Gradle, include:

dependencies {
    implementation 'com.microsoft.playwright:playwright:1.34.0'
}

Configuring Dependencies

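Playwright Java downloads the browser binaries it needs the first time Playwright.create() runs. The class below creates a Playwright instance and launches Chromium to confirm that everything works: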

import com.microsoft.playwright.*;

public class PlaywrightSetup {
    public static void main(String[] args) {
        try (Playwright playwright = Playwright.create()) {
            playwright.chromium().launch();
            System.out.println("Playwright is set up successfully.");
        }
    }
}

Verifying Your Setup with a Test HTML Page

Create a comprehensive HTML invoice template named invoice.html:

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Invoice</title>
    <style>
        body { font-family: 'Arial', sans-serif; margin: 20px; }
        h1 { text-align: center; }
        .invoice-details { width: 100%; margin-top: 20px; border-collapse: collapse; }
        .invoice-details th, .invoice-details td { padding: 10px; border: 1px solid #ccc; text-align: left; }
        .total { text-align: right; font-weight: bold; }
    </style>
</head>
<body>
    <h1>Invoice</h1>
    <p>Date: <strong>2023-10-01</strong></p>
    <p>Invoice #: <strong>INV-1001</strong></p>
    <table class="invoice-details">
        <tr>
            <th>Description</th>
            <th>Quantity</th>
            <th>Price</th>
            <th>Total</th>
        </tr>
        <tr>
            <td>Product A</td>
            <td>2</td>
            <td>$50</td>
            <td>$100</td>
        </tr>
        <tr>
            <td>Service B</td>
            <td>5</td>
            <td>$20</td>
            <td>$100</td>
        </tr>
        <tr>
            <td colspan="3" class="total">Grand Total</td>
            <td>$200</td>
        </tr>
    </table>
</body>
</html>
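
To check the template before hosting it anywhere, one option is to open the local file through a file:// URL. The helper below is a minimal sketch that assumes invoice.html sits in the working directory; the class and method names are illustrative and not part of Playwright.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: turns a local HTML file path into a file:// URL string.
class LocalHtmlUrl {

    static String toFileUrl(String relativePath) {
        // Resolve against the working directory so the browser receives an absolute location.
        Path absolute = Paths.get(relativePath).toAbsolutePath().normalize();
        return absolute.toUri().toString();
    }

    public static void main(String[] args) {
        // Prints the file:// URL for invoice.html in the current directory.
        System.out.println(toFileUrl("invoice.html"));
    }
}
```

The resulting string can be passed to page.navigate(...) in place of a hosted URL.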

2. Basic HTML to PDF Conversion

Method 01: Rendering the PDF from a URL

If your HTML invoice is hosted, you can navigate to its URL and generate a PDF:

import com.microsoft.playwright.*;
import java.nio.file.Paths;

public class PdfFromUrl {
    public static void main(String[] args) {
        try (Playwright playwright = Playwright.create()) {
            // PDF generation is supported in Chromium only
            Browser browser = playwright.chromium().launch();
            Page page = browser.newPage();
            page.navigate("https://yourdomain.com/invoice.html");

            // Save PDF locally (setPath expects a java.nio.file.Path)
            page.pdf(new Page.PdfOptions().setPath(Paths.get("invoice.pdf")));
            System.out.println("PDF generated from URL and saved locally.");

            // Get PDF as a byte array for further processing
            // (page.pdf() returns the bytes even when a path is set, so one call can do both)
            byte[] pdfBytes = page.pdf();
            // Use pdfBytes to upload or process as needed
        }
    }
}

Method 02: Rendering the PDF from HTML Content

For dynamic content or when the HTML is generated on the fly:

import com.microsoft.playwright.*;
import java.nio.file.Paths;

public class PdfFromContent {
    public static void main(String[] args) {
        String htmlContent = "<!DOCTYPE html><html><head><meta charset='UTF-8'><title>Invoice</title>"
                + "<style>"
                + "body { font-family: 'Arial', sans-serif; margin: 20px; }"
                + "h1 { text-align: center; }"
                + ".invoice-details { width: 100%; margin-top: 20px; border-collapse: collapse; }"
                + ".invoice-details th, .invoice-details td { padding: 10px; border: 1px solid #ccc; text-align: left; }"
                + ".total { text-align: right; font-weight: bold; }"
                + "</style></head>"
                + "<body>"
                + "<h1>Invoice</h1>"
                + "<p>Date: <strong>2023-10-01</strong></p>"
                + "<p>Invoice #: <strong>INV-1001</strong></p>"
                + "<table class='invoice-details'>"
                + "<tr><th>Description</th><th>Quantity</th><th>Price</th><th>Total</th></tr>"
                + "<tr><td>Product A</td><td>2</td><td>$50</td><td>$100</td></tr>"
                + "<tr><td>Service B</td><td>5</td><td>$20</td><td>$100</td></tr>"
                + "<tr><td colspan='3' class='total'>Grand Total</td><td>$200</td></tr>"
                + "</table>"
                + "</body></html>";
        try (Playwright playwright = Playwright.create()) {
            Browser browser = playwright.chromium().launch();
            Page page = browser.newPage();
            // Load the HTML string directly instead of navigating to a URL
            page.setContent(htmlContent);
            // Save PDF locally (setPath expects a java.nio.file.Path)
            page.pdf(new Page.PdfOptions().setPath(Paths.get("invoice.pdf")));
            System.out.println("PDF generated from HTML content and saved locally.");
            // Get PDF as a byte array
            byte[] pdfBytes = page.pdf();
            // Use pdfBytes as needed
        }
    }
}
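
When the HTML is assembled on the fly from application data, values such as product names should be escaped before they are concatenated into markup. The following is a minimal sketch of that idea using only the JDK; InvoiceRows, escapeHtml, and row are hypothetical names introduced for illustration.

```java
import java.util.List;

// Hypothetical helper for building invoice rows from data instead of hard-coded strings.
class InvoiceRows {

    // Escapes the characters that are significant in HTML text and attribute values.
    static String escapeHtml(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds one <tr> element from the given cell values, escaping each one.
    static String row(List<String> cells) {
        StringBuilder sb = new StringBuilder("<tr>");
        for (String cell : cells) {
            sb.append("<td>").append(escapeHtml(cell)).append("</td>");
        }
        return sb.append("</tr>").toString();
    }

    public static void main(String[] args) {
        System.out.println(row(List.of("Product <A>", "2", "$50", "$100")));
    }
}
```

The rows produced this way can be spliced into the table markup before the string is handed to page.setContent(...). A template engine usually handles this escaping for you.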

  3. Advanced Playwright PDF Features

Adding Headers, Footers, and Page Numbers with Playwright

Enhance your PDF by adding custom headers, footers, and page numbers:

// Requires: import com.microsoft.playwright.options.Margin; and import java.nio.file.Paths;
Page.PdfOptions pdfOptions = new Page.PdfOptions()
    .setPath(Paths.get("invoice_with_header_footer.pdf"))
    .setDisplayHeaderFooter(true)
    .setHeaderTemplate("<div style='font-size:10px; width:100%; text-align:center;'>My Company</div>")
    .setFooterTemplate("<div style='font-size:10px; width:100%; text-align:center;'>Page <span class='pageNumber'></span> of <span class='totalPages'></span></div>")
    // Leave room at the top and bottom so the header and footer are not drawn over the content
    .setMargin(new Margin().setTop("50px").setBottom("50px"))
    .setPrintBackground(true);

page.pdf(pdfOptions);
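
Header and footer templates are plain HTML strings, so they can be assembled by a small helper. The sketch below is one possible way to do that; FooterTemplates and pageCounter are hypothetical names, and the label is assumed to be trusted text because it is inserted without escaping.

```java
// Hypothetical helper that assembles a footer template string for setFooterTemplate(...).
class FooterTemplates {

    static String pageCounter(String label, int fontSizePx) {
        // Chromium fills elements carrying these class names at print time:
        // date, title, url, pageNumber, totalPages.
        return "<div style='font-size:" + fontSizePx + "px; width:100%; text-align:center;'>"
                + label
                + " | Page <span class='pageNumber'></span>"
                + " of <span class='totalPages'></span>"
                + " | <span class='date'></span>"
                + "</div>";
    }

    public static void main(String[] args) {
        System.out.println(pageCounter("My Company", 10));
    }
}
```

The returned string can be passed to setFooterTemplate(...). An explicit font size is set in the template because the default header and footer text can be too small to read.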

All Options from the pdf Method

The page.pdf() method offers a variety of options to customize the PDF output:

  • path: Specifies the file path to save the PDF. If omitted, the PDF will be returned as a byte array.

  • scale: Sets the scale of the webpage rendering (default is 1.0).

  • displayHeaderFooter: When set to true, includes header and footer in the PDF.

  • headerTemplate and footerTemplate: HTML templates for the header and footer. Can include placeholders like <span class='pageNumber'></span>.

  • printBackground: When set to true, prints background graphics.

  • landscape: When set to true, prints the PDF in landscape orientation.

  • pageRanges: Specifies the page ranges to print, e.g., "1-5, 8, 11-13".

  • format: Sets the paper format, such as "A4", "Letter".

  • width and height: Sets the width and height of the paper in units (px, in, cm, mm).

  • margin: Sets margins for the PDF. Accepts a Margin object with top, right, bottom, left properties.

  • preferCSSPageSize: When set to true, uses the @page size defined in CSS.

Example of using multiple options:

// Requires: import com.microsoft.playwright.options.Margin; and import java.nio.file.Paths;
Page.PdfOptions pdfOptions = new Page.PdfOptions()
    .setPath(Paths.get("custom_invoice.pdf"))
    .setFormat("A4")
    .setLandscape(false)
    .setPrintBackground(true)
    .setDisplayHeaderFooter(true)
    .setHeaderTemplate("<div style='font-size:10px; text-align:center;'>Invoice Header</div>")
    .setFooterTemplate("<div style='font-size:10px; text-align:center;'>Page <span class='pageNumber'></span> of <span class='totalPages'></span></div>")
    .setMargin(new Margin().setTop("60px").setBottom("60px").setLeft("20px").setRight("20px"))
    .setScale(1.0)
    .setPageRanges("1-2")
    // A CSS @page size, if the page declares one, takes priority over the format above
    .setPreferCSSPageSize(true);

page.pdf(pdfOptions);

Frequently Asked Questions (FAQ)

  1. How can I embed custom fonts or graphics?

  • Include them within your HTML/CSS. Ensure any external resources (e.g., fonts hosted on a CDN) are accessible when the browser renders the page.

  2. Can I generate multi-page invoices or large PDFs?

  • Yes. Use standard CSS page-break properties (break-before, break-after, break-inside) or the @page CSS rule. Also consider using template engines to paginate data effectively.

  3. Why is my background color missing in the PDF?

  • By default, Chromium omits background colors and images when printing. Pass new Page.PdfOptions().setPrintBackground(true) to page.pdf() and add -webkit-print-color-adjust: exact; to your CSS to preserve backgrounds.

  4. Is it possible to run multiple PDF generations in parallel?

  • Yes. You can open multiple pages or browser contexts in a single browser, or launch multiple browsers. Note that Playwright Java is not thread-safe: objects created from one Playwright instance should be used on the thread that created it, so give each worker thread its own Playwright instance. Each instance consumes resources, so monitor performance and memory usage.

  5. How can I scale my PDF generation with Playwright and Java?

  4. Integrating Template Engines for Dynamic Invoices

To generate dynamic invoices, you can use template engines like FreeMarker or Thymeleaf. Here’s an example using FreeMarker:

import com.microsoft.playwright.*;
import freemarker.template.*;
import java.io.*;
import java.nio.file.Paths;
import java.util.*;

public class PdfWithTemplate {
    public static void main(String[] args) throws Exception {
        // Configure FreeMarker to load templates from /templates on the classpath
        Configuration cfg = new Configuration(Configuration.VERSION_2_3_30);
        cfg.setClassForTemplateLoading(PdfWithTemplate.class, "/templates");
        // Load template
        Template template = cfg.getTemplate("invoice.ftl");
        // Data model
        Map<String, Object> data = new HashMap<>();
        data.put("date", "2023-10-01");
        data.put("invoiceNumber", "INV-1001");
        List<Map<String, String>> items = new ArrayList<>();
        Map<String, String> item1 = new HashMap<>();
        item1.put("description", "Product A");
        item1.put("quantity", "2");
        item1.put("price", "$50");
        item1.put("total", "$100");
        items.add(item1);
        Map<String, String> item2 = new HashMap<>();
        item2.put("description", "Service B");
        item2.put("quantity", "5");
        item2.put("price", "$20");
        item2.put("total", "$100");
        items.add(item2);
        data.put("items", items);
        data.put("grandTotal", "$200");
        // Generate HTML content
        Writer out = new StringWriter();
        template.process(data, out);
        String htmlContent = out.toString();
        // Generate PDF with Playwright
        try (Playwright playwright = Playwright.create()) {
            Browser browser = playwright.chromium().launch();
            Page page = browser.newPage();
            page.setContent(htmlContent);
            // Save PDF locally (setPath expects a java.nio.file.Path)
            page.pdf(new Page.PdfOptions().setPath(Paths.get("invoice.pdf")));
            System.out.println("PDF generated from template and saved locally.");
            // Get PDF as a byte array
            byte[] pdfBytes = page.pdf();
            // Use pdfBytes as needed
        }
    }
}

The invoice.ftl template file, placed in the /templates directory on the classpath:

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Invoice</title>
    <style>
        body { font-family: 'Arial', sans-serif; margin: 20px; }
        h1 { text-align: center; }
        .invoice-details { width: 100%; margin-top: 20px; border-collapse: collapse; }
        .invoice-details th, .invoice-details td { padding: 10px; border: 1px solid #ccc; text-align: left; }
        .total { text-align: right; font-weight: bold; }
    </style>
</head>
<body>
    <h1>Invoice</h1>
    <p>Date: <strong>${date}</strong></p>
    <p>Invoice #: <strong>${invoiceNumber}</strong></p>
    <table class="invoice-details">
        <tr>
            <th>Description</th>
            <th>Quantity</th>
            <th>Price</th>
            <th>Total</th>
        </tr>
        <#list items as item>
        <tr>
            <td>${item.description}</td>
            <td>${item.quantity}</td>
            <td>${item.price}</td>
            <td>${item.total}</td>
        </tr>
        </#list>
        <tr>
            <td colspan="3" class="total">Grand Total</td>
            <td>${grandTotal}</td>
        </tr>
    </table>
</body>
</html>
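
In a real application the line totals and grand total would normally be computed rather than typed in as strings. The sketch below shows one way to build the same data model that invoice.ftl expects from typed line items; InvoiceModel, Line, money, and build are hypothetical names, and the "$" prefix simply mirrors the sample invoice.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical builder for the data model consumed by invoice.ftl.
class InvoiceModel {

    // One invoice line with a quantity and a unit price.
    static final class Line {
        final String description;
        final int quantity;
        final BigDecimal unitPrice;

        Line(String description, int quantity, BigDecimal unitPrice) {
            this.description = description;
            this.quantity = quantity;
            this.unitPrice = unitPrice;
        }

        BigDecimal total() {
            return unitPrice.multiply(BigDecimal.valueOf(quantity));
        }
    }

    // Formats an amount the way the sample invoice shows it, e.g. "$100".
    static String money(BigDecimal amount) {
        return "$" + amount.toPlainString();
    }

    // Builds the map with the keys used by the template: date, invoiceNumber, items, grandTotal.
    static Map<String, Object> build(String date, String invoiceNumber, List<Line> lines) {
        List<Map<String, String>> items = new ArrayList<>();
        BigDecimal grandTotal = BigDecimal.ZERO;
        for (Line line : lines) {
            Map<String, String> item = new HashMap<>();
            item.put("description", line.description);
            item.put("quantity", String.valueOf(line.quantity));
            item.put("price", money(line.unitPrice));
            item.put("total", money(line.total()));
            items.add(item);
            grandTotal = grandTotal.add(line.total());
        }
        Map<String, Object> data = new HashMap<>();
        data.put("date", date);
        data.put("invoiceNumber", invoiceNumber);
        data.put("items", items);
        data.put("grandTotal", money(grandTotal));
        return data;
    }
}
```

The returned map can be passed to template.process(data, out) in place of the hand-built one. BigDecimal is used instead of double to avoid rounding surprises with currency amounts.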

Real-World Use Case: Generating Weekly Customer Invoices at Scale

Imagine you run a SaaS platform that bills customers weekly. Using Playwright Java, you can:

  1. Pull invoice data from your database (e.g., PostgreSQL).

  2. Feed that data into a FreeMarker template for personalized invoices.

  3. Launch Chromium headless in a Docker container, generating PDFs in parallel for each customer.

  4. Automatically email or store the PDFs, ensuring timely delivery and archival.

This approach provides a fully automated pipeline without relying on manual intervention. 
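
The parallel step above can be sketched with a fixed thread pool in which every worker owns its renderer. This matters because Playwright Java is not thread-safe, so a Playwright instance should be created, used, and closed on the same thread. The code below is a minimal sketch of that pattern only: HtmlRenderer is a hypothetical abstraction, and no Playwright calls appear in it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical sketch: renders many invoices in parallel, one renderer per worker thread.
class ParallelInvoiceRenderer {

    // Hypothetical abstraction over "HTML in, PDF bytes out".
    interface HtmlRenderer extends AutoCloseable {
        byte[] render(String html);

        @Override
        void close();
    }

    static Map<String, byte[]> renderAll(Map<String, String> htmlById,
                                         int workers,
                                         Supplier<HtmlRenderer> rendererFactory) {
        List<Map.Entry<String, String>> jobs = new ArrayList<>(htmlById.entrySet());
        Map<String, byte[]> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int offset = w;
                futures.add(pool.submit(() -> {
                    // The renderer is created, used, and closed on this worker thread only.
                    try (HtmlRenderer renderer = rendererFactory.get()) {
                        // Each worker takes every workers-th job, starting at its own offset.
                        for (int i = offset; i < jobs.size(); i += workers) {
                            Map.Entry<String, String> job = jobs.get(i);
                            results.put(job.getKey(), renderer.render(job.getValue()));
                        }
                    }
                }));
            }
            for (Future<?> future : futures) {
                try {
                    future.get(); // Wait for the worker and surface any failure.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

A real HtmlRenderer could call Playwright.create() and launch Chromium when it is constructed, use page.setContent(...) and page.pdf() in render, and close the Playwright instance in close(). Keeping the worker count close to the number of available CPU cores is a reasonable starting point, since each Chromium instance uses a noticeable amount of memory.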

Alternative: Scaling PDF Generation with Third-Party APIs

For larger SaaS platforms with high volumes of PDF requests, integrating a PDF generation API like pdforge can offload the heavy lifting.

With pdforge, you can create beautiful reports with flexible layouts and complex components with an easy-to-use opinionated no-code builder. Let the AI do the heavy lifting by generating your templates, creating custom components or even filling all the variables for you.

You can handle high-volume PDF generation from a single backend call.

Here’s an example of how to generate a PDF with pdforge via an API call:

import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;

public class PdfForgeExample {
    public static void main(String[] args) {
        try {
            URL url = new URL("https://api.pdforge.com/v1/pdf/sync");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Authorization", "Bearer your-api-key");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);

            String jsonInputString = "{ \"templateId\": \"your-template\", \"data\": { \"html\": \"your-html\" } }";

            try(OutputStreamWriter writer = new OutputStreamWriter(conn.getOutputStream())) {
                writer.write(jsonInputString);
                writer.flush();
            }

            int responseCode = conn.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the response and process the PDF
            } else {
                // Handle errors
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
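
One detail worth handling when the request body is built by string concatenation: real HTML contains quotes and newlines, which must be escaped before they are placed inside a JSON string. The sketch below shows a dependency-free way to do that; JsonStrings, quote, and payload are hypothetical names, and the field names simply mirror the example above. A JSON library would normally do this for you.

```java
// Hypothetical helper for building the JSON request body safely without a JSON library.
class JsonStrings {

    // Wraps a value in double quotes and escapes characters that JSON does not allow raw.
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Builds a body with the same shape as the example: templateId plus data.html.
    static String payload(String templateId, String html) {
        return "{ \"templateId\": " + quote(templateId)
                + ", \"data\": { \"html\": " + quote(html) + " } }";
    }

    public static void main(String[] args) {
        System.out.println(payload("your-template", "<p class=\"x\">Hello</p>"));
    }
}
```

The string returned by payload(...) can be written to the connection's output stream in place of the hand-written jsonInputString.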

You can create your account, try our no-code builder, and create your first layout template without any upfront payment by clicking here.

Conclusion

Playwright Java offers a robust solution for converting HTML to PDF, capturing the nuances of modern web content with high fidelity. Its use of real browser engines ensures that your PDFs accurately reflect your HTML designs, making it ideal for generating complex documents like invoices.

However, if your application requires extensive PDF manipulation beyond rendering, traditional libraries like iText or Apache PDFBox might be more suitable, since they are built for editing and annotating existing PDFs.

If you don't want to spend time maintaining PDF layouts and their infrastructure, or keeping track of best practices for generating PDFs at scale, third-party PDF APIs like pdforge can save you hours of work and deliver high-quality PDF layouts.

Generating pdfs at scale can be quite complicated!

We take care of all of this, so you can focus on what truly matters in your product!
