pdf libraries

Java

Simplify PDF Creation from HTML Using Flying Saucer

Marcelo Abreu, founder of pdforge

Marcelo | Founder

Nov 5, 2024

Introduction to Flying Saucer and PDF Generation

Flying Saucer is an open-source Java library that converts XHTML and CSS content into PDF documents with high fidelity. Unlike other PDF libraries that require manual construction of PDF elements, Flying Saucer allows developers to design documents using familiar web technologies, making the process more intuitive and efficient for those with HTML and CSS expertise.

You can check out the full documentation here.

How Flying Saucer Stands Out Among Java PDF Libraries

When it comes to generating PDFs in Java, several libraries are available, such as OpenPDF, iText, Apache PDFBox, and Playwright.

However, Flying Saucer distinguishes itself by rendering XHTML and CSS directly, so the document layout lives in a template rather than in Java code that assembles PDF elements one by one.

  • OpenPDF: Great for basic PDF generation but lacks advanced HTML and CSS rendering.

  • iText: Powerful, but versions 5 and later are AGPL licensed, so closed-source commercial use requires a paid license.

  • Apache PDFBox: Suitable for manipulating existing PDFs but not ideal for HTML to PDF conversion.

  • Playwright: Primarily a browser automation tool; using it for PDF generation can be overkill.

Flying Saucer provides a balanced solution by focusing specifically on converting XHTML and CSS into PDFs, making it an excellent choice for SaaS applications that require dynamic document generation.

Guide to generate pdf from html using Java Flying Saucer

Setting Up Flying Saucer for XHTML to PDF Conversion

Preparing Your HTML Content for Flying Saucer

Flying Saucer requires the input content to be well-formed XHTML to successfully render it into a PDF.

This means your HTML must follow XML rules: elements properly nested, every tag closed, attribute values quoted, and element and attribute names in lowercase. If your existing HTML is not already in XHTML format, you’ll need to convert it before using Flying Saucer.

Here’s how you can convert your HTML to XHTML:

  • Use an HTML to XHTML Converter: Tools like JTidy can parse HTML and output XHTML.

  • Manually Adjust Your HTML Code: Ensure all tags are properly closed, attributes are quoted, and the document is well-structured.

Example of converting HTML to XHTML:

Before (HTML):

<html>
<head>
    <title>Sample</title>
</head>
<body>
    <img src="image.jpg">
    <p>Welcome to our site</p>
</body>
</html>

After (XHTML):

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Sample</title>
</head>
<body>
    <img src="image.jpg" />
    <p>Welcome to our site</p>
</body>
</html>

Ensuring your content is in XHTML format allows the parser to correctly interpret the document structure without encountering errors.
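
A quick way to catch malformed input before it reaches Flying Saucer is to run it through the JDK's built-in XML parser. The sketch below assumes only the standard javax.xml.parsers API; the XhtmlCheck class and isWellFormed method are hypothetical names used for illustration, and the check covers well-formedness only, not XHTML validity or CSS support.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not part of Flying Saucer.
class XhtmlCheck {

    // Returns true when the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Any parse failure means the input is not usable as XHTML.
            return false;
        }
    }
}
```

Applied to the two samples above, the HTML version fails because of the unclosed img tag, while the XHTML version passes.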

Installation and Configuration in Java Projects

To get started with Flying Saucer, add the library to your project. For Maven, include the following dependency in your pom.xml:

<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>flying-saucer-pdf</artifactId>
    <version>9.1.22</version>
</dependency>

For Gradle, add this line to your build.gradle:

implementation 'org.xhtmlrenderer:flying-saucer-pdf:9.1.22'

Essential Dependencies: Setting Up Your Project for Smooth Integration

Flying Saucer relies on a few additional libraries to function correctly. Maven and Gradle normally resolve what is needed from the dependency above, but it helps to know what they are:

  • XML Parser: For parsing XHTML content.

  • iText 2: Used internally by Flying Saucer for PDF creation.

Note: Since Flying Saucer depends on iText version 2, which is LGPL licensed, be mindful of licensing implications if you’re developing a commercial application.

Main Features of Flying Saucer

  • Seamless XHTML and CSS Rendering: Supports well-formed XHTML and CSS 2.1, enabling the creation of complex layouts and styles without compromising on design fidelity.

  • Easy Integration with Java Projects: Offers a straightforward API that integrates seamlessly with existing Java applications.

  • Built on iText 2: Flying Saucer’s own ITextRenderer class (org.xhtmlrenderer.pdf.ITextRenderer) uses iText 2 for the underlying PDF output, combining the strengths of both libraries.

  • Form and Interactive Elements Support: Capable of rendering XHTML forms and interactive elements into PDFs.

  • Open-Source and Extensible: Being open-source allows for customization and extension to meet specific project requirements.

It’s important to note that ITextRenderer is built on the older iText 2 line, which is LGPL licensed. Flying Saucer delegates low-level PDF writing to iText and concentrates on laying out the XHTML and CSS content.

Quick Guide to Rendering XHTML with CSS Support Using Flying Saucer

With the dependencies set, you can start converting XHTML to PDF. Here’s a simple example:

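import java.io.FileOutputStream;
import java.io.OutputStream;
import org.xhtmlrenderer.pdf.ITextRenderer;

public void generatePdf(String xhtmlContent, String outputPath) throws Exception {
    ITextRenderer renderer = new ITextRenderer();
    // Parse the XHTML string; it must be well-formed XML.
    renderer.setDocumentFromString(xhtmlContent);
    // Compute the page layout before writing any output.
    renderer.layout();
    // try-with-resources closes the stream even if rendering fails.
    try (OutputStream os = new FileOutputStream(outputPath)) {
        renderer.createPDF(os);
    }
}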

This code initializes the ITextRenderer, sets the XHTML content, lays out the document, and writes the PDF to the specified path.

Step-by-Step Guide to Creating PDF from XHTML

Let’s create a sample invoice to demonstrate. First, design an XHTML template (invoice.xhtml):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Invoice</title>
    <style type="text/css">
        body { font-family: Arial, sans-serif; }
        .invoice-box { width: 800px; margin: auto; padding: 30px; }
        .header { text-align: center; font-size: 24px; }
        .details { margin-top: 20px; }
        .items { width: 100%; margin-top: 20px; border-collapse: collapse; }
        .items th, .items td { border: 1px solid #ddd; padding: 8px; }
        .items th { background-color: #f2f2f2; }
        .total { text-align: right; margin-top: 20px; }
    </style>
</head>
<body>
    <div class="invoice-box">
        <div class="header">Invoice</div>
        <div class="details">
            <p><strong>Bill To:</strong> John Doe</p>
            <p><strong>Date:</strong> 2023-10-01</p>
        </div>
        <table class="items">
            <tr>
                <th>Description</th><th>Quantity</th><th>Price</th>
            </tr>
            <tr>
                <td>Product A</td><td>2</td><td>$50</td>
            </tr>
            <tr>
                <td>Product B</td><td>1</td><td>$30</td>
            </tr>
        </table>
        <div class="total">
            <p><strong>Total:</strong> $130</p>
        </div>
    </div>
</body>
</html>
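
The template above hard-codes its values. When they come from application data, a stray & or < in a customer name would break well-formedness, so values need XML escaping before they are substituted. The sketch below assumes a hypothetical {{name}} placeholder syntax and a hypothetical InvoiceTemplate helper; neither is part of Flying Saucer, and a template engine such as Thymeleaf or FreeMarker is the more common choice in practice.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fills {{key}} placeholders with XML-escaped values.
class InvoiceTemplate {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Escapes the five characters that are special in XML text and attributes.
    static String escapeXml(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Replaces each known placeholder in a single pass; unknown ones are left as-is.
    static String fill(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            String replacement = (value == null) ? matcher.group() : escapeXml(value);
            // quoteReplacement keeps "$" in prices from being read as a group reference.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With this in place, a template line such as <p><strong>Bill To:</strong> {{customer}}</p> stays well-formed even when the customer name contains an ampersand.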

Converting XHTML Files to PDF

Building on the earlier Java method, read the XHTML file into a string and generate the PDF:

import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.xhtmlrenderer.pdf.ITextRenderer;

public void generatePdfFromXhtmlFile(String inputXhtmlPath, String outputPdfPath) throws Exception {
    // Read the template as UTF-8, matching the encoding declared in its XML prolog.
    String xhtmlContent = new String(Files.readAllBytes(Paths.get(inputXhtmlPath)), StandardCharsets.UTF_8);
    ITextRenderer renderer = new ITextRenderer();
    renderer.setDocumentFromString(xhtmlContent);
    renderer.layout();
    // try-with-resources closes the stream even if rendering fails.
    try (OutputStream os = new FileOutputStream(outputPdfPath)) {
        renderer.createPDF(os);
    }
}

Invoke the method:

generatePdfFromXhtmlFile("invoice.xhtml", "invoice.pdf");
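
After running the method, a cheap sanity check is to confirm that the output starts with the %PDF- signature that PDF files begin with. The sketch below is a hypothetical helper, not part of Flying Saucer, and it only catches gross failures such as an empty file or text written in place of a PDF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for a smoke test on generated output.
class PdfSmokeTest {

    private static final byte[] SIGNATURE = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true when the data starts with the PDF header signature.
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < SIGNATURE.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, SIGNATURE.length), SIGNATURE);
    }
}
```

For the invoice above, this could be called as PdfSmokeTest.looksLikePdf(Files.readAllBytes(Paths.get("invoice.pdf"))).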

Advanced PDF Features: Page Numbers, Headers, and Footers

To add headers, footers, or page numbers, modify your XHTML template:

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<head>
    <title>Invoice with Header and Footer</title>
    <style type="text/css">
        @page {
            @top-center {
                content: element(header);
            }
            @bottom-center {
                content: element(footer);
            }
        }
        .header { position: running(header); text-align: center; font-size: 18px; }
        .footer { position: running(footer); text-align: center; font-size: 12px; }
        .page-number:before { content: counter(page); }
    </style>
</head>
<body>
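    <!-- Running elements come first so they are available from page one;
         a running footer placed after the content would only reach the last page. -->
    <div class="header">
        <p>Company Name</p>
    </div>
    <div class="footer">
        <p>Page <span class="page-number"></span></p>
    </div>
    <!-- Main content -->
    <div class="invoice-box">
        <!-- Invoice details -->
    </div>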
</body>
</html>

Flying Saucer recognizes @page margin boxes such as @top-center and @bottom-center, together with running elements, so a header and footer defined once in the template can be repeated in the page margins.

How to Use a PDF API to Automate PDF Creation at Scale

For SaaS platforms, automating PDF generation at scale might require offloading the heavy lifting to a PDF API.

By integrating with a third-party API such as pdforge, you can handle high-volume PDF generation, complex formatting, and post-processing, all from a single backend call.

Here’s an example of how to integrate pdforge in Java to convert HTML content into a PDF via an API call:

import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PdfForgeExample {
    public static void main(String[] args) {
        try {
            URL url = new URL("https://api.pdforge.com/v1/pdf/sync");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Authorization", "Bearer your-api-key");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);

            String jsonInputString = "{ \"templateId\": \"your-template\", \"data\": { \"html\": \"your-html\" } }";

            // Encode the body as UTF-8 explicitly rather than relying on the platform default.
            try (OutputStreamWriter writer = new OutputStreamWriter(conn.getOutputStream(), StandardCharsets.UTF_8)) {
                writer.write(jsonInputString);
                writer.flush();
            }

            int responseCode = conn.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the response and process the PDF
            } else {
                // Handle errors
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

This code sends a POST request to the pdforge API and checks the response code. Reading the response body and saving the PDF is left as a placeholder in the success branch.
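
The your-html placeholder in the request body hides one practical detail: real markup contains quotes and line breaks that have to be escaped before it can sit inside a JSON string. A minimal sketch using only the standard library follows. The JsonBody class and its method names are hypothetical, and the templateId and data.html fields simply mirror the example above rather than a confirmed pdforge schema; in production a JSON library such as Jackson or Gson would be the more usual choice.

```java
// Hypothetical helper: builds the JSON request body with proper string escaping.
class JsonBody {

    // Wraps a Java string in double quotes and escapes it as a JSON string.
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become numeric unicode escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    // Field names follow the example request above; they are assumptions, not a documented schema.
    static String buildBody(String templateId, String html) {
        return "{\"templateId\":" + quote(templateId)
                + ",\"data\":{\"html\":" + quote(html) + "}}";
    }
}
```

The result of JsonBody.buildBody("your-template", xhtmlContent) can then replace the hand-written jsonInputString.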

Conclusion

Flying Saucer simplifies the process of converting XHTML and CSS into PDFs, making it an excellent choice for Java developers who prefer designing documents using web technologies. Its ability to render complex layouts with minimal effort sets it apart from other libraries.

However, if your project requires advanced PDF manipulation or you face licensing constraints, alternatives like OpenPDF or Apache PDFBox might be more suitable. These libraries offer different features that could align better with specific project needs.

For SaaS applications demanding high scalability and minimal maintenance, leveraging third-party PDF APIs like pdforge can be a pragmatic solution. These services handle the heavy lifting of PDF generation, allowing you to focus on core application functionality.

Generating PDFs at scale can be quite complicated!

We take care of all of this, so you can focus on what truly matters in your product!