pdf libraries

Javascript

How to Generate PDF from HTML Using PDFKit in Node.js

Marcelo Abreu, founder of pdforge

Marcelo | Founder

Oct 7, 2024

Introduction to PDFKit: A Reliable PDF Library for Node.js

PDFKit is a powerful open-source library for generating PDF documents in Node.js. It builds documents programmatically through a text and drawing API, which gives developers precise control over layout. It supports embedding images and custom fonts, making it well suited to complex document layouts. PDFKit does not parse HTML or CSS itself, so HTML content has to be converted by another tool or mapped onto PDFKit calls.

You can check out the full documentation here.

Alternative PDF Libraries: How PDFKit Compares to Other Tools

While PDFKit is a great tool, it's important to be aware of alternatives.

Figure: Monthly downloads of JavaScript PDF generation libraries

Libraries like Playwright or Puppeteer are widely used for rendering HTML into PDFs via headless browser instances. These tools offer greater accuracy in rendering CSS and are excellent for visually rich documents but come at the cost of increased resource usage. In contrast, PDFKit’s lightweight nature makes it a good fit for projects where performance and simplicity are key, especially for backend services or API-based PDF generation. Another good alternative to PDFKit with the same characteristics is jsPDF.

Setting Up PDFKit in Node.js

Before diving into generating PDFs, let’s get PDFKit up and running in your Node.js environment. Below is a step-by-step guide to install and configure the library.

Installing PDFKit: Step-by-Step Guide for Node.js

To get started, initialize your Node.js project and install PDFKit using npm:

npm init -y
npm install pdfkit

Once installed, require the library in your project:

const PDFDocument = require('pdfkit');
const fs = require('fs');

Creating a Basic PDF Document with PDFKit

To create a basic PDF, instantiate a new `PDFDocument` and pipe the output to a file:

const doc = new PDFDocument();
doc.pipe(fs.createWriteStream('output.pdf'));
doc.text('Hello, PDFKit!');
doc.end();

This will generate a simple PDF containing the text "Hello, PDFKit!". However, the true power of PDFKit comes into play when you start generating more complex documents, especially when rendering HTML content.
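In a backend service you usually need to know when the file has actually been written before responding. The following is a minimal sketch of one way to do that, assuming `doc` is a PDFKit `PDFDocument` (a readable stream). The helper name `savePdf` is hypothetical and not part of PDFKit.

```javascript
const fs = require('fs');

// Pipes a PDFKit document to a file and resolves once the file is fully written.
// `doc` is expected to be a PDFDocument instance (a readable stream);
// savePdf is a hypothetical helper name, not part of PDFKit.
function savePdf(doc, filePath) {
  return new Promise((resolve, reject) => {
    const stream = fs.createWriteStream(filePath);
    stream.on('finish', () => resolve(filePath));
    stream.on('error', reject);
    doc.pipe(stream);
    doc.end(); // finalizes the document; 'finish' fires after the last bytes are flushed
  });
}
```

For example, after `const doc = new PDFDocument(); doc.text('Hello, PDFKit!');`, calling `savePdf(doc, 'output.pdf')` returns a promise you can await before sending the file to a client.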

Structuring Your HTML for Seamless PDF Generation

Structuring HTML effectively is essential for clean conversion to PDF. Keep the layout simple and predictable: a header block, a details block, and a plain table translate well whether the markup is rendered by an HTML-to-PDF converter or mapped by hand onto PDFKit calls. A typical structure for an invoice might look like this:

<html>
<head>
  <style>
    body { font-family: 'Helvetica'; }
    .invoice-header { text-align: center; margin-bottom: 20px; }
    .invoice-details { width: 100%; margin-bottom: 20px; }
    .item-table { width: 100%; border-collapse: collapse; }
    .item-table th, .item-table td { border: 1px solid #ddd; padding: 8px; }
  </style>
</head>
<body>
  <div class="invoice-header">
    <h1>Invoice #12345</h1>
    <p>Date: 2024-10-01</p>
  </div>
  <div class="invoice-details">
    <p>Customer: John Doe</p>
    <p>Address: 1234 Street Name, City, State</p>
  </div>
  <table class="item-table">
    <thead>
      <tr>
        <th>Item</th>
        <th>Quantity</th>
        <th>Price</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>Product 1</td>
        <td>2</td>
        <td>$50</td>
      </tr>
      <tr>
        <td>Product 2</td>
        <td>1</td>
        <td>$30</td>
      </tr>
    </tbody>
  </table>
</body>
</html>

Converting HTML to PDF with PDFKit: A Practical Guide

Now that we’ve structured our HTML, we can move on to converting it to a PDF document using PDFKit.

Rendering HTML in PDF Format: Key Techniques and Best Practices

PDFKit doesn’t directly convert HTML to PDF like some tools, but it allows you to build the PDF programmatically. To achieve an HTML-to-PDF workflow, you can either map the content onto PDFKit calls yourself or hand the HTML to a separate converter. The snippet below follows the `generatePdf` call shape of the **html-pdf-node** package, which renders the HTML in a headless browser rather than through PDFKit:

// Assumes the html-pdf-node package is installed (npm install html-pdf-node)
const fs = require('fs');
const htmlToPdf = require('html-pdf-node');
htmlToPdf.generatePdf({ content: 'HTML_CONTENT_HERE' }, { format: 'A4' })
    .then(pdfBuffer => fs.writeFileSync('invoice.pdf', pdfBuffer))
    .catch(err => console.error('PDF generation failed:', err));
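Because PDFKit builds documents from drawing calls rather than from markup, the item table from the invoice HTML can also be laid out directly. The sketch below is one possible approach, not a built-in PDFKit feature: the function name `drawItemTable`, the column widths, and the row height are illustrative assumptions, and `doc` is assumed to be a PDFKit `PDFDocument`.

```javascript
// Draws a simple bordered table using PDFKit's rect() and text() calls.
// `doc` is a PDFKit PDFDocument; drawItemTable, the column widths, and the
// row height are illustrative choices, not part of PDFKit.
function drawItemTable(doc, rows, x = 50, y = 150) {
  const colWidths = [250, 100, 100]; // Item, Quantity, Price
  const rowHeight = 24;
  const padding = 8;
  const header = ['Item', 'Quantity', 'Price'];

  [header, ...rows].forEach((cells, rowIndex) => {
    let cellX = x;
    const cellY = y + rowIndex * rowHeight;
    cells.forEach((cell, colIndex) => {
      const width = colWidths[colIndex];
      // Cell border, matching the 1px #ddd border from the HTML version
      doc.rect(cellX, cellY, width, rowHeight).stroke('#dddddd');
      // Cell text, constrained to the cell width minus padding
      doc.text(String(cell), cellX + padding, cellY + 7, { width: width - 2 * padding });
      cellX += width;
    });
  });

  // Return the y coordinate just below the table so callers can continue the layout.
  return y + (rows.length + 1) * rowHeight;
}
```

A call such as `drawItemTable(doc, [['Product 1', 2, '$50'], ['Product 2', 1, '$30']])` reproduces the two rows from the invoice markup. Long cell values will wrap and may overflow the fixed row height, so a production version would need to measure text first.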

Handling CSS and Media Queries in HTML to PDF Conversion

PDFKit itself does not interpret CSS, so how well your styles survive depends on the tool that converts the HTML. Media queries and advanced layout features may not always behave as expected. Keep CSS minimal and use PDF-friendly layouts, and avoid complex flexbox or grid layouts that may not render well outside of a browser context.

Embedding Images and Fonts with PDFKit

PDFKit supports embedding custom fonts and images to enhance the PDF layout:

doc.font('fonts/CustomFont.ttf')
   .text('Custom font text', { align: 'center' });
doc.image('path/to/image.jpg', {
   fit: [250, 300],
   align: 'center',
   valign: 'center'
});
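When the same font is used in many places, PDFKit's `registerFont` lets you refer to it by a short alias. The sketch below registers a font if the file exists and otherwise falls back to the built-in Helvetica. The helper name `useHeadingFont` and the alias `Heading` are assumptions made for illustration.

```javascript
const fs = require('fs');

// Registers a custom font under a short alias, falling back to the built-in
// Helvetica when the font file is missing. `doc` is a PDFKit PDFDocument;
// useHeadingFont and the 'Heading' alias are hypothetical names.
function useHeadingFont(doc, fontPath) {
  if (fs.existsSync(fontPath)) {
    doc.registerFont('Heading', fontPath);
    doc.font('Heading');
    return 'Heading';
  }
  doc.font('Helvetica'); // one of the standard fonts PDFKit ships with
  return 'Helvetica';
}
```

Calling `useHeadingFont(doc, 'fonts/CustomFont.ttf')` before writing a title keeps document generation from failing when the font file has not been deployed.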

Customizing Page Layouts: Headers, Footers, and Margins in PDFKit

You can also customize headers, footers, and margins for each page in your PDF:

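// Margins are set when the document is created, e.g. new PDFDocument({ margin: 50 })
doc.text('Header Text', 50, 50);
doc.text('Main Content', 50, 100);
// Text placed below the bottom margin starts a new page, so relax the margin for the footer
doc.page.margins.bottom = 0;
doc.text('Footer Text', 50, 750);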
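To repeat a header and footer on every page, one common pattern is to buffer pages and stamp them once the content is complete. The sketch below assumes the document was created with `new PDFDocument({ bufferPages: true })`. The helper name `addHeaderAndFooter` and the vertical offsets (20 and 30 points) are illustrative assumptions.

```javascript
// Stamps a header and a "Page X of Y" footer on every buffered page.
// Assumes `doc` was created with new PDFDocument({ bufferPages: true }).
// addHeaderAndFooter and the 20/30 point offsets are illustrative choices.
function addHeaderAndFooter(doc, headerText) {
  const range = doc.bufferedPageRange(); // { start, count }
  for (let i = range.start; i < range.start + range.count; i++) {
    doc.switchToPage(i);
    const { width, height, margins } = doc.page;
    const textWidth = width - margins.left - margins.right;

    // Writing below the bottom margin would trigger an automatic page break,
    // so the margin is temporarily set to zero and restored afterwards.
    const savedBottom = margins.bottom;
    margins.bottom = 0;
    doc.text(headerText, margins.left, 20, { width: textWidth, align: 'center' });
    doc.text(`Page ${i - range.start + 1} of ${range.count}`, margins.left, height - 30, {
      width: textWidth,
      align: 'center',
    });
    margins.bottom = savedBottom;
  }
  return range.count;
}
```

Call it after all content has been added and before `doc.end()`, so that the total page count is already known.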

PDF Security and Encryption

PDFKit enables you to encrypt PDFs, adding an extra layer of security to sensitive documents:

// Encryption options are passed when the document is created
const doc = new PDFDocument({
  userPassword: 'user_password',
  ownerPassword: 'owner_password',
  permissions: {
    printing: 'highResolution',
    modifying: false,
    copying: false,
  }
});

How to Handle Large HTML Files and Pagination in Node.js

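When dealing with large HTML files, pagination becomes a key consideration. PDFKit adds a new page automatically when flowing text reaches the bottom margin, but it does not know where your sections begin and end. To keep a section from being split, start a new page explicitly with `addPage()`: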

doc.addPage().text('Continued Content Here');
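Deciding when to call `addPage()` can be automated by measuring the content first. The following sketch uses PDFKit's `heightOfString`, `doc.y`, and `doc.page` to start a new page only when a section would not fit in the remaining space. The helper names `ensureSpace` and `writeSection` are hypothetical.

```javascript
// Starts a new page when less than `neededHeight` points remain above the bottom margin.
// `doc` is a PDFKit PDFDocument; ensureSpace is a hypothetical helper name.
function ensureSpace(doc, neededHeight) {
  const bottomLimit = doc.page.height - doc.page.margins.bottom;
  if (doc.y + neededHeight > bottomLimit) {
    doc.addPage();
    return true;
  }
  return false;
}

// Writes a title and body, moving both to a fresh page if they would not fit together.
// Sections taller than a full page will still flow onto further pages automatically.
function writeSection(doc, title, body) {
  const needed = doc.heightOfString(title) + doc.heightOfString(body);
  ensureSpace(doc, needed);
  doc.text(title).text(body);
}
```

This keeps short sections, such as an invoice totals block, from being separated from their heading.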

How to Use a PDF API to Automate PDF Creation at Scale

For SaaS platforms that need automated PDF generation at scale, integrating a PDF generation API like pdforge can offload the heavy lifting. This approach suits products that handle high volumes of PDF requests.

fetch('https://api.pdforge.com/v1/pdf/sync', {
  method: 'POST',
  body: JSON.stringify({ templateId: 'your-template' , data: {html:'your-html' } }),
  headers: { 'Authorization' : 'Bearer your-api-key' }
});

Conclusion

PDFKit shines when flexibility and programmatic control are needed in PDF generation. It’s perfect for backend services or Node.js-based SaaS products. However, if CSS fidelity or browser-like rendering is crucial, tools like Playwright or Puppeteer might be a better fit. For massive PDF workloads, third-party PDF APIs like pdforge can streamline the process without managing complex infrastructure. Ultimately, choosing the right tool depends on your specific requirements and scalability needs.
