When building AI workflows that involve PDFs, one common approach is to render PDF pages to images in the browser and send them to a server for processing. It’s fast, lightweight, and works well with web-based tools. But as we recently discovered, even the smallest rendering differences between browsers can have outsized effects downstream.

In this post, we’ll walk through a real-world debugging session involving cross-browser inconsistencies in canvas rendering—and how we solved it by designing around Chrome’s behavior.

The Setup

We had a client-side helper function that used pdf.js to render a PDF page to a canvas, convert that canvas to a PNG, and send it to an API server for processing. The code looked something like this:

// window.pdf holds the PDFDocumentProxy loaded earlier via pdf.js
const page = await window.pdf.getPage(pageNumber);
const viewport = page.getViewport({ scale });

const canvas = document.createElement("canvas");
const context = canvas.getContext("2d");
canvas.width = viewport.width;
canvas.height = viewport.height;

// Rasterize the page into the canvas, then upload the PNG
await page.render({ canvasContext: context, viewport }).promise;
canvas.toBlob(blob => {
  const formData = new FormData();
  formData.append("image", blob, "page.png");
  fetch("/api/analyze", { method: "POST", body: formData });
}, "image/png");

This worked flawlessly on Chrome, but produced different results when run in Safari and Edge.

The Symptom

The server ran a computer vision pipeline to detect shapes in the image, such as circles indicating key instrumentation or equipment elements. On images rendered by Safari and Edge, the system started detecting extra circles around existing ones, introducing false positives.

We were puzzled at first. The images looked identical to the naked eye. File formats, dimensions, and resolutions were all the same. And yet, the API server behaved differently.

The Investigation

We started logging key indicators (a sketch of the instrumentation follows the list):

  • File size: Chrome requests were slightly larger than Safari’s
  • Canvas dimensions: identical
  • PNG metadata: no obvious differences
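
For reference, here is a minimal sketch of that logging, wrapped around the upload (illustrative only, not our exact instrumentation):

canvas.toBlob(blob => {
  // The indicators we compared across browsers
  console.log("PNG size (bytes):", blob.size);
  console.log("canvas size:", `${canvas.width}x${canvas.height}`);
  console.log("browser:", navigator.userAgent);

  const formData = new FormData();
  formData.append("image", blob, "page.png");
  fetch("/api/analyze", { method: "POST", body: formData });
}, "image/png");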

To go deeper, we saved the rendered images from each browser and compared them using ImageMagick:

compare -metric AE safari.png chrome.png diff.png

The output showed over 499,000 differing pixels, even though the images looked the same.
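
If you prefer to quantify the difference in code, a library like pixelmatch can produce the same kind of count; here is a minimal Node.js sketch (the file names and option settings are our assumptions):

// npm install pngjs pixelmatch
import fs from "node:fs";
import { PNG } from "pngjs";
import pixelmatch from "pixelmatch";

const chrome = PNG.sync.read(fs.readFileSync("chrome.png"));
const safari = PNG.sync.read(fs.readFileSync("safari.png"));
const diff = new PNG({ width: chrome.width, height: chrome.height });

// threshold: 0 counts any per-pixel delta; includeAA: true also counts
// anti-aliased pixels, roughly matching ImageMagick's AE metric
const mismatched = pixelmatch(
  chrome.data, safari.data, diff.data,
  chrome.width, chrome.height,
  { threshold: 0, includeAA: true }
);

console.log(`${mismatched} differing pixels`);
fs.writeFileSync("diff.png", PNG.sync.write(diff));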

The root cause was a set of tiny pixel-level shifts introduced by:

  • Different anti-aliasing and font rendering engines (Blink vs WebKit)
  • Subtle variations in canvas API implementations
  • Possibly different default color profiles (e.g., Display P3 vs sRGB)

These shifts weren’t perceptible to a human viewer, but were enough to throw off the shape detection algorithm.
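
You can reproduce this kind of drift without a PDF at all. The sketch below (hypothetical, not from our codebase) draws a single anti-aliased circle and hashes the raw pixel buffer; running it in Chrome and Safari typically prints different digests:

async function canvasFingerprint() {
  const canvas = document.createElement("canvas");
  canvas.width = 200;
  canvas.height = 200;

  const ctx = canvas.getContext("2d");
  ctx.beginPath();
  ctx.arc(100, 100, 80, 0, 2 * Math.PI); // anti-aliased edge
  ctx.stroke();

  // Hash the raw RGBA buffer: engines that rasterize differently yield different digests
  const pixels = ctx.getImageData(0, 0, 200, 200).data;
  const digest = await crypto.subtle.digest("SHA-256", pixels);
  return [...new Uint8Array(digest)]
    .map(b => b.toString(16).padStart(2, "0"))
    .join("");
}

canvasFingerprint().then(hash => console.log(hash));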

The Resolution

Rather than complicate the pipeline with browser-specific fixes, we took a practical path:

We standardized rendering around Chrome

Since the app would eventually run inside Electron (which uses Chromium), aligning with Chrome meant we’d have a consistent rendering engine on both web and desktop.
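
On the web side, a pragmatic complement (our own sketch, not a complete solution) is to detect non-Chromium browsers up front and warn before anything is uploaded:

function isChromiumBased() {
  // Prefer User-Agent Client Hints, currently exposed by Chromium-based browsers
  const brands = navigator.userAgentData?.brands ?? [];
  if (brands.length > 0) {
    return brands.some(b => b.brand === "Chromium");
  }
  // Fallback: Safari lacks a "Chrome/" token; legacy Edge carries "Edge/"
  const ua = navigator.userAgent;
  return ua.includes("Chrome/") && !ua.includes("Edge/");
}

if (!isChromiumBased()) {
  console.warn("Rendering may differ from the reference (Chromium) output.");
}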

Other measures, consolidated in the sketch after this list, included:

  • Explicitly disabling image smoothing:

    context.imageSmoothingEnabled = false;
    context.imageSmoothingQuality = "low";

  • Switching to canvas.toBlob() instead of toDataURL() to avoid base64 encoding inconsistencies
  • Verifying canvas resolution and scale settings
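
Putting those measures together, the hardened helper looked roughly like this (a simplified sketch; the function name and integer-rounding detail are ours, not verbatim production code):

async function renderPageToBlob(pdf, pageNumber, scale) {
  const page = await pdf.getPage(pageNumber);
  const viewport = page.getViewport({ scale });

  const canvas = document.createElement("canvas");
  // Integer pixel dimensions avoid fractional-size rounding differences
  canvas.width = Math.floor(viewport.width);
  canvas.height = Math.floor(viewport.height);

  const context = canvas.getContext("2d");
  context.imageSmoothingEnabled = false;
  context.imageSmoothingQuality = "low";

  await page.render({ canvasContext: context, viewport }).promise;

  // toBlob skips the base64 round-trip that toDataURL would add
  return new Promise((resolve, reject) => {
    canvas.toBlob(
      blob => (blob ? resolve(blob) : reject(new Error("canvas export failed"))),
      "image/png"
    );
  });
}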


Key Lessons

This small bug carried important lessons for image-processing workflows:

  • Canvas rendering is not pixel-identical across browsers, even with the same code and dimensions.
  • Tiny visual shifts can have huge impact on downstream tasks like OCR or shape detection.
  • Standardization beats normalization: Where possible, choose a single rendering engine and stick to it.
  • Electron alignment makes sense for hybrid desktop-web deployments.

Who This Affects

If you’re building any of the following:

  • AI tools that work off PDF images
  • Browser-based document annotation platforms
  • Shape detection, OCR, or CV pipelines involving user-generated images

…then beware of cross-browser rendering differences. Consistency isn’t just a UI concern—it’s foundational for reliable automation.


How We Can Help

At Storm Consulting, we specialize in robust, engineering-grade automation workflows. Whether you’re rendering technical diagrams, digitizing scanned documents, or building AI-powered document pipelines, we help ensure your stack is accurate, stable, and production-ready.

Want to build something similar? Contact us — we’d love to hear from you.
