PDF Compression Methods
Technical reference: structural optimization vs image recompression vs full-page rasterization.
Overview
PulpPDF uses three compression strategies, applied depending on the selected preset.
Structural Optimization
Reorganizes the internal PDF structure without touching visual content.
| Operation | Effect |
|---|---|
| Object stream generation | Groups small objects into compressed streams |
| Stream compression | Applies zlib/deflate to all streams |
| Unreferenced object removal | Strips orphaned objects |
| PDF version normalization | Forces PDF 1.7 output |
Used by: All presets except None.
Typical savings: 5-20% of file size, depending on how well the original PDF was already optimized.
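Two of the operations in the table above can be illustrated with the Python standard library alone. The sketch below is not PulpPDF's implementation: the function names and the simplified object model (a dict mapping object ids to the ids they reference) are assumptions made for illustration. It shows zlib compression of a stream body, which is the data format the FlateDecode filter expects, and a reachability walk that drops objects no longer referenced from the document root.

```python
import zlib


def deflate_stream(raw: bytes, level: int = 9) -> bytes:
    """Compress a stream body with zlib (the format used by FlateDecode)."""
    return zlib.compress(raw, level)


def reachable_objects(objects: dict[int, list[int]], root: int) -> set[int]:
    """Return the ids of all objects reachable from `root`.

    `objects` is a simplified, hypothetical model: each object id maps to
    the list of object ids it references.
    """
    seen: set[int] = set()
    stack = [root]
    while stack:
        obj_id = stack.pop()
        if obj_id in seen or obj_id not in objects:
            continue
        seen.add(obj_id)
        stack.extend(objects[obj_id])
    return seen


def strip_orphans(objects: dict[int, list[int]], root: int) -> dict[int, list[int]]:
    """Keep only the objects that are reachable from `root`."""
    keep = reachable_objects(objects, root)
    return {obj_id: refs for obj_id, refs in objects.items() if obj_id in keep}
```

Note that an object referencing only itself, or referenced only by other orphans, is still removed, because the walk starts from the root rather than counting references.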
Image Recompression
Decodes embedded raster images, optionally downscales them, and re-encodes them as JPEG.
Supported input formats
| Filter | Format |
|---|---|
| DCTDecode | JPEG |
| FlateDecode | Raw pixel data, zlib-compressed (PNG-like) |
Supported color spaces
- DeviceRGB
- DeviceGray
- DeviceCMYK (converted to RGB before encoding)
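The CMYK-to-RGB step can be sketched with the common naive formula below. This is an assumption about the kind of conversion involved, not a statement of how PulpPDF does it: a colour-managed conversion through ICC profiles would give different results, and the function name is hypothetical. Inputs are 8-bit ink values where 0 means no ink and 255 means full ink.

```python
def cmyk_to_rgb(c: int, m: int, y: int, k: int) -> tuple[int, int, int]:
    """Naive 8-bit CMYK -> RGB conversion without colour management.

    Each ink value is in 0..255, where 0 is no ink and 255 is full ink.
    """
    def channel(ink: int) -> int:
        # Light left after this ink, darkened further by the black channel.
        return round(255 * (1 - ink / 255) * (1 - k / 255))

    return channel(c), channel(m), channel(y)
```

For example, `cmyk_to_rgb(255, 0, 0, 0)` yields pure cyan in RGB, `(0, 255, 255)`.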
Process
- Decode image from PDF stream
- If the image's resolution exceeds the preset's DPI cap, downscale it using Lanczos3 interpolation
- Encode as JPEG at the preset's quality level
- Compare sizes: replace the original only if the new image is smaller
Used by (JPEG quality / DPI cap): High Quality (85% / 300 DPI), Balanced (60% / 150 DPI), Maximum (35% / 72 DPI).
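The following sketch shows one way the DPI cap could translate into a target pixel size. It assumes that an image's effective resolution is its pixel size divided by the size at which it is drawn on the page (in points, 72 per inch); the document does not say how PulpPDF measures this. The `Preset` class, the dictionary keys, and `target_size` are hypothetical names; only the quality and DPI numbers come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    jpeg_quality: int  # JPEG quality, 0-100
    dpi_cap: int       # highest effective resolution that is kept


# Numbers from the preset list above; the key names are illustrative.
PRESETS = {
    "high_quality": Preset(jpeg_quality=85, dpi_cap=300),
    "balanced": Preset(jpeg_quality=60, dpi_cap=150),
    "maximum": Preset(jpeg_quality=35, dpi_cap=72),
}


def target_size(px_w: int, px_h: int,
                shown_w_pt: float, shown_h_pt: float,
                dpi_cap: int) -> tuple[int, int]:
    """Return the pixel size after applying the DPI cap.

    The size is unchanged when the image is already within the cap.
    """
    # Effective DPI = pixels per inch at the displayed size (72 pt = 1 inch).
    dpi = max(px_w / (shown_w_pt / 72), px_h / (shown_h_pt / 72))
    if dpi <= dpi_cap:
        return px_w, px_h
    scale = dpi_cap / dpi
    return max(1, round(px_w * scale)), max(1, round(px_h * scale))
```

A 3000x2000 pixel image drawn at 360x240 pt (5 x 3.33 inches) has an effective resolution of 600 DPI, so under a 150 DPI cap it would be scaled by 0.25 to 750x500 pixels before JPEG encoding.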
Skipped images
- BitsPerComponent < 8
- Dimensions < 4x4 pixels
- Images that would grow after recompression
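The skip rules above amount to a small predicate plus a size comparison. In the sketch below the function names are hypothetical, and "Dimensions < 4x4" is read as "either side shorter than 4 pixels", which is an assumption about the intended meaning.

```python
def should_recompress(bits_per_component: int, width: int, height: int) -> bool:
    """Return False for images the recompression step leaves untouched."""
    if bits_per_component < 8:
        # Low bit-depth images (for example 1-bit scans) are skipped.
        return False
    if width < 4 or height < 4:
        # Tiny images are skipped (assumed: either side below 4 pixels).
        return False
    return True


def choose_stream(original: bytes, recompressed: bytes) -> bytes:
    """Keep the recompressed data only when it is strictly smaller."""
    return recompressed if len(recompressed) < len(original) else original
```

The strict comparison in `choose_stream` means an image that comes out the same size is also left alone, which avoids a needless generation of JPEG loss.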
Full-Page Rasterization
Renders each page as a bitmap and builds a new PDF from JPEG images.
Process
- Render each page at target DPI as an RGBA bitmap
- Convert RGBA to RGB
- Encode as JPEG at 40% quality
- Build new PDF with one image per page
- Run structural optimization pass
Used by: Ultra preset only.
Trade-off: Maximum compression, but text is no longer selectable or searchable (run OCR afterwards to restore searchability).
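Two of the rasterization steps reduce to simple arithmetic, sketched below in pure Python. The function names are hypothetical, and compositing onto a white background is an assumption: the text only says that RGBA is converted to RGB, not how transparency is handled.

```python
def page_pixel_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Bitmap size for a page measured in points (72 points per inch)."""
    return round(width_pt / 72 * dpi), round(height_pt / 72 * dpi)


def rgba_to_rgb(rgba: bytes, background: tuple[int, int, int] = (255, 255, 255)) -> bytes:
    """Flatten 8-bit RGBA pixels onto an opaque background colour.

    White is assumed as the default because it matches blank paper.
    """
    if len(rgba) % 4:
        raise ValueError("RGBA data length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(rgba), 4):
        alpha = rgba[i + 3]
        for value, bg in zip(rgba[i:i + 3], background):
            # Integer alpha blend, rounded to the nearest value.
            out.append((value * alpha + bg * (255 - alpha) + 127) // 255)
    return bytes(out)
```

A US Letter page (612x792 pt) rendered at 150 DPI gives a 1275x1650 pixel bitmap, so the pixel count, and with it the JPEG size, grows with the square of the target DPI.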
Two-Pass Pipeline
The standard presets wrap image recompression between two structural passes:
- Flatten pass: Unpack object streams for image access
- Image recompression: Iterate image objects, decode, resize, re-encode
- Optimization pass: Repack with object streams and stream compression
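The ordering of these steps can be sketched with a toy document model. Everything here is hypothetical: `Document`, its `packed` flag, and the `reencode` callback stand in for real PDF structures and a real JPEG encoder, and the sketch only illustrates that images are touched while the file is unpacked and that a replacement is kept only when it is smaller.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Toy stand-in for a PDF: image payloads plus a packed/unpacked flag."""
    images: list[bytes] = field(default_factory=list)
    packed: bool = True  # True when objects live inside object streams


def flatten(doc: Document) -> None:
    """Flatten pass: unpack object streams so images are directly reachable."""
    doc.packed = False


def recompress_images(doc: Document, reencode: Callable[[bytes], bytes]) -> None:
    """Re-encode every image, keeping the result only when it is smaller."""
    if doc.packed:
        raise RuntimeError("flatten the document before touching images")
    result = []
    for image in doc.images:
        candidate = reencode(image)
        result.append(candidate if len(candidate) < len(image) else image)
    doc.images = result


def optimize(doc: Document) -> None:
    """Optimization pass: repack objects into compressed object streams."""
    doc.packed = True


def compress(doc: Document, reencode: Callable[[bytes], bytes]) -> Document:
    """Run the three steps in the order listed above."""
    flatten(doc)
    recompress_images(doc, reencode)
    optimize(doc)
    return doc
```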
