Compress PDF Custom Profiles

PDF Compression Levels

When you use the compression API, select the level of compression you want to apply to your PDF file in the compression_level setting.

We provide three standard values for compressing your PDF document:

  1. high: Make the PDF file as small as possible. This may reduce the quality of the output. If you choose high, you may see a difference in the way the PDF document renders when you print it or open it in a viewing tool.
  2. medium: A balance between high and low compression.
  3. low: Preserve the quality of the PDF file at the expense of file size optimization.

Compression Effects

The compression_level parameter applies different effects to different resources within a PDF. The following is a list of how high, medium, and low affect resources such as images, transparencies, and more:

  • Images
  • Fonts
  • Objects
  • User Data
  • Cleanup
  • Color Conversion
  • Transparency

Images

The final quality of images in the PDF file after compression.

  • high: Minimum size. Compress images aggressively to reduce the size of the PDF file.
  • medium: A balance between high and low.
  • low: Maximum size. Protect the final appearance of the images at the cost of a larger file size.

Customize Your Profile

Your JSON profile file should include a list of settings that define exactly what kinds of changes you want to apply to your PDF document. You can make your custom profile file as long or as short as you need, depending on the types of changes you plan to apply to optimize your PDF document.

We offer a variety of options to use when optimizing a PDF document. The options you select will depend on how you want to change your output PDF documents, and on your goals.

Suppose you have a PDF document that is 18 MB and you want to make it smaller so that the file will be easier to distribute online. If you expect that your readers will be opening the file in a browser window, and it does not matter if the photographs and diagrams in the document appear with a lower resolution, you could make the document smaller by compressing the images included in the file.

On the other hand, if you are working with a large PDF document that your customers are likely to want to print, but you want to make it smaller so that it downloads and prints more quickly, you probably want to leave the graphics alone. They will need to appear as sharp as possible. But you do not need interactive content, like form fields, bookmarks, comments, or digital signatures. You can use this Custom JSON profile feature to remove items from the PDF document that will not appear on paper.

Or maybe you are building a PDF document that you intend for people to read on smartphones and other mobile devices. In this case you want to compress the document so that it opens as quickly as possible, so you would again reduce the size of the images. Because phone screens are much smaller than a laptop or desktop monitor, you can reduce the image resolution further than you would for a PDF document intended for viewing in a browser window.

All of the settings for compressing a document are optional and are turned off by default, so a setting is applied only if it is included in the JSON profile file. A flag with an on/off value must be set to on to take effect. Settings that are turned off do not need to be defined in the JSON profile file. If you wanted to, you could create a custom JSON profile with only a single setting, such as one that compresses images. Your JSON file might hold only five or six lines of text.

Only use lowercase characters for the keys and values you add to the JSON profile file.
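
As a minimal sketch of such a short profile, the following Python builds a profile that contains only color image settings and checks that every key and string value is lowercase. The setting names and values are copied from the example profile in this document; the helper is_all_lowercase and the file name profile.json are illustrative assumptions, not part of the API.

```python
import json

# A short custom profile: only color images are downsampled and recompressed.
# Setting names and values are copied from the example profile in this document.
profile = {
    "images": {
        "color": {
            "downsample": {"trigger-dpi": 225, "target-dpi": 150},
            "recompress": {"type": "zip-jpeg", "quality": "medium"},
        }
    }
}


def is_all_lowercase(node):
    """Return True if every key and string value in the profile is lowercase.

    Hypothetical helper for illustration; it is not provided by the API.
    """
    if isinstance(node, dict):
        return all(
            key == key.lower() and is_all_lowercase(value)
            for key, value in node.items()
        )
    if isinstance(node, str):
        return node == node.lower()
    return True  # numbers carry no letter case


assert is_all_lowercase(profile)

# Serialize the profile; this text could be saved to a file such as profile.json.
profile_text = json.dumps(profile, indent=4)
print(profile_text)
```

Every setting left out of this profile stays turned off, so only the color images in the document would be affected.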

Example JSON Profile

Custom JSON Profile
{
    "images": {
        "color": {
            "downsample": {
                "trigger-dpi": 225,
                "target-dpi": 150
            },
            "recompress": {
                "type": "zip-jpeg",
                "quality": "medium"
            }
        },
        "grayscale": {
            "downsample": {
                "trigger-dpi": 225,
                "target-dpi": 150
            },
            "recompress": {
                "type": "zip-jpeg",
                "quality": "medium"
            }
        },
        "monochrome": {
            "downsample": {
                "trigger-dpi": 450,
                "target-dpi": 300
            },
            "recompress": {
                "type": "jbig2",
                "quality": "lossy"
            }
        },
        "optimize-images-only-if-reduction-in-size": "on",
        "consolidate-duplicate-objects": "on",
        "down-convert-16-to-8-bpc-images": "on"
    },
    "fonts": {
        "subset-embedded-fonts": "on",
        "consolidate-duplicate-fonts": "on",
        "unembed-standard-14-fonts": "on",
        "resubset-subset-fonts": "on",
        "remove-unused-fonts": "on"
    },
    "objects": {
        "discard-javascript-actions": "off",
        "discard-alternate-images": "on",
        "discard-thumbnails": "on",
        "discard-document-tags": "on",
        "discard-bookmarks": "off",
        "discard-output-intent": "on"
    },
    "userdata": {
        "discard-comments-forms-multimedia": "off",
        "discard-xmp-metadata-padding": "on",
        "discard-document-information-and-metadata": "on",
        "discard-file-attachments": "off",
        "discard-private-data": "on",
        "discard-hidden-layer-content": "off"
    },
    "cleanup": {
        "compression": "compress-entire-file",
        "flate-encode-uncompressed-streams": "on",
        "convert-lzw-to-flate": "on",
        "optimize-page-content": "on",
        "optimize-for-fast-web-view": "off"
    },
    "general": {
        "write-output-even-if-increase-in-size": "off",
        "preserve-version": "off"
    },
    "color-conversion": {
        "enabled": "off",
        "color-convert-action": "convert",
        "convert-intent": "profile-intent",
        "convert-profile": "srgb"
    },
    "pdfa-conversion": {
        "enabled": "off",
        "type": "1b",
        "pdfa-target-color-space": "rgb",
        "rasterize-if-errors-encountered": "off"
    }
}
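
When a profile grows to this size, it can help to list which flags are actually switched on. The sketch below walks a profile loaded as a Python dictionary and reports every setting whose value is "on". The function name enabled_flags and the dotted path format are assumptions made for illustration; the sample uses a few settings from the profile above.

```python
def enabled_flags(node, path=()):
    """Yield the dotted path of every setting whose value is "on".

    Hypothetical helper for illustration; it is not provided by the API.
    """
    for key, value in node.items():
        if isinstance(value, dict):
            # Descend into nested sections such as "fonts" or "objects".
            yield from enabled_flags(value, path + (key,))
        elif value == "on":
            yield ".".join(path + (key,))


# A small excerpt of the example profile.
sample = {
    "fonts": {"subset-embedded-fonts": "on", "remove-unused-fonts": "on"},
    "objects": {"discard-bookmarks": "off", "discard-thumbnails": "on"},
}

for name in enabled_flags(sample):
    print(name)
```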

Optimization Parameters

Parameter Categories

The methods you can use to optimize a PDF document are sorted into the following categories:

Image Compression Details

Downsampling Images

If you have images in a PDF document that you want to make smaller, and you know that these images don’t need to have a high resolution in the output file, you can reduce the resolution of these images. You can also compress these images within the file. Both steps will reduce the final size of the PDF document.

The process of reducing the resolution of images is called downsampling. You can choose to downsample color, grayscale, or monochrome (black and white) images in a PDF document. The downsampling settings for these three kinds of images must be added separately to the JSON profile file, and each type of image can have its own settings and resolution values. So you could, for example, enable downsampling only for the color images in a PDF document, or only for the grayscale and monochrome images.
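
The per-type settings can be sketched as follows. This Python example builds a profile that downsamples only grayscale and monochrome images and leaves color images out entirely. The key names and DPI values come from the example profile in this document; the helper downsample_settings is a hypothetical convenience function.

```python
import json


def downsample_settings(trigger_dpi, target_dpi):
    """Build the downsample entry for one image type (illustrative helper)."""
    return {"downsample": {"trigger-dpi": trigger_dpi, "target-dpi": target_dpi}}


# Each image type gets its own entry with its own resolution values.
# Color images are omitted, so they are not downsampled.
profile = {
    "images": {
        "grayscale": downsample_settings(225, 150),
        "monochrome": downsample_settings(450, 300),
    }
}

print(json.dumps(profile, indent=4))
```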

Downsampling and Recompression

Downsampling reduces the size of the image directly by reducing the resolution. In recompression, compressed images in a document are decompressed and then compressed again. You can enter a recompression setting to change the compression algorithm used for recompression, such as ZIP, JPEG or Flate, and another setting to change the final image quality. The image quality is part of the compression method used.

If you add settings in the JSON profile file to downsample images, the Datalogics PDF REST APIs will also recompress the images involved whether you provide recompression settings or not.

If you do not add recompression settings to the JSON profile, the API downsamples and recompresses each image in the PDF document using the default compression algorithm and quality value defined in the image itself. For example, if you provide downsample settings but not recompression settings in your JSON profile, and apply that profile to a document that only holds images using JPEG compression, the API will use the JPEG compression method. It will also use the highest quality recompression setting available (“maximum”) to keep from reducing the quality of the images as they are recompressed.

On the other hand, if you decide to leave out downsample settings from your JSON profile file, but add recompression settings, the API will recompress the images using the recompression algorithm you provide while keeping the image resolution (DPI) the same. Note that if you add recompression settings you must include both values in the JSON file: the compression algorithm and the recompression quality level.
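
The rules above can be summarized in a short sketch. The function below is not the API's implementation; it only restates the documented behavior for one image type. The function name plan_image_processing, the shape of its return value, and the "jpeg" label for an image's existing compression are assumptions made for illustration.

```python
REQUIRED_RECOMPRESS_KEYS = {"type", "quality"}


def plan_image_processing(settings, original_type):
    """Describe what happens to one image type under the rules above.

    settings: the profile entry for one image type, such as the "color" entry.
    original_type: the compression the image already uses (illustrative label).
    """
    downsample = "downsample" in settings
    recompress = settings.get("recompress")

    if recompress is not None:
        # Recompression settings must include both the algorithm and the quality.
        missing = REQUIRED_RECOMPRESS_KEYS - set(recompress)
        if missing:
            raise ValueError(f"recompress is missing: {sorted(missing)}")
        return {
            "downsample": downsample,
            "type": recompress["type"],
            "quality": recompress["quality"],
        }

    if downsample:
        # No recompression settings: keep the image's own algorithm
        # and use the highest quality level.
        return {"downsample": True, "type": original_type, "quality": "maximum"}

    # Neither setting is present, so images of this type are left alone.
    return None


print(plan_image_processing(
    {"downsample": {"trigger-dpi": 225, "target-dpi": 150}}, "jpeg"))
print(plan_image_processing(
    {"recompress": {"type": "zip-jpeg", "quality": "medium"}}, "jpeg"))
```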

Image Resolution

When we refer to the resolution of an image, we generally refer to the number of pixels in that image. This can be expressed in terms of megapixels, or in Dots per Inch (DPI). With an image in a PDF document, the resolution of the image is expressed as a certain number of pixels wide and pixels high. The downsampling process involves changing the width and height of an image in pixels, in order to reach a given target resolution. The API calculates the resolution for every image in the document. Keep in mind that the resolution values used with downsampling are distinct from the image quality settings used for image recompression.
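
To make the relationship between pixel dimensions and DPI concrete, here is a small worked sketch. It assumes the common definition in which resolution is the pixel count divided by the displayed size in inches; how the API measures resolution internally is not described here, and both function names are illustrative.

```python
def effective_dpi(pixels, inches):
    """Resolution of an image along one axis: pixels per displayed inch."""
    return pixels / inches


def downsampled_size(width_px, height_px, current_dpi, target_dpi):
    """Pixel dimensions needed to bring an image down to target_dpi."""
    scale = target_dpi / current_dpi
    return round(width_px * scale), round(height_px * scale)


# A 3000 x 2400 pixel image displayed at 5 x 4 inches.
dpi = effective_dpi(3000, 5)
print(dpi)  # 600.0

# Downsampling it to 150 DPI shrinks both dimensions by a factor of four.
print(downsampled_size(3000, 2400, dpi, 150))  # (750, 600)
```

The displayed size on the page does not change; only the number of pixels stored for the image does.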

You can specify a target resolution to use for downsampling images in a document (target-dpi) and a trigger resolution (trigger-dpi). If you decide to downsample an image type, both the target and the trigger resolution settings must be included in your profile file. The target resolution defines the goal: the maximum resolution for every image in the file. So if you set the target resolution in your JSON profile to 600 DPI, the API will downsample every image in the PDF document to 600 DPI unless that image is already at 600 DPI or less.

The trigger resolution defines the threshold the API uses to decide which images to process. Any image with a resolution greater than the trigger resolution will be downsampled. If an image has a resolution less than the trigger resolution, the API ignores it.

So if you set the trigger resolution to 800 DPI and the target resolution to 400 DPI, you are asking the API to downsample images to 400 DPI, but only those whose resolution is greater than 800 DPI to begin with. In other words, you are telling the API to look for only the really large images (the ones above 800 DPI) and downsample just those images to a set value, in this example 400 DPI.

If the trigger resolution is 500 DPI and the target resolution is 400 DPI, the API will not downsample an image with a resolution of 480 DPI, because it is below the trigger. With the same settings, an image with a resolution of 680 DPI is above the trigger and will be downsampled to 400 DPI.
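
The trigger and target examples above can be expressed as a short function. This is a sketch of the documented rule, not the API's code. The function name is hypothetical, and leaving an image exactly at the trigger resolution untouched is an assumption, since the text only describes images above or below the trigger.

```python
def resolution_after_downsampling(image_dpi, trigger_dpi, target_dpi):
    """Return the resolution an image would have after the downsampling step."""
    # Only images above the trigger are downsampled (boundary case is assumed).
    if image_dpi > trigger_dpi:
        return target_dpi
    return image_dpi


# Trigger 500 DPI, target 400 DPI.
print(resolution_after_downsampling(480, 500, 400))  # 480, left alone
print(resolution_after_downsampling(680, 500, 400))  # 400, downsampled

# Trigger 800 DPI, target 400 DPI: only the very large images change.
print(resolution_after_downsampling(600, 800, 400))  # 600, left alone
print(resolution_after_downsampling(900, 800, 400))  # 400, downsampled
```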