Deploy the Container

Everything you need to know to deploy the pdfRest API Toolkit Container with your Docker framework.

Image on Docker Hub

The Container image is obtained from the public pdfRest Docker Hub repository. Submit a request form for a Free Trial License Key.

You can use docker pull to retrieve the image, or docker run to pull the image and start a new container in one step. The exact commands are listed on the repository page. Below are instructions for deploying the image with Docker Compose or Kubernetes.

Using the API Toolkit Container image

This guide covers two containerized deployment methods: Docker Compose and Kubernetes. It references some Environment Variables, which are covered in depth in the Configure Container Guide.

Docker Compose

Docker Compose is a tool that allows you to define and manage Docker deployments using a simple YAML file. Run the defined service(s) by executing docker compose up in the directory containing the docker-compose.yml file.

A simple Docker Compose file to define the pdfRest API Toolkit Container can be found below:

services:
  pdfrest:
    platform: linux/amd64
    image: pdfrest/pdf-api-toolkit:latest
    restart: always
    environment:
      - PDFREST_SERVER_DOMAIN=https://pdfrest.YOUR-COMPANY-HERE.com
      - LICENSE_KEY=YOUR-LICENSE-KEY-VALUE-HERE
    ports:
      - "80:3000"
    volumes:
      - /tmp:/opt/datalogics/public

This instructs Docker to pull the latest image from the pdfRest Docker Hub repository and run it for the linux/amd64 platform.

The settings used in this file are described below. Read the Configuration guide for more information:

  • The optional PDFREST_SERVER_DOMAIN Environment Variable formats the input and output URLs of documents uploaded to and processed by the API.
  • The LICENSE_KEY variable is required to properly license the server. Without setting this variable before deploying, pdfRest will run in a reduced capabilities mode. Submit a request for a Free Trial License Key.
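  • The ports section of the YAML instructs the stack to listen for API calls on port 80 of the host and forward those calls to port 3000 inside the container.
  • The volumes section of the YAML mounts the host's /tmp directory at /opt/datalogics/public inside the container.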
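If you would rather not store the license key in the Compose file itself, Docker Compose can substitute it from your shell environment or from a .env file in the same directory. The fragment below is a minimal sketch of that approach, not a required configuration; it assumes you have defined a LICENSE_KEY variable yourself, and the error message text is only illustrative.

```yaml
services:
  pdfrest:
    image: pdfrest/pdf-api-toolkit:latest
    environment:
      # Compose replaces ${LICENSE_KEY} with the value from the shell or a .env file.
      # The :? form stops "docker compose up" with this message if the value is unset or empty.
      - LICENSE_KEY=${LICENSE_KEY:?set LICENSE_KEY in your shell or .env file}
    ports:
      - "80:3000"
```

With this in place, a .env file containing a single LICENSE_KEY=... line next to docker-compose.yml is enough for docker compose up to pick up the key.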

If you require shared storage between multiple pdfRest containers, set up a shared volume as described in the Docker storage volume documentation and configure the volumes section of the YAML to mount that volume as shown below:

volumes:
  - <your_volume>:/opt/datalogics/public
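Putting the pieces together, a complete Compose file using a named volume might look like the sketch below. The volume name pdfrest-data is a hypothetical placeholder, and the file assumes Docker's default local volume driver, which only shares data between containers on the same host. For containers on different hosts, the volume would need a network-backed driver as described in the Docker storage volume documentation.

```yaml
services:
  pdfrest:
    platform: linux/amd64
    image: pdfrest/pdf-api-toolkit:latest
    restart: always
    environment:
      - LICENSE_KEY=YOUR-LICENSE-KEY-VALUE-HERE
    ports:
      - "80:3000"
    volumes:
      # Mount the named volume at the container's document directory
      - pdfrest-data:/opt/datalogics/public

# Named volumes must be declared at the top level of the file
volumes:
  pdfrest-data:   # hypothetical volume name
```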

Kubernetes

With the image available to your cluster (pulled from Docker Hub or previously loaded with docker load), you can now create and expose a deployment. The only item to note is that the pdfRest server listens on port 3000.

kubectl create deployment pdfrest --image=pdfrest/pdf-api-toolkit:latest
kubectl expose deployment pdfrest --type=NodePort --port=3000
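The two commands above do not set any environment variables, so a declarative manifest can be a convenient alternative when you need to pass LICENSE_KEY to the container. The following is a sketch of a roughly equivalent Deployment and NodePort Service, not an official pdfRest manifest. The app: pdfrest label and the container name are assumptions chosen to mirror what kubectl create deployment typically generates.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pdfrest
  labels:
    app: pdfrest
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pdfrest
  template:
    metadata:
      labels:
        app: pdfrest
    spec:
      containers:
        - name: pdf-api-toolkit          # assumed container name
          image: pdfrest/pdf-api-toolkit:latest
          ports:
            - containerPort: 3000        # pdfRest listens on port 3000
          env:
            - name: LICENSE_KEY
              value: "YOUR-LICENSE-KEY-VALUE-HERE"
---
apiVersion: v1
kind: Service
metadata:
  name: pdfrest
spec:
  type: NodePort
  selector:
    app: pdfrest
  ports:
    - port: 3000
      targetPort: 3000
```

Saved under a file name of your choosing, this can be applied with kubectl apply -f in place of the create and expose commands.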

When running multiple instances you will want to set up a shared volume so that all instances of pdfRest can read and write to the same location. This ensures that all input and output documents are available no matter which instance a processing or retrieval API call load-balances to. This requires the creation of a PersistentVolume.

Here, we will demonstrate with a hostPath PersistentVolume for testing and development purposes. It is not recommended to use a hostPath in a production cluster. A cluster administrator should provision a networked resource such as an NFS share, a Google Cloud persistent disk, Azure File Share, or an AWS EFS volume.

First, create the PersistentVolume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pdfrest-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/data"

Save it as pdfrest-pv.yaml and apply it to your cluster with:

kubectl apply -f pdfrest-pv.yaml

This configuration specifies that the volume will be found at /mnt/data/ on the cluster Node. It also defines the StorageClass name for the PersistentVolume as manual, which will be used to bind PersistentVolumeClaim requests to this PersistentVolume.
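For comparison, a PersistentVolume backed by a networked resource differs mainly in its volume source. The sketch below shows an NFS-backed variant that keeps the same name, StorageClass, capacity, and access mode. The server address and export path are hypothetical placeholders that your cluster administrator would replace.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pdfrest-pv
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs.example.internal   # hypothetical NFS server address
    path: /exports/pdfrest         # hypothetical exported directory
```

Because the StorageClass name and access mode are unchanged, a PersistentVolumeClaim that requests storageClassName manual with ReadWriteMany can bind to this volume in the same way as to the hostPath one.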

Then, create the PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pdfrest-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

Save it as pdfrest-pvc.yaml and apply it to your cluster with:

kubectl apply -f pdfrest-pvc.yaml

The PersistentVolumeClaim should be automatically bound to the PersistentVolume you created.

Then configure the pdfRest deployment to use the PersistentVolumeClaim you made as a volume.

kubectl edit deployment pdfrest

In the editor that opens, edit the deployment manifest under spec > template > spec:

spec:
  # ...
  template:
    # ...
    spec:
      # Add the PersistentVolumeClaim under volumes
      volumes:
        - name: pdfrest-storage
          persistentVolumeClaim:
            claimName: pdfrest-pvc
      containers:
        # Keep the existing container entry (its name and image stay as they are)
        - name: pdf-api-toolkit
          # ...
          # Mount to the /opt/datalogics/public directory
          volumeMounts:
            - name: pdfrest-storage
              mountPath: /opt/datalogics/public