I've finished setting up the shares, so it's time now to install paperless. The repository offers an installation via docker compose and one via Ansible. I need a combination of the two, with the NAS on top, so I've decided to follow this approach to convert the docker compose file to a list of Ansible tasks.
Paperless has a number of dependencies:
- Redis
- Gotenberg - to convert office documents to PDF
- Apache Tika - Content extraction from office documents (the OCR itself runs inside paperless)
- PostgreSQL - Metadata storage
Dependencies
Directories
I've learned the hard way that Ansible with Docker needs the volume directories already prepared. For this, I've created a specific play:
- name: Prepare directories
  hosts: app
  tasks:
    - name: Directories
      file:
        state: directory
        path: '{{ item }}'
        mode: '0755'
        owner: document
        group: documents
      with_items:
        - /mnt/documents-consume/consume
        - /mnt/documents-consume/export
        - /opt/documents/paperless/media
        - /opt/documents/paperless/data
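All of these are used as Docker volumes for paperless, so the paths have to match the host side of the volumes entries in the tasks below. Note that the paperless task further down mounts data and media from /mnt/documents/paperless rather than /opt/documents/paperless, so one of the two has to be adjusted.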
Group install tasks
I've grouped all subsequent tasks under a single play:
- name: Set up network
  hosts: app
  tasks:
Network
I can't use depends_on in Ansible, so I have to emulate the dependencies via a custom network, on which the containers can reach each other by name:
    - name: Create network
      docker_network:
        name: network_paperless
        ipam_config:
          - subnet: '172.16.99.0/24'
In the last line, subnet, you can use any private network range you like.
Redis
Installing Redis in Docker is simple, and it's just as simple with Ansible:
    - name: Setup REDIS broker
      docker_container:
        name: broker
        recreate: true
        restart_policy: unless-stopped
        image: 'redis:6.0'
        networks:
          - name: 'network_paperless'
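Since there is no depends_on, it can help to wait until the broker actually answers before the later containers start. The task below is only a sketch: it assumes the community.docker collection is installed (for docker_container_exec) and relies on redis-cli being present in the redis image. The register name redis_ping and the retry numbers are arbitrary choices.

```yaml
    # Assumption: the community.docker collection is available on the control node.
    - name: Wait for the Redis broker to answer
      community.docker.docker_container_exec:
        container: broker
        command: redis-cli ping
      register: redis_ping
      # Retry until redis-cli prints PONG, at most 10 attempts, 3 seconds apart.
      until: "'PONG' in (redis_ping.stdout | default(''))"
      retries: 10
      delay: 3
      # This is a read-only check, so never report it as a change.
      changed_when: false
```

Placed right after the broker task, this makes the play stop early if Redis never comes up, instead of failing later inside paperless.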
PostgreSQL
The next task in line is PostgreSQL:
    - name: Install database
      docker_container:
        name: 'db'
        recreate: true
        restart_policy: unless-stopped
        image: 'postgres:13'
        user: '<doc user id>:<doc user group>'
        volumes:
          - '/mnt/documents/db:/var/lib/postgresql/data'
        env:
          POSTGRES_DB: 'paperless'
          POSTGRES_USER: 'paperless'
          POSTGRES_PASSWORD: '{{ paperless_pg_password }}'
        networks:
          - name: 'network_paperless'
Here, I've had to specify the UID:GID of the document user created previously. Otherwise, the NAS mount (where the DB is located) would barf.
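The numeric IDs do not have to be typed in by hand. As a sketch, the tasks below look them up on the target host with the getent module. They assume the user and group are called document and documents, as in the directories task above; doc_user_ids is a made-up fact name.

```yaml
    - name: Look up the document user
      ansible.builtin.getent:
        database: passwd
        key: document

    - name: Look up the documents group
      ansible.builtin.getent:
        database: group
        key: documents

    - name: Build the UID:GID string for the containers
      ansible.builtin.set_fact:
        # getent_passwd['document'] holds [password, uid, gid, gecos, home, shell]
        # getent_group['documents'] holds [password, gid, members]
        # doc_user_ids is a hypothetical fact name used only in this sketch.
        doc_user_ids: "{{ getent_passwd['document'][1] }}:{{ getent_group['documents'][1] }}"
```

With these tasks placed before the database task, user: '{{ doc_user_ids }}' could stand in for the placeholder.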
Document processing
The document processing is done via Gotenberg and Apache Tika:
    - name: Install Gotenberg
      docker_container:
        name: 'gotenberg'
        recreate: true
        restart_policy: unless-stopped
        image: 'thecodingmachine/gotenberg'
        networks:
          - name: 'network_paperless'
        env:
          DISABLE_GOOGLE_CHROME: '1'
    - name: Install Apache Tika
      docker_container:
        name: 'tika'
        recreate: true
        restart_policy: unless-stopped
        image: 'apache/tika'
        networks:
          - name: 'network_paperless'
PaperlessNG
The final piece is the paperless program.
    - name: Install paperless-ng
      docker_container:
        name: "paperless"
        recreate: true
        restart_policy: unless-stopped
        image: "jonaswinkler/paperless-ng:latest"
        #user: "1005:8675310"
        networks:
          - name: "network_paperless"
        ports:
          - "38000:8000"
        env:
          PAPERLESS_REDIS: "redis://broker:6379"
          PAPERLESS_DBHOST: "db"
          PAPERLESS_DBUSER: "paperless"
          PAPERLESS_DBPASS: "{{ paperless_pg_password }}"
          PAPERLESS_TIKA_ENABLED: "1"
          PAPERLESS_TIKA_GOTENBERG_ENDPOINT: "http://gotenberg:3000"
          PAPERLESS_TIKA_ENDPOINT: "http://tika:9998"
          USERMAP_UID: "<doc user id>"
          USERMAP_GID: "<doc user group>"
          PAPERLESS_ADMIN_USER: "admin"
          PAPERLESS_ADMIN_PASSWORD: "changeme"
        healthcheck:
          # The check runs inside the container, so it uses the internal port 8000,
          # not the published port 38000.
          test: ["CMD", "curl", "-f", "http://localhost:8000"]
          interval: 30s
          timeout: 10s
          retries: 5
        volumes:
          - "/mnt/documents/paperless/data:/usr/src/paperless/data"
          - "/mnt/documents/paperless/media:/usr/src/paperless/media"
          - "/mnt/documents-consume/consume:/usr/src/paperless/consume"
          - "/mnt/documents-consume/export:/usr/src/paperless/export"
Here, I had to do the same UID:GID specification, so the NAS mount would work nicely.
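To confirm from the playbook that paperless actually came up, a final task can poll the published port on the host. This is a sketch only: it assumes the web UI eventually answers with HTTP 200 on port 38000 (after any redirect to the login page), and the register name paperless_ui and the retry numbers are arbitrary.

```yaml
    - name: Wait for the paperless web UI
      ansible.builtin.uri:
        # Published port from the ports mapping above, reached from the target host.
        url: "http://localhost:38000"
        status_code: 200
      register: paperless_ui
      # Keep retrying while the container is still starting (up to 20 x 15 seconds).
      until: (paperless_ui.status | default(0)) == 200
      retries: 20
      delay: 15
```

The first start can take a while because of database migrations, so the retry window is deliberately generous.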
Notes
Please note that I keep paperless_pg_password as a secret, instead of using the default password, paperless.
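One way to keep the password out of the playbook is a separate variables file encrypted with Ansible Vault. The sketch below is an assumption about the layout, not necessarily how my setup stores it: secrets.yml is a hypothetical file name and the password value is a placeholder.

```yaml
# secrets.yml (hypothetical file name)
# Encrypt it once with: ansible-vault encrypt secrets.yml
paperless_pg_password: 'replace-with-a-long-random-value'
---
# Playbook: load the encrypted file before any task uses the variable.
- name: Set up network
  hosts: app
  vars_files:
    - secrets.yml
  tasks:
    - name: Fail early if the secret is missing
      ansible.builtin.assert:
        that:
          - paperless_pg_password is defined
          - paperless_pg_password | length > 0
```

Running the playbook with --ask-vault-pass then prompts for the vault password and decrypts the file on the fly.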
HTH,