Use paperless-ngx to digitize your paper assets
Paperless-ngx is an open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper. The best way to set up paperless-ngx is with Docker. We will go through the steps and best-practice tips to install, upgrade, and back up paperless-ngx in a self-hosted environment.
Installation
Make sure to install the Docker and Docker Compose runtimes first. Official docs: https://docs.docker.com/compose/install. On Linux, switch to a non-root user and run:
bash -c "$(curl --location --silent --show-error https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
Answer the questions at each step. When prompted for the database option:
Database backend (postgres sqlite mariadb) [postgres]:
sqlite is a file-based embedded database, good for a lightweight installation in a self-hosted environment. Use postgres for a production-ready installation. Here we use sqlite.
Choose locations for these three persistent data folders: Consume folder, Media folder, Data folder
- Consume folder: the location where incoming files are picked up and processed
- Media folder: the location where all files are stored
- Data folder: the location where the database file is stored
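It can help to create the three folders yourself, as the non-root user, before the installer asks for them; Docker usually creates missing bind-mount paths as root. The following is a minimal sketch: the function name create_paperless_dirs and the base directory in the usage comment are illustrative assumptions, not something the installer requires.

```shell
#!/bin/sh
# Create the consume, media and data folders under one base directory.
# The function body runs in a subshell "( ... )" so its variables stay local.
create_paperless_dirs() (
  base="$1"
  for name in consume media data; do
    mkdir -p "$base/$name" || exit 1
  done
)

# Example (hypothetical base path):
#   create_paperless_dirs "$HOME/paperless"
```

You can then enter the resulting paths when the installer prompts for each folder.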
Continue with the user name and password. Once the script finishes, it will download and start the Docker containers (docker compose up) automatically.
Tips: using a version number instead of latest makes future upgrades much easier. Stop the containers now with docker compose down, then go to the directory where the docker-compose.yml file is located and open it. Locate the webserver image entry and update the tag from latest to the latest version number, which you can find at https://github.com/paperless-ngx/paperless-ngx/releases. While the file is open, inspect it and make any other updates and customizations, then start the containers again with docker compose up -d. For example:
version: "3.4"
services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - ./data/paperless_redisdata:/data
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.3.2
    restart: unless-stopped
    depends_on:
      - broker
    ports:
      - "8019:8000"
    healthcheck:
      test: ["CMD", "curl", "-fs", "-S", "--max-time", "2", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5
    volumes:
      - ./data/paperless_data:/usr/src/paperless/data
      - /mnt/tb4t/docs/paperless/media:/usr/src/paperless/media
      - /mnt/tb4t/docs/paperless/export:/usr/src/paperless/export
      - /mnt/tb4t/docs/paperless/consume:/usr/src/paperless/consume
    env_file: docker-compose.env
    environment:
      PAPERLESS_REDIS: redis://broker:6379
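If you prefer not to edit the tag by hand, the change can be scripted. This is a rough sketch that assumes the image line looks like the one in the example above (image: ghcr.io/paperless-ngx/paperless-ngx:<tag>); the function name pin_paperless_tag is hypothetical, and the version you pass should come from the releases page.

```shell
#!/bin/sh
# Replace the tag on the paperless-ngx image line of a compose file.
# Usage: pin_paperless_tag <compose-file> <version>
pin_paperless_tag() (
  file="$1"
  version="$2"
  tmp="$(mktemp)" || exit 1
  # Keep everything up to the colon after the image name, swap only the tag.
  sed "s|\(image: ghcr\.io/paperless-ngx/paperless-ngx:\).*|\1$version|" \
    "$file" > "$tmp" || { rm -f "$tmp"; exit 1; }
  # Write back through the existing file so its permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
)

# Example:
#   pin_paperless_tag docker-compose.yml 2.3.2
```

Other image lines, such as the redis broker, are left untouched because the pattern matches only the paperless-ngx image name.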
Upgrade
Update the image tag in the docker-compose.yml file to the new version number, then run:
```
docker compose down
docker compose pull
docker compose up -d
```
Tip on versions: always pin the version in the docker-compose.yml file; do not use latest.
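The three upgrade commands can be wrapped in a small script so they always run from the right directory and stop at the first failure. This is only a sketch: it assumes the image tag in docker-compose.yml has already been changed to the target version, and the names run, upgrade_paperless and DRY_RUN are made up for illustration.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run the upgrade sequence inside the directory holding docker-compose.yml.
# Usage: upgrade_paperless <compose-dir>
upgrade_paperless() (
  compose_dir="$1"
  cd "$compose_dir" || exit 1
  # "&&" stops the sequence as soon as one step fails.
  run docker compose down &&
  run docker compose pull &&
  run docker compose up -d
)

# Example (preview only, nothing is executed):
#   DRY_RUN=1; export DRY_RUN
#   upgrade_paperless /path/to/compose/dir
```

Running it once with DRY_RUN=1 is a cheap way to confirm the directory is correct before the containers are actually stopped.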
Backup
Only the media and data folders need to be backed up. The media folder stores all the scanned paper archives (e.g. PDFs, images, etc.) and the data folder stores the database file. If you mount the two directories as above, making a copy of those folders is enough. In the example above, back up these two directories:
- /mnt/tb4t/docs/paperless/media
- ./data/paperless_data
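Copying the two directories can be sketched as a single timestamped archive. The function name backup_paperless and the archive naming scheme are assumptions for illustration; it is probably safest to stop the containers first (docker compose down) so the sqlite file is not copied in the middle of a write.

```shell
#!/bin/sh
# Archive the media and data folders into one timestamped tarball.
# Usage: backup_paperless <media-dir> <data-dir> <backup-dir>
backup_paperless() (
  backup_dir="$3"
  mkdir -p "$backup_dir" || exit 1
  # Resolve absolute paths so each "tar -C" below is unambiguous.
  media_dir="$(cd "$1" && pwd)" || exit 1
  data_dir="$(cd "$2" && pwd)" || exit 1
  backup_dir="$(cd "$backup_dir" && pwd)" || exit 1
  archive="$backup_dir/paperless-$(date +%Y%m%d-%H%M%S).tar.gz"
  # -C changes into each parent directory, so the archive stores
  # short relative paths such as media/... and paperless_data/...
  tar -czf "$archive" \
    -C "$(dirname "$media_dir")" "$(basename "$media_dir")" \
    -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || exit 1
  # Print the archive path so callers can log or copy it elsewhere.
  echo "$archive"
)

# Example, using the paths from the compose file above and a
# hypothetical backup destination:
#   backup_paperless /mnt/tb4t/docs/paperless/media ./data/paperless_data /mnt/backup/paperless
```

Keeping the archive on a different disk than the media folder is what makes it a real backup rather than a second copy on the same device.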