bjoernpoettker 41eed1871e
fix: resolve production crash caused by TypeORM synchronize
NODE_ENV=production disables synchronize (destructive ADD/DROP COLUMN churn
on MariaDB that exceeded the 8126-byte row size limit) and enables
migrationsRun. New data-source.ts is the single source of configuration (runtime + CLI);
migrations workflow (generate/run/revert) added, including dotenv.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 09:27:04 +02:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Paperless Manager is a document automation platform that extends Paperless-NGX. It provides:

  • Scanner inbox management with PDF processing (splitting, rotation, barcode detection)
  • Email import with attachment extraction
  • Rule-based postprocessing (tag, export, send mail)
  • OCR via Ollama (vision model)
  • Label printing agent with SSE-based job queue

UI labels and comments are in German.

Structure

paperless-backend/    # NestJS API (port 3100)
paperless-frontend/   # React 19 + Ant Design + Vite
docker-compose.yml    # Production (backend + frontend/nginx + MySQL)
.env.example          # All required environment variables

Commands

Backend (in paperless-backend/):

npm run start:dev     # Dev server with watch mode
npm run build         # Compile TypeScript
npm run test          # Jest unit tests
npm run test:e2e      # End-to-end tests
npm run lint          # ESLint with auto-fix

Frontend (in paperless-frontend/):

npm run dev           # Vite dev server (proxies /api to backend)
npm run build         # TypeScript check + Vite build
npm run lint          # ESLint

Backend Architecture

Module Overview

Module               Purpose
auth                 JWT (OIDC/Authentik) + API Key guards, permission system
database             TypeORM entities + MySQL/MariaDB config (synchronize in dev only, migrations in production)
inbox                Scanner document management (list, preview, rotate, split)
paperless            Paperless-NGX REST API client
preprocessing        PDF→images, OCR, QR extraction, task creation
postprocessing       Rule engine: filter conditions → actions
email                IMAP intake, PDF conversion, correspondent mapping
email-download       External email retrieval, ZUGFeRD invoice parsing
barcode              QR/barcode detection, template-based split actions
label-print-agent    SVG label rendering, RxJS-based print job queue + SSE
inbox-postprocessor  Applies edits/splits to PDFs, variable substitution
scanner              chokidar file system watcher for scan directory
settings             Global configuration
user-settings        Per-user SMTP and preferences

Authentication & Permissions

Global guards apply to all routes:

  1. JwtOrApiKeyGuard — validates Bearer JWT (OIDC) or X-API-Key header
  2. PermissionsGuard — enforces @RequirePermissions() decorator

Decorate controllers/handlers with:

@RequirePermissions(Permission.VIEW_SCANNER)

Permissions map from OIDC groups (PM_Admin, PM_Belege, etc.) to the Permission enum in src/auth/permissions.enum.ts.

Use @Public() to bypass auth guards entirely.
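
To illustrate the group-to-permission mapping, here is a minimal sketch. Only `Permission.VIEW_SCANNER` and the group names `PM_Admin` and `PM_Belege` come from this file; the other permission members, the table `GROUP_PERMISSIONS`, and both function names are hypothetical, and a const object stands in for the real enum.

```typescript
// Stand-in for the Permission enum. Only VIEW_SCANNER appears in this document.
const Permission = {
  VIEW_SCANNER: "VIEW_SCANNER",
  EDIT_SCANNER: "EDIT_SCANNER", // hypothetical
  MANAGE_SETTINGS: "MANAGE_SETTINGS", // hypothetical
} as const;
type Permission = (typeof Permission)[keyof typeof Permission];

// Hypothetical assignments: which group grants which permissions is an assumption.
const GROUP_PERMISSIONS: Record<string, Permission[]> = {
  PM_Admin: [Permission.VIEW_SCANNER, Permission.EDIT_SCANNER, Permission.MANAGE_SETTINGS],
  PM_Belege: [Permission.VIEW_SCANNER],
};

// Union of the permissions granted by every group the user belongs to.
function permissionsForGroups(groups: string[]): Set<Permission> {
  const granted = new Set<Permission>();
  for (const group of groups) {
    for (const permission of GROUP_PERMISSIONS[group] ?? []) {
      granted.add(permission);
    }
  }
  return granted;
}

// What a permissions check boils down to: every required permission must be granted.
function hasAllPermissions(granted: Set<Permission>, required: Permission[]): boolean {
  return required.every((permission) => granted.has(permission));
}
```

Unknown groups contribute nothing, so a token carrying only unrelated groups ends up with an empty permission set.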

Database

  • TypeORM with MySQL/MariaDB, UTF8MB4
  • Connection config lives in src/database/data-source.ts (single source of truth, shared by the NestJS runtime and the TypeORM CLI). database.module.ts consumes it.
  • Schema strategy depends on NODE_ENV:
    • Dev (NODE_ENV unset): synchronize: true — schema auto-migrates from entities.
    • Production (NODE_ENV=production): synchronize: false + migrationsRun: true — pending migrations in src/database/migrations/ are applied automatically on boot. Never run synchronize against the production DB — on MariaDB it issues destructive ADD/DROP COLUMN churn every boot (see caveat below).
  • 23 entities in src/database/entities/
  • JSON columns use transformers to normalize empty arrays/objects to null
  • Key entities: InboxDocument, Task, Email, Attachment, Postprocessing, BarcodeTemplate, LabelPrintJob, ApiKey, Setting
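
Two of the points above can be sketched in plain TypeScript: the NODE_ENV-dependent schema strategy and a transformer that stores empty JSON values as null. The names `schemaOptionsFor`, `ValueTransformerLike`, and `emptyJsonToNull` are hypothetical. That `migrationsRun` is off in dev, and that normalization happens on write rather than on read, are assumptions.

```typescript
interface SchemaOptions {
  synchronize: boolean;
  migrationsRun: boolean;
}

// Production: migrations only. Anything else: entity-driven synchronize.
function schemaOptionsFor(nodeEnv: string | undefined): SchemaOptions {
  const isProduction = nodeEnv === "production";
  return { synchronize: !isProduction, migrationsRun: isProduction };
}

// Same shape as TypeORM's ValueTransformer: `to` runs on write, `from` on read.
interface ValueTransformerLike {
  to(value: unknown): unknown;
  from(value: unknown): unknown;
}

// True for [] and {}; false for everything else, including null.
function isEmptyJson(value: unknown): boolean {
  if (Array.isArray(value)) return value.length === 0;
  if (value !== null && typeof value === "object") return Object.keys(value).length === 0;
  return false;
}

const emptyJsonToNull: ValueTransformerLike = {
  // Assumption: empty values are collapsed to null when written to the DB.
  to: (value) => (value === undefined || isEmptyJson(value) ? null : value),
  from: (value) => value,
};
```

In an entity, an object like this would be passed as the `transformer` option of a `@Column({ type: 'json' })` declaration.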

Migrations workflow

npm run migration:generate -- src/database/migrations/<Name>   # diff entities → DB
npm run migration:run                                          # apply pending
npm run migration:revert                                       # roll back last

The existing production schema is the implicit baseline (no baseline migration); only future changes get migration files.

Caveat — MariaDB reports json as longtext: every @Column({ type: 'json' }) is stored as longtext ... CHECK (json_valid(...)), so migration:generate always emits spurious no-op CHANGE statements (json↔longtext, nullable/default re-declares). Hand-trim generated migrations down to the real change before committing. (This same false diff is exactly why synchronize must stay off in production — left unchecked it accumulates ALGORITHM=INSTANT drops until a table trips the 8126-byte row-size limit.)
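
As an illustration of what a hand-trimmed migration might look like, here is a sketch with a single intended change. The table `task`, the column `note`, the class name, and the timestamp are invented for this example. `QueryRunnerLike` is a local stand-in for the part of TypeORM's `QueryRunner` used here; a real migration would implement `MigrationInterface` from `typeorm` instead.

```typescript
// Minimal stand-in for TypeORM's QueryRunner (only the method used below).
interface QueryRunnerLike {
  query(sql: string): Promise<unknown>;
}

// Hypothetical migration: table, column, and timestamp are made up.
class AddNoteToTask1750000000000 {
  name = "AddNoteToTask1750000000000";

  async up(queryRunner: QueryRunnerLike): Promise<void> {
    // The one real change. The generated json/longtext CHANGE statements
    // that surrounded it were deleted by hand before committing.
    await queryRunner.query("ALTER TABLE `task` ADD `note` varchar(255) NULL");
  }

  async down(queryRunner: QueryRunnerLike): Promise<void> {
    // Exact inverse of up(), so migration:revert restores the previous schema.
    await queryRunner.query("ALTER TABLE `task` DROP COLUMN `note`");
  }
}
```

Keeping `up` and `down` to the real change also keeps the revert path free of the same spurious statements.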

Document Processing Pipeline

Preprocessing (preprocessing/document-pipeline.service.ts):

  1. PDF → PNG page images (200 DPI via pdf-lib + sharp)
  2. QR code extraction from page 1 (jsqr)
  3. OCR via Ollama (llava vision model)
  4. Auto-generated internal document number (YYYY-000001)
  5. DB task entry creation, archive original (GoBD compliance)
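
The document number format from step 4 can be sketched as follows. Both function names are hypothetical, and the assumption that the counter restarts at 000001 each year is not confirmed by this file.

```typescript
// Formats a number such as 2026-000001: four-digit year, six-digit zero-padded counter.
function formatDocumentNumber(year: number, sequence: number): string {
  if (!Number.isInteger(year) || !Number.isInteger(sequence) || sequence < 1) {
    throw new RangeError("year and sequence must be integers, sequence >= 1");
  }
  return `${year}-${String(sequence).padStart(6, "0")}`;
}

// Derives the next number from the last one issued.
// Assumption: the counter resets when the year changes.
function nextDocumentNumber(lastNumber: string | null, currentYear: number): string {
  if (lastNumber !== null) {
    const match = /^(\d{4})-(\d{6,})$/.exec(lastNumber);
    if (match && Number(match[1]) === currentYear) {
      return formatDocumentNumber(currentYear, Number(match[2]) + 1);
    }
  }
  return formatDocumentNumber(currentYear, 1);
}
```

In a real service the "last number" lookup and the increment would need to happen atomically (for example inside a transaction) so that two documents processed at once cannot receive the same number.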

Postprocessing (postprocessing/postprocessing.service.ts):

  1. Load rules ordered by priority with filter conditions (JSON)
  2. Evaluate filters (field operators: eq, contains, in, etc.)
  3. Execute actions: tag, update field, send mail, export WebDAV
  4. Log results; apply error tag on failure; stop if NoFurther flag set
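
The rule loop above can be sketched in a few lines. This is not the service's implementation: the `Rule` and `Condition` shapes, the action strings, and the function names are hypothetical. The sketch also assumes that a lower priority value runs first and that all conditions of a rule must match.

```typescript
type Operator = "eq" | "contains" | "in";

interface Condition {
  field: string;
  op: Operator;
  value: unknown;
}

interface Rule {
  name: string;
  priority: number;
  conditions: Condition[];
  actions: string[]; // hypothetical action identifiers, e.g. "tag:invoice"
  noFurther?: boolean;
}

type DocumentFields = Record<string, unknown>;

function matches(condition: Condition, doc: DocumentFields): boolean {
  const actual = doc[condition.field];
  if (condition.op === "eq") return actual === condition.value;
  if (condition.op === "contains") {
    return (
      typeof actual === "string" &&
      typeof condition.value === "string" &&
      actual.includes(condition.value)
    );
  }
  // "in": the condition value is a list of accepted values.
  return Array.isArray(condition.value) && condition.value.includes(actual);
}

// Returns the actions that would run, in order.
function runRules(rules: Rule[], doc: DocumentFields): string[] {
  const executed: string[] = [];
  // Assumption: lower priority value runs first.
  const ordered = [...rules].sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    if (!rule.conditions.every((condition) => matches(condition, doc))) continue;
    executed.push(...rule.actions);
    if (rule.noFurther) break; // stop processing later rules
  }
  return executed;
}
```

Logging and the error tag from step 4 are omitted; they would wrap the action execution.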

Inbox Postprocessor (inbox-postprocessor/):

  • Applies virtual edits (page deletions, rotations, splits) stored in DB to original PDF
  • Resolves variable templates from barcode/task data
  • Returns PDF buffer for upload to Paperless-NGX
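
Variable substitution can be sketched as below. The `{{name}}` placeholder syntax and the function name `resolveTemplate` are assumptions; this file does not state the template syntax the project actually uses.

```typescript
// Replaces {{name}} placeholders with values from the given map.
// Assumption: unknown variables are left untouched rather than blanked out.
function resolveTemplate(template: string, variables: Map<string, string>): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (placeholder: string, name: string) => variables.get(name) ?? placeholder,
  );
}
```

Leaving unresolved placeholders visible makes a missing barcode or task value easy to spot in the resulting title or filename.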

Label Print Agent

  • Uses RxJS Subject (newJob$) to stream print jobs via SSE to connected agents
  • Jobs are locked for 5 minutes so that connected agents do not race for the same job
  • SVG templates rendered to PNG via @resvg/resvg-js
  • API documented in docs/BACKEND_API.md
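
The 5-minute job lock can be illustrated with an in-memory sketch. The `PrintJob` fields and the function `tryClaim` are hypothetical; the actual entity and locking query may differ.

```typescript
const LOCK_DURATION_MS = 5 * 60 * 1000; // 5 minutes

// Hypothetical job shape; field names are assumptions.
interface PrintJob {
  id: number;
  lockedBy: string | null;
  lockedUntil: number | null; // epoch milliseconds
}

// Claims the job for an agent unless another agent holds an unexpired lock.
// `now` is passed in so the logic can be tested without a real clock.
function tryClaim(job: PrintJob, agentId: string, now: number): boolean {
  const lockActive = job.lockedUntil !== null && job.lockedUntil > now;
  if (lockActive && job.lockedBy !== agentId) return false;
  job.lockedBy = agentId;
  job.lockedUntil = now + LOCK_DURATION_MS;
  return true;
}
```

Against a database, the check and the update would have to be a single atomic statement (for example an UPDATE with the lock condition in its WHERE clause); the in-memory version only shows the rule.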

Frontend Architecture

  • Auth: OIDC flow via oidc-client-ts, AuthContext provides user identity and tokens
  • API layer: Axios-based client methods in src/api/ (one file per domain)
  • Routing: React Router v7 in src/App.tsx
  • Key components: DocumentEditModal, PdfSplitViewer, WysiwygEditor (TipTap), BarcodePositioner
  • During dev, Vite proxies /api to localhost:3100
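
The dev proxy corresponds to Vite's `server.proxy` option. A sketch of the relevant part of a `vite.config.ts`, written as a plain object so it stands alone: in the real file this object would be the default export, and `changeOrigin` is an assumption about the project's settings.

```typescript
// Relevant slice of a Vite config. In vite.config.ts this would be
// `export default devServerConfig` (optionally wrapped in defineConfig).
const devServerConfig = {
  server: {
    proxy: {
      // Requests to /api/* on the Vite dev server are forwarded to the backend.
      "/api": {
        target: "http://localhost:3100",
        changeOrigin: true, // assumption, not confirmed by this file
      },
    },
  },
};
```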

Environment Variables

Key variables (see .env.example for full list):

Variable                                  Purpose
DB_*                                      MySQL connection
PAPERLESS_URL, PAPERLESS_TOKEN            Paperless-NGX API
OIDC_ISSUER, OIDC_CLIENT_ID               OIDC provider (Authentik)
OLLAMA_URL, OLLAMA_MODEL                  OCR service
SCANNER_WATCH_DIR, SCANNER_ARCHIVE_DIR    File system paths
BELEGNUMMER_GET_URL, BELEGNUMMER_SET_URL  External invoice number API
PORT                                      Backend port (default 3100)
VITE_API_URL                              Override API URL in frontend (dev only)
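
A minimal sketch of reading a few of these variables with validation at startup. The `BackendConfig` shape and the function `loadConfig` are hypothetical, and treating the listed variables as mandatory is an assumption. The environment is passed in as a parameter (in the backend it would be `process.env`).

```typescript
// Hypothetical config shape covering a subset of the variables above.
interface BackendConfig {
  port: number;
  paperlessUrl: string;
  paperlessToken: string;
  ollamaUrl: string;
  ollamaModel: string;
}

function loadConfig(env: Record<string, string | undefined>): BackendConfig {
  // Fails fast with the variable name instead of a later, less obvious error.
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      throw new Error(`Missing environment variable ${name}`);
    }
    return value;
  };

  // PORT falls back to the documented default of 3100.
  const port = Number(env["PORT"] ?? "3100");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PORT must be a positive integer");
  }

  return {
    port,
    paperlessUrl: required("PAPERLESS_URL"),
    paperlessToken: required("PAPERLESS_TOKEN"),
    ollamaUrl: required("OLLAMA_URL"),
    ollamaModel: required("OLLAMA_MODEL"),
  };
}
```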