NODE_ENV=production disables synchronize (destructive ADD/DROP COLUMN churn on MariaDB that blew past the 8126-byte row-size limit) and enables migrationsRun. Adds a new data-source.ts as the single config source (runtime + CLI) and a migrations workflow (generate/run/revert), including dotenv. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Paperless Manager is a document automation platform that extends Paperless-NGX. It provides:
- Scanner inbox management with PDF processing (splitting, rotation, barcode detection)
- Email import with attachment extraction
- Rule-based postprocessing (tag, export, send mail)
- OCR via Ollama (vision model)
- Label printing agent with SSE-based job queue
UI labels and comments are in German.
Structure
paperless-backend/ # NestJS API (port 3100)
paperless-frontend/ # React 19 + Ant Design + Vite
docker-compose.yml # Production (backend + frontend/nginx + MySQL)
.env.example # All required environment variables
Commands
Backend (in paperless-backend/):
npm run start:dev # Dev server with watch mode
npm run build # Compile TypeScript
npm run test # Jest unit tests
npm run test:e2e # End-to-end tests
npm run lint # ESLint with auto-fix
Frontend (in paperless-frontend/):
npm run dev # Vite dev server (proxies /api to backend)
npm run build # TypeScript check + Vite build
npm run lint # ESLint
Backend Architecture
Module Overview
| Module | Purpose |
|---|---|
| auth | JWT (OIDC/Authentik) + API Key guards, permission system |
| database | TypeORM entities + MySQL/MariaDB config (synchronize in dev, migrations in production) |
| inbox | Scanner document management (list, preview, rotate, split) |
| paperless | Paperless-NGX REST API client |
| preprocessing | PDF→images, OCR, QR extraction, task creation |
| postprocessing | Rule engine: filter conditions → actions |
| email | IMAP intake, PDF conversion, correspondent mapping |
| email-download | External email retrieval, ZUGFeRD invoice parsing |
| barcode | QR/barcode detection, template-based split actions |
| label-print-agent | SVG label rendering, RxJS-based print job queue + SSE |
| inbox-postprocessor | Applies edits/splits to PDFs, variable substitution |
| scanner | chokidar file system watcher for scan directory |
| settings | Global configuration |
| user-settings | Per-user SMTP and preferences |
Authentication & Permissions
Global guards apply to all routes:
- JwtOrApiKeyGuard: validates Bearer JWT (OIDC) or X-API-Key header
- PermissionsGuard: enforces the @RequirePermissions() decorator
Decorate controllers/handlers with:
@RequirePermissions(Permission.VIEW_SCANNER)
Permissions map from OIDC groups (PM_Admin, PM_Belege, etc.) to the Permission enum in src/auth/permissions.enum.ts.
Use @Public() to bypass auth guards entirely.
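The group-to-permission mapping can be pictured with a small dependency-free sketch. Only Permission.VIEW_SCANNER and the group names PM_Admin and PM_Belege come from this document; the second enum member, the mapping table, and both function names are hypothetical and are not the contents of src/auth/permissions.enum.ts.

```typescript
// Sketch only: VIEW_SCANNER, PM_Admin and PM_Belege appear in this document.
// MANAGE_SETTINGS and the mapping below are assumptions for illustration.
enum Permission {
  VIEW_SCANNER = "VIEW_SCANNER",
  MANAGE_SETTINGS = "MANAGE_SETTINGS", // hypothetical member
}

// Hypothetical table: which OIDC group grants which permissions.
const GROUP_PERMISSIONS: Record<string, Permission[]> = {
  PM_Admin: [Permission.VIEW_SCANNER, Permission.MANAGE_SETTINGS],
  PM_Belege: [Permission.VIEW_SCANNER],
};

// Collect the distinct permissions granted by a user's OIDC groups.
// Unknown groups grant nothing.
function permissionsForGroups(groups: string[]): Set<Permission> {
  const granted = new Set<Permission>();
  for (const group of groups) {
    for (const permission of GROUP_PERMISSIONS[group] ?? []) {
      granted.add(permission);
    }
  }
  return granted;
}

// A handler is allowed only if every required permission was granted.
function hasPermissions(granted: Set<Permission>, required: Permission[]): boolean {
  return required.every((permission) => granted.has(permission));
}
```

A guard in the spirit of PermissionsGuard would compare the permissions listed in @RequirePermissions() against the set derived from the token's groups.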
Database
- TypeORM with MySQL/MariaDB, UTF8MB4
- Connection config lives in src/database/data-source.ts (single source of truth, shared by the NestJS runtime and the TypeORM CLI). database.module.ts consumes it.
- Schema strategy depends on NODE_ENV:
  - Dev (NODE_ENV unset): synchronize: true, so the schema is synced automatically from the entities.
  - Production (NODE_ENV=production): synchronize: false + migrationsRun: true, so pending migrations in src/database/migrations/ are applied automatically on boot. Never run synchronize against the production DB: on MariaDB it issues destructive ADD/DROP COLUMN churn every boot (see caveat below).
- 23 entities in src/database/entities/
- JSON columns use transformers to normalize empty arrays/objects to null
- Key entities: InboxDocument, Task, Email, Attachment, Postprocessing, BarcodeTemplate, LabelPrintJob, ApiKey, Setting
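Two of the points above can be sketched without any dependencies: the NODE_ENV switch and the empty-JSON normalization. This is an illustration, not the contents of data-source.ts. The names schemaStrategy, isEmptyJson, and emptyJsonToNull are hypothetical, and the sketch assumes that any NODE_ENV other than production counts as dev, that migrationsRun is off in dev, and that normalization happens on write.

```typescript
// The two TypeORM DataSource flags this section talks about.
interface SchemaStrategy {
  synchronize: boolean;
  migrationsRun: boolean;
}

// Assumption: every value other than "production" (including unset) is dev.
function schemaStrategy(nodeEnv: string | undefined): SchemaStrategy {
  const isProduction = nodeEnv === "production";
  return { synchronize: !isProduction, migrationsRun: isProduction };
}

// True for null/undefined, [] and {}.
function isEmptyJson(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "object") return Object.keys(value as object).length === 0;
  return false;
}

// Same { to, from } shape as a TypeORM column transformer, declared locally
// so the sketch stays self-contained.
const emptyJsonToNull = {
  // Entity value -> database value: store empty containers as NULL.
  to(value: unknown): unknown {
    return isEmptyJson(value) ? null : value;
  },
  // Database value -> entity value: passed through unchanged (assumption).
  from(value: unknown): unknown {
    return value;
  },
};
```

In a real entity, an object like this would be passed as the transformer option of @Column({ type: 'json' }).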
Migrations workflow
npm run migration:generate -- src/database/migrations/<Name> # diff entities → DB
npm run migration:run # apply pending
npm run migration:revert # roll back last
The existing production schema is the implicit baseline (no baseline migration); only future changes get migration files.
Caveat: MariaDB reports json as longtext. Every @Column({ type: 'json' })
is stored as longtext ... CHECK (json_valid(...)), so migration:generate always
emits spurious no-op CHANGE statements (json↔longtext, nullable/default re-declares).
Hand-trim generated migrations down to the real change before committing. This same
false diff is why synchronize must stay off in production: left unchecked, it
accumulates ALGORITHM=INSTANT drops until a table trips the 8126-byte row-size limit.
Document Processing Pipeline
Preprocessing (preprocessing/document-pipeline.service.ts):
- PDF → PNG page images (200 DPI via pdf-lib + sharp)
- QR code extraction from page 1 (jsqr)
- OCR via Ollama (llava vision model)
- Auto-generated internal document number (YYYY-000001)
- DB task entry creation, archive original (GoBD compliance)
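The YYYY-000001 number format can be illustrated with a short helper. This is a sketch under assumptions: the function name formatDocumentNumber is hypothetical, and how the pipeline obtains the next sequence value (for example, whether it resets each year) is not specified here.

```typescript
// Format an internal document number as YYYY-NNNNNN (six digits, zero-padded).
// Hypothetical helper; the sequence source is outside the scope of this sketch.
function formatDocumentNumber(year: number, sequence: number): string {
  if (!Number.isInteger(sequence) || sequence < 1 || sequence > 999999) {
    throw new RangeError("sequence must be an integer between 1 and 999999");
  }
  const padded = ("000000" + String(sequence)).slice(-6);
  return String(year) + "-" + padded;
}

// formatDocumentNumber(2025, 1) returns "2025-000001"
```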
Postprocessing (postprocessing/postprocessing.service.ts):
- Load rules ordered by priority with filter conditions (JSON)
- Evaluate filters (field operators: eq, contains, in, etc.)
- Execute actions: tag, update field, send mail, export WebDAV
- Log results; apply error tag on failure; stop if NoFurther flag set
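The rule loop described above can be sketched as a pure function. All type and function names here are hypothetical. The sketch assumes that a lower priority number runs first, that all filter conditions of a rule must match (logical AND), and that actions are plain strings; logging and error tagging are left out.

```typescript
type Operator = "eq" | "contains" | "in";

interface FilterCondition {
  field: string;
  operator: Operator;
  value: unknown;
}

interface Rule {
  priority: number;
  filters: FilterCondition[];
  actions: string[];
  noFurther: boolean; // stop evaluating later rules once this rule matched
}

type DocumentFields = Record<string, unknown>;

// Evaluate a single filter condition against the document's fields.
function matchesCondition(condition: FilterCondition, doc: DocumentFields): boolean {
  const actual = doc[condition.field];
  switch (condition.operator) {
    case "eq":
      return actual === condition.value;
    case "contains":
      return (
        typeof actual === "string" &&
        typeof condition.value === "string" &&
        actual.indexOf(condition.value) !== -1
      );
    case "in":
      return Array.isArray(condition.value) && condition.value.indexOf(actual) !== -1;
    default:
      return false;
  }
}

// Walk the rules in priority order and collect the actions of matching rules.
function collectActions(rules: Rule[], doc: DocumentFields): string[] {
  const ordered = [...rules].sort((a, b) => a.priority - b.priority);
  const actions: string[] = [];
  for (const rule of ordered) {
    if (!rule.filters.every((condition) => matchesCondition(condition, doc))) continue;
    actions.push(...rule.actions);
    if (rule.noFurther) break;
  }
  return actions;
}
```

Keeping the evaluation separate from action execution makes the matching logic easy to unit-test without a Paperless-NGX instance.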
Inbox Postprocessor (inbox-postprocessor/):
- Applies virtual edits (page deletions, rotations, splits) stored in DB to original PDF
- Resolves variable templates from barcode/task data
- Returns PDF buffer for upload to Paperless-NGX
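Variable substitution can be illustrated as follows. The placeholder syntax is not specified in this document, so the {{name}} form, the function name resolveTemplate, and the choice to leave unknown placeholders untouched are all assumptions made for this sketch.

```typescript
// Replace {{key}} placeholders (hypothetical syntax) with values taken from
// barcode/task data. Unknown keys are left in place so they stay visible.
function resolveTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (placeholder: string, key: string): string => {
      if (!Object.prototype.hasOwnProperty.call(variables, key)) return placeholder;
      return variables[key] ?? placeholder;
    },
  );
}

// resolveTemplate("Beleg {{ number }}", { number: "2025-000001" })
// returns "Beleg 2025-000001"
```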
Label Print Agent
- Uses RxJS Subject (newJob$) to stream print jobs via SSE to connected agents
- Jobs are locked for 5 minutes to prevent race conditions
- SVG templates rendered to PNG via @resvg/resvg-js
- API documented in docs/BACKEND_API.md
Frontend Architecture
- Auth: OIDC flow via oidc-client-ts, AuthContext provides user identity and tokens
- API layer: Axios-based client methods in src/api/ (one file per domain)
- Routing: React Router v7 in src/App.tsx
- Key components: DocumentEditModal, PdfSplitViewer, WysiwygEditor (TipTap), BarcodePositioner
- During dev, Vite proxies /api to localhost:3100
Environment Variables
Key variables (see .env.example for full list):
| Variable | Purpose |
|---|---|
| DB_* | MySQL connection |
| PAPERLESS_URL, PAPERLESS_TOKEN | Paperless-NGX API |
| OIDC_ISSUER, OIDC_CLIENT_ID | OIDC provider (Authentik) |
| OLLAMA_URL, OLLAMA_MODEL | OCR service |
| SCANNER_WATCH_DIR, SCANNER_ARCHIVE_DIR | File system paths |
| BELEGNUMMER_GET_URL, BELEGNUMMER_SET_URL | External invoice number API |
| PORT | Backend port (default 3100) |
| VITE_API_URL | Override API URL in frontend (dev only) |
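Reading a few of these variables with validation might look like the sketch below. The names BackendConfig, requireEnv, and loadBackendConfig are hypothetical, and treating PAPERLESS_URL and PAPERLESS_TOKEN as mandatory is an assumption; only the PORT default of 3100 comes from the table.

```typescript
interface BackendConfig {
  port: number;
  paperlessUrl: string;
  paperlessToken: string;
}

type Env = Record<string, string | undefined>;

// Fail fast with a clear message when a variable is missing or empty.
function requireEnv(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Build a typed config object; PORT falls back to 3100 when unset.
function loadBackendConfig(env: Env): BackendConfig {
  const rawPort = env["PORT"];
  const port = rawPort === undefined ? 3100 : Number(rawPort);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    port,
    paperlessUrl: requireEnv(env, "PAPERLESS_URL"),
    paperlessToken: requireEnv(env, "PAPERLESS_TOKEN"),
  };
}
```

In the backend this would be called once at startup with process.env, so a misconfigured deployment fails on boot rather than on the first request.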