# Gigafibre FSM — Ecosystem Architecture

> Unified reference document for infrastructure, platform strategy, and application architecture on the remote Docker environment.

## 1. Executive Summary & Platform Strategy

Gigafibre FSM is the operations platform for Gigafibre. It replaces a legacy PHP/MariaDB stack with a real-time push ecosystem (Vue 3, Node.js, ERPNext) running on a single Proxmox VM at `96.125.196.67`.

Core pillars:

- **ERPNext v16** — undisputed Source of Truth (CRM, billing, ticketing).
- **Ops SPA** at `erp.gigafibre.ca/ops/` — single pane of glass for internal teams (dispatch, clients, settings, agent flows).
- **targo-hub** at `msg.gigafibre.ca` — real-time API gateway (SMS, SSE, AI, OAuth admin, Stripe webhooks, Traccar proxy).
- **Client portal** at `client.gigafibre.ca` — customer self-service.

**Decommissioned (May 2026):**

- ✗ `Oktopus CE` (TR-369 stack at `oss.gigafibre.ca`) — broker spammed 75 GB of debug logs over 13 days, took ERPNext down for 4. Stack removed (containers + volumes + images). The hub gates the integration behind `OKTOPUS_DISABLED=1` so the modules can be re-enabled later if we deploy a different USP controller.
- ✗ `dispatch-app` (legacy PHP SPA at `dispatch.gigafibre.ca`) — now 301-redirects to `/ops/#/dispatch`. nginx config at `/opt/dispatch-app/nginx.conf` on the prod box.
- ✗ `apps/field` — replaced by the lightweight mobile tech page at `/t/{token}` (server-rendered by `services/targo-hub/lib/tech-mobile.js`).

**Two Authentik instances, in parallel — not a migration:**

- `auth.targo.ca` (staff) — protects /ops/, n8n, Gitea; OAuth provider for ERPNext sign-in.
- `id.gigafibre.ca` (clients) — protects the customer portal.

---

## 2. Infrastructure & Docker Networks

All services are containerized and housed on a single Proxmox VM (`96.125.196.67`), managed via Traefik.
```text
Internet
│
96.125.196.67 (Proxmox VM, Ubuntu 24.04)
│
├─ Traefik v2.11 (:80/:443, Let's Encrypt, ForwardAuth)
│
├─ Authentik (auth.targo.ca) → SSO for staff (ops, n8n, Gitea, ERPNext OAuth)
├─ Authentik (id.gigafibre.ca) → SSO for client portal
│
├─ ERPNext v16.10.1 (erp.gigafibre.ca) → 9 containers (db, redis, backend, queues, scheduler, websocket, n8n, n8n-proxy)
│
├─ Ops SPA (erp.gigafibre.ca/ops/) → Served via nginx:alpine from /opt/ops-app/
├─ Dispatch redirect (dispatch.gigafibre.ca) → 301 → /ops/#/dispatch (former dispatch-app, decommissioned)
│
├─ targo-hub (msg.gigafibre.ca) → Node 20, /opt/targo-hub/
├─ DocuSeal (docs.gigafibre.ca) → Contract e-signature
├─ traccar-proxy → nginx relay for Traccar UI
│
└─ Marketing site (www.gigafibre.ca) → React/Vite/Tailwind
```

**DNS Configuration (Cloudflare):**

- Domain `gigafibre.ca` is strictly DNS-only (no Cloudflare proxy) to allow Traefik's Let's Encrypt certificate generation.
- Email via Mailjet + Google Workspace records configured on root.

**Docker Networks:**

- `proxy`: Public-facing network connected to Traefik.
- `erpnext_erpnext`: Internal network for Frappe, Postgres, Redis, and targo-hub routing.

---

## 3. Core Services

### ERPNext (The Backend)

- **Database:** PostgreSQL (`erpnext-db-1`).
- **Extensions:** Custom doctypes for Dispatch Job, Technician, Tag, Service Location, Service Equipment, Subscription.
- **API Token Auth:** `targo-hub` and the Ops PWA interact with Frappe via a highly-privileged service token (`Authorization: token ...`).

### Targo-Hub (API Gateway)

- **Stack:** Node.js 20 (`msg.gigafibre.ca:3300`).
- **Purpose:** Acts as the middleman for all heavy or real-time workflows out of ERPNext's scope.
- **Key Abilities:**
  - Real-time Server-Sent Events (SSE) for timeline/chat updates.
  - Twilio SMS / Voice (IVR) routing.
  - Modem polling (GenieACS, OLT SNMP proxy).
  - Webhook handling (Stripe payments, Uptime-Kuma, 3CX).

### Modem-Bridge

- **Stack:** Playwright/Chromium (`:3301` internal).
- **Purpose:** Allows reading encrypted TR-181 parameters from TP-Link XX230v modems by leveraging the modem's native JS cryptography. Exposes a simple JSON REST API locally to targo-hub.

### Vision / OCR (Gemini via targo-hub)

- **Model:** Gemini 2.5 Flash (Google) — no local GPU, all inference remote.
- **Endpoints (hub):** `/vision/barcodes`, `/vision/equipment`, `/vision/invoice`.
- **Why centralized:** ops VM has no GPU, so the legacy Ollama `llama3.2-vision` install was retired. All three frontends (ops, field-as-ops `/j`, future client portal) hit the hub, which enforces JSON `responseSchema` per endpoint.
- **Client-side resilience:** barcode scans use an 8s timeout + IndexedDB retry queue so techs in weak-LTE zones don't lose data.

See [../features/vision-ocr.md](../features/vision-ocr.md) for the full pipeline.

---

## 4. Security & Authentication Flow

```text
Staff user → erp.gigafibre.ca/ops/ (or n8n, Gitea)
  → Traefik checks session via ForwardAuth middleware
  → Outpost validates with Authentik staff (auth.targo.ca)
  → Authorized? Request forwarded to upstream container with X-Authentik-Email + X-Authentik-Groups headers
  → Ops SPA reads X-Authentik-Email; useUserGroups maps groups to in-app capabilities

Customer user → client.gigafibre.ca
  → Traefik checks session via separate ForwardAuth chain
  → Outpost validates with Authentik client (id.gigafibre.ca)
```

**Two distinct ForwardAuth middlewares**:

- `authentik@file` → backed by `auth.targo.ca` (staff)
- `authentik-client@file` → backed by `id.gigafibre.ca` (customers)

**ERPNext OAuth** — `auth.targo.ca` is also configured as a Frappe Social Login Key (provider name `Authentik`). The login page at `/login` shows both the password form and the "Login with Authentik" button. OAuth client_id `P0rFFdq2hhun7hOLwkF5zm87vvDqcVYAhLtoZnFX`, redirect_uri `/api/method/frappe.integrations.oauth2_logins.custom/authentik`.

**Adding new users** is centralized through the hub, not the Authentik admin UI.
The ops Settings page (`Settings → Utilisateurs → Inviter`) hits `POST /auth/users` on `msg.gigafibre.ca`, which:

1. Creates the Authentik user (random username from local-part of email, password set explicitly), assigns `OPS_GROUPS`.
2. Sets a temp password (readable, no look-alikes) and emails it via the hub's Mailjet SMTP — Authentik's own recovery flow isn't wired (`flow_recovery=None` on the brand) and its global SMTP is unset, so the hub does it directly.
3. Creates the matching ERPNext User (System User, social_logins = [{provider:authentik, userid:email}]) so OAuth finds it on first login.

The temp password is also returned to the admin (UI shows it with a copy button) so they can hand it over manually if Mailjet drops the message. See `services/targo-hub/lib/auth.js` for the full flow.

**API Security**: frontends rely on the Authentik session cookie forwarded by Traefik. Backend scripts and the hub use `Authorization: token ...` headers, the same service-token scheme described in §3.

---

## 5. Network Intelligence & CPE Flow

**Device Diagnostics (`targo-hub → GenieACS / OLT`)**

When a CSR clicks "Diagnostiquer" in the Ops app:

1. Ops app asks `/devices/lookup?serial=X`.
2. `targo-hub` polls GenieACS NBI.
3. If deep data is needed, `targo-hub` queries `modem-bridge` (for TP-Link) or the OLT SNMP directly.
4. Returns consolidated interface, mesh, wifi, and opticalStatus array to the UI.

**Future: QR Code Flow**

- Tech applies QR sticker to modem (`msg.gigafibre.ca/q/{mac}`).
- Client scans QR → `targo-hub` identifies customer via MAC matching in ERPNext.
- Triggers SMS OTP → Client views diagnostic portal.

---

## 6. Development Gotchas

1. **Traefik v3** is incompatible with Docker 29 due to API changes. Stay on v2.11.
2. **Never click "Generate Keys"** for the Administrator user in ERPNext — it breaks the `targo-hub` API token (silently).
3. **Traccar API** supports only one `deviceId` per request. Use parallel polling (`Promise.allSettled`) — see `services/targo-hub/lib/traccar.js`.
4. **Docker log rotation** is set globally via `/etc/docker/daemon.json` (`max-size=100m, max-file=3`). Applied at container creation — old containers keep their previous (uncapped) policy until you `compose up -d --force-recreate` them. We learned this the hard way when the Oktopus broker filled `/var/sdb` with 75 GB of debug logs in 13 days.
5. **Weekly prune** runs via `/etc/cron.d/docker-prune` every Sunday at 03:00 ET — clears anything not used in 30 days. Don't add a stack you only run monthly without `restart: always` or it'll get pruned out.
6. **PostgreSQL transaction-aborted errors** in the backend log — usually benign (one bad query in the Frappe scheduler) but if persistent, it's the connection pool needing a recycle. `docker restart erpnext-backend-1` resolves it.
7. **Authentik recovery flow** isn't configured on the brand. Don't use `recovery_email/` from the API — use the hub invite flow described in §4 instead.
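
The parallel polling pattern from gotcha 3 can be illustrated with a minimal sketch. It is not the code in `services/targo-hub/lib/traccar.js`: `pollDevices`, `FetchOne`, and `PollResult` are hypothetical names, and the actual Traccar request is left to an injected `fetchOne` callback because its endpoint and credentials are not described here. The point is only that `Promise.allSettled` issues one request per `deviceId` and keeps the successful answers even when some devices fail.

```typescript
// Hypothetical helper for illustration; names are assumptions, not hub code.
type FetchOne<T> = (deviceId: number) => Promise<T>;

interface PollResult<T> {
  ok: Map<number, T>;          // deviceId -> payload of fulfilled requests
  failed: Map<number, string>; // deviceId -> error message of rejected requests
}

async function pollDevices<T>(
  deviceIds: number[],
  fetchOne: FetchOne<T>,
): Promise<PollResult<T>> {
  // One request per device, all started together. allSettled never rejects,
  // so a single failing device does not discard the other results.
  const settled = await Promise.allSettled(deviceIds.map((id) => fetchOne(id)));

  const ok = new Map<number, T>();
  const failed = new Map<number, string>();

  deviceIds.forEach((id, i) => {
    const outcome = settled[i];
    if (outcome === undefined) return; // defensive; lengths always match
    if (outcome.status === "fulfilled") {
      ok.set(id, outcome.value);
    } else {
      const reason: unknown = outcome.reason;
      failed.set(id, reason instanceof Error ? reason.message : String(reason));
    }
  });

  return { ok, failed };
}
```

In the hub, `fetchOne` would wrap whatever single-`deviceId` Traccar call is being made. Returning the failures alongside the successes lets the caller decide whether to retry only the devices that failed.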