Mass refresh — the docs were last touched 2026-04-22, two weeks behind
shipped reality. This commit updates 9 files to reflect current truth.
WHAT CHANGED IN THE PRODUCT (since 22 Apr) THAT THE DOCS NOW REFLECT:
• Oktopus CE / TR-369 stack decommissioned (containers + volumes +
images all removed; broker had filled /dev/sdb with 75 GB of debug
logs and took ERPNext down for 4 days). Hub gates the integration
behind OKTOPUS_DISABLED=1 — modules retained, no-op'd at runtime.
• dispatch.gigafibre.ca (legacy PHP SPA) replaced by an nginx 301
redirect to /ops/#/dispatch.
• Top toolbar of the dispatch module: collapsed to single-color
Lucide icons + ⋯ overflow menu + "Vue principale ▾" + "[👥 N ▾]"
resource type chip (defaults to techs, materials in the dropdown
only when relevant).
• Tech home base / departure point: editable per-tech via 📍 button,
address geocode (Nominatim) or click-on-map picker, right-click
on tech pin opens the same actions. Map defaults centered on
Gigafibre HQ (1867 chemin de la Rivière, Sainte-Clotilde) instead
of downtown Montreal.
• POST /auth/users invite flow on the hub: creates the Authentik
user, sets a temp password, mails it via Mailjet (Authentik's
own recovery flow isn't configured), creates the matching ERPNext
System User. Surfaced in ops Settings → Utilisateurs → Inviter.
• Two Authentik instances clarified as parallel-and-permanent (not
a migration): auth.targo.ca for staff, id.gigafibre.ca for clients.
FILES TOUCHED:
README.md — service table refreshed, arch diagram redrawn (no
Oktopus row), auth section explains the invite flow + two
parallel instances.
docs/architecture/overview.md — new "Decommissioned" section,
correct retirement status for dispatch-app + apps/field, two
Authentik instances explicitly distinguished, dev-gotchas list
rewritten (drops MongoDB AVX, adds log-rotation hard-learned
lesson, adds note about Authentik recovery flow).
docs/architecture/data-model.md — Step 5 hardware provisioning
now describes the GenieACS path (TR-069 Inform → preset push)
instead of the dead TR-369 path.
docs/architecture/module-interactions.md — oktopus.js and
oktopus-mqtt.js entries marked as gated, provision.js note
updated, GenieACS row in external-integrations updated, MQTT
row removed from real-time channels, interaction matrix loses
the Oktopus column and gains an Authentik admin REST cell.
docs/features/dispatch.md — Top bar section completely rewritten
to match the current chrome (left/center/right regions,
single-color Lucide, dropdowns); new Tech home base section
documenting the 📍 + map-pick + right-click flows; retirement
note now reads as a status, not a plan.
docs/features/cpe-management.md — full rewrite. Oktopus migration
plan replaced by a "decommissioned" note + the existing GenieACS
+ modem-bridge architecture as the steady state. TP-Link XX230v
deep-dive sections preserved (still accurate).
docs/README.md, docs/features/README.md, docs/roadmap.md —
intent-table descriptions and live-URLs table corrected.
The docs/archive/ snapshots (2026-04-18, 2026-04-19) are untouched —
they're historical and should remain that way.
Gigafibre FSM — Ecosystem Architecture
Unified reference document for infrastructure, platform strategy, and application architecture on the remote Docker environment.
1. Executive Summary & Platform Strategy
Gigafibre FSM is the operations platform for Gigafibre. It replaces a
legacy PHP/MariaDB stack with a real-time push ecosystem (Vue 3,
Node.js, ERPNext) running on a single Proxmox VM at 96.125.196.67.
Core pillars:
- ERPNext v16 — undisputed Source of Truth (CRM, billing, ticketing).
- Ops SPA at erp.gigafibre.ca/ops/ — single pane of glass for internal teams (dispatch, clients, settings, agent flows).
- targo-hub at msg.gigafibre.ca — real-time API gateway (SMS, SSE, AI, OAuth admin, Stripe webhooks, Traccar proxy).
- Client portal at client.gigafibre.ca — customer self-service.
Decommissioned (May 2026):
- ✗ Oktopus CE (TR-369 stack at oss.gigafibre.ca) — broker spammed 75 GB of debug logs over 13 days and took ERPNext down for 4 days. Stack removed (containers + volumes + images). The hub gates the integration behind OKTOPUS_DISABLED=1 so the modules can be re-enabled later if we deploy a different USP controller.
- ✗ dispatch-app (legacy PHP SPA at dispatch.gigafibre.ca) — now 301-redirects to /ops/#/dispatch. nginx config at /opt/dispatch-app/nginx.conf on the prod box.
- ✗ apps/field — replaced by the lightweight mobile tech page at /t/{token} (server-rendered by services/targo-hub/lib/tech-mobile.js).
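The OKTOPUS_DISABLED=1 gate mentioned above could be sketched roughly as follows. This is a minimal illustration, not the hub's actual code: `isOktopusDisabled`, `createOktopusClient`, `realFactory`, and `pushConfig` are hypothetical names, and the flag is assumed to be read from the process environment.

```javascript
// Hypothetical sketch: not the actual targo-hub code.
// Returns true when the integration is switched off via the environment.
function isOktopusDisabled(env = process.env) {
  return env.OKTOPUS_DISABLED === '1';
}

// Wraps a factory so callers get a no-op client when the flag is set.
// `realFactory` stands in for whatever would build the real USP/MQTT client.
function createOktopusClient(realFactory, env = process.env) {
  if (isOktopusDisabled(env)) {
    return {
      enabled: false,
      // Resolves immediately without touching the network.
      async pushConfig() {
        return { skipped: true, reason: 'OKTOPUS_DISABLED=1' };
      },
    };
  }
  return { enabled: true, ...realFactory() };
}
```

Keeping the no-op behind the same interface means callers need no changes if a different USP controller is deployed later and the flag is cleared.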
Two Authentik instances, in parallel — not a migration:
- auth.targo.ca (staff) — protects /ops/, n8n, Gitea; OAuth provider for ERPNext sign-in.
- id.gigafibre.ca (clients) — protects the customer portal.
2. Infrastructure & Docker Networks
All services are containerized and housed on a single Proxmox VM (96.125.196.67), managed via Traefik.
Internet
│
96.125.196.67 (Proxmox VM, Ubuntu 24.04)
│
├─ Traefik v2.11 (:80/:443, Let's Encrypt, ForwardAuth)
│
├─ Authentik (auth.targo.ca) → SSO for staff (ops, n8n, Gitea, ERPNext OAuth)
├─ Authentik (id.gigafibre.ca) → SSO for client portal
│
├─ ERPNext v16.10.1 (erp.gigafibre.ca) → 9 containers (db, redis, backend, queues, scheduler, websocket, n8n, n8n-proxy)
│
├─ Ops SPA (erp.gigafibre.ca/ops/) → Served via nginx:alpine from /opt/ops-app/
├─ Dispatch redirect (dispatch.gigafibre.ca) → 301 → /ops/#/dispatch (former dispatch-app, decommissioned)
│
├─ targo-hub (msg.gigafibre.ca) → Node 20, /opt/targo-hub/
├─ DocuSeal (docs.gigafibre.ca) → Contract e-signature
├─ traccar-proxy → nginx relay for Traccar UI
│
└─ Marketing site (www.gigafibre.ca) → React/Vite/Tailwind
DNS Configuration (Cloudflare):
- Domain gigafibre.ca is strictly DNS-only (no Cloudflare proxy) so that Traefik can obtain Let's Encrypt certificates.
- Email via Mailjet + Google Workspace records configured on root.
Docker Networks:
- proxy: Public-facing network connected to Traefik.
- erpnext_erpnext: Internal network for Frappe, Postgres, Redis, and targo-hub routing.
3. Core Services
ERPNext (The Backend)
- Database: PostgreSQL (erpnext-db-1).
- Extensions: Custom doctypes for Dispatch Job, Technician, Tag, Service Location, Service Equipment, Subscription.
- API Token Auth: targo-hub and the Ops PWA interact with Frappe via a highly-privileged service token (Authorization: token ...).
Targo-Hub (API Gateway)
- Stack: Node.js 20 (msg.gigafibre.ca:3300).
- Purpose: Acts as the middleman for all heavy or real-time workflows out of ERPNext's scope.
- Key Abilities:
  - Real-time Server-Sent Events (SSE) for timeline/chat updates.
  - Twilio SMS / Voice (IVR) routing.
  - Modem polling (GenieACS, OLT SNMP proxy).
  - Webhooks handling (Stripe payments, Uptime-Kuma, 3CX).
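For readers unfamiliar with SSE, the wire format behind the timeline/chat updates is plain text. The helper below is a minimal sketch of serialising one event; `formatSseEvent` and the `timeline` event name are illustrative assumptions, not names taken from the hub.

```javascript
// Serialises one event in the text/event-stream wire format.
// Hypothetical helper: the hub's real event names and payloads may differ.
function formatSseEvent(event, data, id) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${event}`);
  // JSON.stringify without indentation emits no raw newlines,
  // so a single "data:" line is enough.
  lines.push(`data: ${JSON.stringify(data)}`);
  // A blank line terminates the event.
  return `${lines.join('\n')}\n\n`;
}

// Response headers an SSE endpoint typically sends before streaming.
const SSE_HEADERS = {
  'Content-Type': 'text/event-stream',
  'Cache-Control': 'no-cache',
  Connection: 'keep-alive',
};
```

A handler would write `SSE_HEADERS` once, then call `res.write(formatSseEvent(...))` for each update while the connection stays open.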
Modem-Bridge
- Stack: Playwright/Chromium (:3301 internal).
- Purpose: Allows reading encrypted TR-181 parameters from TP-Link XX230v modems by leveraging the modem's native JS cryptography. Exposes a simple JSON REST API locally to targo-hub.
Vision / OCR (Gemini via targo-hub)
- Model: Gemini 2.5 Flash (Google) — no local GPU, all inference remote.
- Endpoints (hub): /vision/barcodes, /vision/equipment, /vision/invoice.
- Why centralized: ops VM has no GPU, so the legacy Ollama llama3.2-vision install was retired. All three frontends (ops, field-as-ops/j, future client portal) hit the hub, which enforces JSON responseSchema per endpoint.
- Client-side resilience: barcode scans use an 8s timeout + IndexedDB retry queue so techs in weak-LTE zones don't lose data. See ../features/vision-ocr.md for the full pipeline.
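The client-side resilience pattern (8s timeout plus a retry queue) might look something like the sketch below. It is an illustration under stated assumptions: an in-memory array stands in for the IndexedDB queue, `fetchImpl` is an injectable parameter added for testing, and the function names are hypothetical. The real pipeline is documented in ../features/vision-ocr.md.

```javascript
// Hypothetical sketch: an in-memory array stands in for the IndexedDB queue.
const SCAN_TIMEOUT_MS = 8000;

// POSTs one scan to the hub; rejects if no response arrives within the timeout.
async function postScan(url, payload, { fetchImpl = fetch, timeoutMs = SCAN_TIMEOUT_MS } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return await res.json();
  } finally {
    clearTimeout(timer);
  }
}

// Tries to send; on any failure the payload is queued for a later retry.
async function scanOrQueue(url, payload, queue, opts) {
  try {
    return { sent: true, result: await postScan(url, payload, opts) };
  } catch (err) {
    queue.push({ url, payload });
    return { sent: false, error: String(err.message || err) };
  }
}

// Replays queued scans in order; items that fail again go back in the queue.
// Returns how many items are still waiting.
async function flushQueue(queue, opts) {
  const pending = queue.splice(0, queue.length);
  for (const item of pending) {
    try {
      await postScan(item.url, item.payload, opts);
    } catch {
      queue.push(item);
    }
  }
  return queue.length;
}
```

A caller would typically invoke `flushQueue` when the browser reports it is back online, so scans captured in a dead zone are delivered without the tech repeating them.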
4. Security & Authentication Flow
Staff user → erp.gigafibre.ca/ops/ (or n8n, Gitea)
→ Traefik checks session via ForwardAuth middleware
→ Outpost validates with Authentik staff (auth.targo.ca)
→ Authorized? Request forwarded to upstream container
with X-Authentik-Email + X-Authentik-Groups headers
→ Ops SPA reads X-Authentik-Email; useUserGroups maps groups
to in-app capabilities
Customer user → client.gigafibre.ca
→ Traefik checks session via separate ForwardAuth chain
→ Outpost validates with Authentik client (id.gigafibre.ca)
Two distinct ForwardAuth middlewares:
- authentik@file → backed by auth.targo.ca (staff)
- authentik-client@file → backed by id.gigafibre.ca (customers)
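As an illustration of how the forwarded headers can be turned into in-app capabilities, here is a minimal sketch. The group names, capability strings, and function names are hypothetical (they are not the contents of useUserGroups), and the groups header is assumed to arrive as a single pipe-separated value.

```javascript
// Hypothetical group names and capabilities, for illustration only.
const CAPABILITIES_BY_GROUP = new Map([
  ['ops-admin', ['dispatch.edit', 'users.invite', 'settings.edit']],
  ['ops-dispatch', ['dispatch.edit']],
  ['ops-readonly', []],
]);

// Assumes the outpost sends groups as one pipe-separated header value.
function parseGroups(headerValue) {
  if (!headerValue) return [];
  return headerValue.split('|').map((g) => g.trim()).filter(Boolean);
}

// Builds the union of capabilities granted by each recognised group.
// `headers` is expected to use lower-cased keys, as Node's http module provides.
function capabilitiesFor(headers) {
  const groups = parseGroups(headers['x-authentik-groups']);
  const caps = new Set();
  for (const group of groups) {
    for (const cap of CAPABILITIES_BY_GROUP.get(group) || []) caps.add(cap);
  }
  return {
    email: headers['x-authentik-email'] || null,
    groups,
    capabilities: [...caps].sort(),
  };
}
```

Unknown groups are ignored rather than rejected, so adding a group in Authentik never locks a user out before the mapping is updated.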
ERPNext OAuth — auth.targo.ca is also configured as a Frappe
Social Login Key (provider name Authentik). The login page at
/login shows both the password form and the "Login with Authentik"
button. OAuth client_id P0rFFdq2hhun7hOLwkF5zm87vvDqcVYAhLtoZnFX,
redirect_uri /api/method/frappe.integrations.oauth2_logins.custom/authentik.
Adding new users is centralized through the hub, not the Authentik
admin UI. The ops Settings page (Settings → Utilisateurs → Inviter)
hits POST /auth/users on msg.gigafibre.ca which:
- Creates the Authentik user (random username from local-part of email, password set explicitly), assigns OPS_GROUPS.
- Sets a temp password (readable, no look-alikes) and emails it via the hub's Mailjet SMTP — Authentik's own recovery flow isn't wired (flow_recovery=None on the brand) and its global SMTP is unset, so the hub does it directly.
- Creates the matching ERPNext User (System User, social_logins = [{provider:authentik, userid:email}]) so OAuth finds it on first login.
The temp password is also returned to the admin (UI shows it with a
copy button) so they can hand it over manually if Mailjet drops the
message. See services/targo-hub/lib/auth.js for the full flow.
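The invite steps can be sketched as below. This is a simplified illustration, not a copy of services/targo-hub/lib/auth.js: the `deps` adapters (`createAuthentikUser`, `sendMail`, `createErpUser`), the password length, and the exact alphabet are assumptions, and treating a mail failure as non-fatal is inferred from the note that the admin can hand the password over manually.

```javascript
const { randomInt } = require('node:crypto');

// Alphabet without look-alike characters (no 0/O/o, no 1/l/I/i).
const READABLE_ALPHABET = 'abcdefghjkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789';

// Picks each character with a cryptographically secure random index.
function generateTempPassword(length = 12) {
  let out = '';
  for (let i = 0; i < length; i += 1) {
    out += READABLE_ALPHABET[randomInt(READABLE_ALPHABET.length)];
  }
  return out;
}

// `deps` holds hypothetical adapters for Authentik, Mailjet, and ERPNext.
async function inviteUser(email, deps) {
  const tempPassword = generateTempPassword();
  await deps.createAuthentikUser({ email, password: tempPassword });

  let mailed = true;
  try {
    await deps.sendMail({ to: email, tempPassword });
  } catch {
    mailed = false; // the admin can still hand the password over manually
  }

  await deps.createErpUser({
    email,
    social_logins: [{ provider: 'authentik', userid: email }],
  });
  // The password is returned so the UI can show it with a copy button.
  return { email, tempPassword, mailed };
}
```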
API Security: frontends rely on the Authentik session cookie
forwarded by Traefik. Backend scripts and the hub use
Authorization: token <ERP_SERVICE_TOKEN> headers (Frappe's token scheme, not Bearer).
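A small sketch of building that header for a backend script follows. It assumes ERP_SERVICE_TOKEN has Frappe's usual `api_key:api_secret` shape; `frappeHeaders` is a hypothetical helper name.

```javascript
// Builds headers for a Frappe REST call using token auth.
// Assumes the token has the "api_key:api_secret" shape Frappe expects.
function frappeHeaders(serviceToken) {
  if (!serviceToken || !serviceToken.includes(':')) {
    throw new Error('expected a token of the form api_key:api_secret');
  }
  return {
    Authorization: `token ${serviceToken}`,
    Accept: 'application/json',
  };
}
```

Failing fast on a malformed token is a design choice here: a missing colon usually means the wrong secret was loaded, which is easier to diagnose at startup than from a 401 later.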
5. Network Intelligence & CPE Flow
Device Diagnostics (targo-hub → GenieACS / OLT)
When a CSR clicks "Diagnostiquer" in the Ops app:
- Ops app asks /devices/lookup?serial=X.
- targo-hub polls GenieACS NBI.
- If deep data is needed, targo-hub queries modem-bridge (for TP-Link) or the OLT SNMP directly.
- Returns consolidated interface, mesh, wifi, and opticalStatus array to the UI.
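The lookup steps above could be consolidated along these lines. This is a sketch under assumptions: the `clients` adapters (`genieacs.findBySerial`, `modemBridge.read`, `olt.opticalStatus`), the `vendor` field, and the response shape are all hypothetical, and for brevity the deep reads always run rather than only when needed.

```javascript
// `clients` holds hypothetical adapters for GenieACS, modem-bridge, and the OLT.
async function lookupDevice(serial, clients) {
  const device = await clients.genieacs.findBySerial(serial);
  if (!device) return { serial, found: false };

  // Deep reads run in parallel; one failing source should not sink the others.
  const [bridge, optical] = await Promise.allSettled([
    device.vendor === 'TP-Link' ? clients.modemBridge.read(serial) : Promise.resolve(null),
    clients.olt.opticalStatus(serial),
  ]);
  const deep = bridge.status === 'fulfilled' && bridge.value ? bridge.value : {};

  return {
    serial,
    found: true,
    interfaces: device.interfaces || [],
    mesh: deep.mesh || [],
    wifi: deep.wifi || [],
    opticalStatus: optical.status === 'fulfilled' ? optical.value : [],
  };
}
```

Returning empty arrays for unavailable sources keeps the UI contract stable: the diagnostic panel can render whatever arrived instead of failing on a single SNMP timeout.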
Future: QR Code Flow
- Tech applies QR sticker to modem (msg.gigafibre.ca/q/{mac}).
- Client scans QR → targo-hub identifies customer via MAC matching in ERPNext.
- Triggers SMS OTP → Client views diagnostic portal.
6. Development Gotchas
- Traefik v3 is incompatible with Docker 29 due to API changes. Stay on v2.11.
- Never click "Generate Keys" for the Administrator user in ERPNext — it breaks the targo-hub API token (silently).
- Traccar API supports only one deviceId per request. Use parallel polling (Promise.allSettled) — see services/targo-hub/lib/traccar.js.
- Docker log rotation is set globally via /etc/docker/daemon.json (max-size=100m, max-file=3). Applied at container creation — old containers keep their previous (uncapped) policy until you compose up -d --force-recreate them. We learned this the hard way when the Oktopus broker filled /dev/sdb with 75 GB of debug logs in 13 days.
- Weekly prune runs via /etc/cron.d/docker-prune Sunday 03:00 ET — clears anything not used in 30 days. Don't add a stack you only run monthly without restart: always or it'll get pruned out.
- PostgreSQL transaction-aborted errors in the backend log — usually benign (one bad query in the Frappe scheduler) but if persistent, it's the connection pool needing a recycle. docker restart erpnext-backend-1 resolves.
- Authentik recovery flow isn't configured on the brand. Don't use recovery_email/ from the API — use the hub invite flow described in §4 instead.
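The Traccar gotcha in the list above (one deviceId per request, polled in parallel with Promise.allSettled) can be illustrated with the sketch below. It is not the contents of services/targo-hub/lib/traccar.js: `fetchPositions`, `baseUrl`, and the injectable `fetchImpl` are illustrative, and authentication is omitted.

```javascript
// Fetches the latest positions for several devices, one request per device.
// `baseUrl` and `fetchImpl` are illustrative parameters; authentication is omitted.
async function fetchPositions(baseUrl, deviceIds, fetchImpl = fetch) {
  const results = await Promise.allSettled(
    deviceIds.map(async (id) => {
      const res = await fetchImpl(`${baseUrl}/api/positions?deviceId=${encodeURIComponent(id)}`);
      if (!res.ok) throw new Error(`device ${id}: HTTP ${res.status}`);
      return res.json();
    }),
  );

  const positions = [];
  const failed = [];
  results.forEach((r, i) => {
    // allSettled preserves input order, so index i maps back to deviceIds[i].
    if (r.status === 'fulfilled') positions.push(...r.value);
    else failed.push(deviceIds[i]);
  });
  return { positions, failed };
}
```

Using `Promise.allSettled` rather than `Promise.all` means one unreachable tracker does not blank the whole dispatch map; the caller gets the positions that succeeded plus the list of device ids to retry.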