gigafibre-fsm/docs/architecture/overview.md
louispaulb beb6ddc5e5 docs: reorganize into architecture/features/reference/archive folders
All docs moved with git mv so --follow preserves history. Flattens the
single-folder layout into goal-oriented folders and adds a README.md index
at every level.

- docs/README.md — new landing page with "I want to…" intent table
- docs/architecture/ — overview, data-model, app-design
- docs/features/ — billing-payments, cpe-management, vision-ocr, flow-editor
- docs/reference/ — erpnext-item-diff, legacy-wizard/
- docs/archive/ — HANDOFF-2026-04-18, MIGRATION, status-snapshots/
- docs/assets/ — pptx sources, build scripts (fixed hardcoded path)
- roadmap.md gains a "Modules in production" section with clickable
  URLs for every ops/tech/portal route and admin surface
- Phase 4 (Customer Portal) flipped to "Largely Shipped" based on
  audit of services/targo-hub/lib/payments.js (16 endpoints, webhook,
  PPA cron, Klarna BNPL all live)
- Archive files get an "ARCHIVED" banner so stale links inside them
  don't mislead readers

Code comments + nginx configs rewritten to use new doc paths. Root
README.md documentation table replaced with intent-oriented index.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 11:51:33 -04:00


# Gigafibre FSM — Ecosystem Architecture
> Unified reference document for infrastructure, platform strategy, and application architecture on the remote Docker environment.
## 1. Executive Summary & Platform Strategy
Gigafibre FSM is the complete operations platform for Gigafibre. It replaces a polling-based legacy PHP system with a real-time, push-based stack (Vue 3, Node.js, ERPNext, TR-369).
The strategy centers on a **unified core platform** running entirely on a remote Proxmox VM (`96.125.196.67`):
- **ERPNext v16** as the undisputed Source of Truth (CRM, billing, ticketing).
- **Targo Ops PWA** as the single pane of glass for internal teams.
- **Targo Hub** as the real-time API gateway (SMS, SSE, AI, TR-069 proxy).
- **Client.gigafibre.ca** for customer self-service.

**Legacy Retirement Plan (April-May 2026):**
- *Retire* `dispatch-app` — Functionality now in Ops + lightweight mobile tech page (`/t/{token}`).
- *Retire* `apps/field` — Redundant to the mobile tech page workflow.
- *Retire* `auth.targo.ca` — Fully migrated to `id.gigafibre.ca` Authentik.
---
## 2. Infrastructure & Docker Networks
All services are containerized on a single Proxmox VM (`96.125.196.67`) and routed through Traefik.
```text
Internet
96.125.196.67 (Proxmox VM, Ubuntu 24.04)
├─ Traefik v2.11 (:80/:443, Let's Encrypt, ForwardAuth)
├─ Authentik SSO (id.gigafibre.ca) → Secures /ops/ and client portal
├─ ERPNext v16.10.1 (erp.gigafibre.ca) → 9 containers (db, redis, workers)
├─ Targo Ops App (erp.gigafibre.ca/ops/) → Served via nginx:alpine
├─ n8n (n8n.gigafibre.ca) → Auto-login proxy wired to Authentik headers
├─ Oktopus CE (oss.gigafibre.ca) → TR-369 CPE management
└─ WWW / Frontend (www.gigafibre.ca) → React marketing site
```
**DNS Configuration (Cloudflare):**
- Domain `gigafibre.ca` is strictly DNS-only (no Cloudflare proxy) so that Traefik can obtain Let's Encrypt certificates.
- Email via Mailjet + Google Workspace records configured on root.

**Docker Networks:**
- `proxy`: Public-facing network connected to Traefik.
- `erpnext_erpnext`: Internal network for Frappe, Postgres, Redis, and targo-hub routing.
---
## 3. Core Services
### ERPNext (The Backend)
- **Database:** PostgreSQL (`erpnext-db-1`).
- **Extensions:** Custom doctypes for Dispatch Job, Technician, Tag, Service Location, Service Equipment, Subscription.
- **API Token Auth:** `targo-hub` and the Ops PWA interact with Frappe via a highly-privileged service token (`Authorization: token ...`).
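
As an illustration of the token auth described above, here is a minimal sketch of how a backend script might build a request against Frappe's REST resource API. The header format (`token <api_key>:<api_secret>`) and the `/api/resource/<doctype>` route are standard Frappe; the credentials, the `status` field, and the `Open` value are placeholders assumed for the example, not values taken from this deployment. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_frappe_request(base_url, api_key, api_secret, doctype, filters=None):
    """Build (but do not send) a GET request against Frappe's REST resource API."""
    # Doctype names may contain spaces ("Dispatch Job"), so percent-encode them.
    path = "/api/resource/" + urllib.parse.quote(doctype)
    query = ""
    if filters:
        # Frappe expects `filters` as a JSON-encoded list of [field, operator, value].
        query = "?" + urllib.parse.urlencode({"filters": json.dumps(filters)})
    request = urllib.request.Request(base_url.rstrip("/") + path + query, method="GET")
    # Frappe token auth header: "token <api_key>:<api_secret>".
    request.add_header("Authorization", f"token {api_key}:{api_secret}")
    request.add_header("Accept", "application/json")
    return request


req = build_frappe_request(
    "https://erp.gigafibre.ca",
    api_key="EXAMPLE_KEY",        # placeholder, not a real credential
    api_secret="EXAMPLE_SECRET",  # placeholder, not a real credential
    doctype="Dispatch Job",
    filters=[["status", "=", "Open"]],  # hypothetical field and value
)
print(req.full_url)
```

Because the token is highly privileged, a real script would presumably load it from the environment or a secret store rather than hardcode it.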
### Targo-Hub (API Gateway)
- **Stack:** Node.js 20 (`msg.gigafibre.ca:3300`).
- **Purpose:** Handles the heavy or real-time workflows that fall outside ERPNext's scope.
- **Key Abilities:**
  - Real-time Server-Sent Events (SSE) for timeline/chat updates.
  - Twilio SMS / Voice (IVR) routing.
  - Modem polling (GenieACS, OLT SNMP proxy).
  - Webhook handling (Stripe payments, Uptime-Kuma, 3CX).
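
To make the SSE ability concrete, the sketch below serializes one frame in the `text/event-stream` wire format. It is written in Python for illustration only (the hub itself is Node.js), and the `timeline` event name and payload fields are assumptions, not the hub's actual schema.

```python
import json


def format_sse_event(event, payload, event_id=None):
    """Serialize one Server-Sent Events frame (text/event-stream wire format)."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # A data field cannot hold a raw newline, so every line of the body gets
    # its own "data:" prefix. json.dumps emits a single line by default.
    for chunk in json.dumps(payload).splitlines():
        lines.append(f"data: {chunk}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


# Hypothetical timeline update pushed to a connected Ops client.
frame = format_sse_event("timeline", {"ticket": "T-1", "text": "Tech en route"}, event_id=7)
print(frame, end="")
```

On the browser side, an `EventSource` listener registered for the `timeline` event would receive the JSON string in `event.data`.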
### Modem-Bridge
- **Stack:** Playwright/Chromium (`:3301` internal).
- **Purpose:** Allows reading encrypted TR-181 parameters from TP-Link XX230v modems by leveraging the modem's native JS cryptography. Exposes a simple JSON REST API locally to targo-hub.
### Vision / OCR (Gemini via targo-hub)
- **Model:** Gemini 2.5 Flash (Google) — no local GPU, all inference remote.
- **Endpoints (hub):** `/vision/barcodes`, `/vision/equipment`, `/vision/invoice`.
- **Why centralized:** ops VM has no GPU, so the legacy Ollama `llama3.2-vision` install was retired. All three frontends (ops, field-as-ops `/j`, future client portal) hit the hub, which enforces JSON `responseSchema` per endpoint.
- **Client-side resilience:** barcode scans use an 8s timeout + IndexedDB retry queue so techs in weak-LTE zones don't lose data. See [../features/vision-ocr.md](../features/vision-ocr.md) for the full pipeline.
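
The timeout-plus-retry-queue pattern can be sketched as follows. This is a language-neutral illustration in Python: an in-memory `deque` stands in for IndexedDB, and `send`, `offline_send`, and `online_send` are hypothetical callables rather than the actual client code.

```python
from collections import deque


def submit_scan(send, payload, retry_queue, timeout_s=8.0):
    """Try to send one scan; park it in the retry queue on timeout or network error."""
    try:
        return send(payload, timeout_s)
    except (TimeoutError, ConnectionError):
        retry_queue.append(payload)
        return None


def flush_retry_queue(send, retry_queue, timeout_s=8.0):
    """Replay queued scans oldest-first; stop at the first failure to keep order."""
    sent = 0
    while retry_queue:
        try:
            send(retry_queue[0], timeout_s)
        except (TimeoutError, ConnectionError):
            break
        retry_queue.popleft()  # drop the scan only after it was delivered
        sent += 1
    return sent


def offline_send(payload, timeout_s):
    # Simulates a weak-LTE zone: the hub never answers in time.
    raise TimeoutError(f"no response within {timeout_s:.0f}s")


def online_send(payload, timeout_s):
    # Simulates a healthy connection.
    return {"ok": True, "echo": payload}


queue = deque()
result = submit_scan(offline_send, {"barcode": "EXAMPLE123"}, queue)
flushed = flush_retry_queue(online_send, queue)
print(result, flushed, len(queue))
```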
---
## 4. Security & Authentication Flow
```text
User → app.gigafibre.ca
→ Traefik checks session via ForwardAuth middleware
→ Flow routed to Authentik (id.gigafibre.ca)
→ Authorized? Request forwarded to native container with 'X-Authentik-Email' header
```
- **ForwardAuth (`authentik-client@file`):** Currently protects `erp.gigafibre.ca/ops/`, `n8n`, and `hub`.
- **API Security:** Frontends go through the Authentik session proxy; backend services and scripts call Frappe's `/api/method` endpoints directly with an `Authorization: token` header.
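
A service sitting behind ForwardAuth typically reads the identity header that the proxy injects. The sketch below shows one way to do that, assuming the `X-Authentik-Email` header from the flow above; `handle_request` and its return shape are hypothetical. The header can only be trusted when the service is reachable exclusively through Traefik, since a direct caller could otherwise forge it.

```python
def authenticated_email(headers):
    """Return the email injected by the auth proxy, or None if it is absent."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    email = normalized.get("x-authentik-email", "").strip()
    return email or None


def handle_request(headers):
    """Hypothetical handler: reject requests that carry no proxy identity."""
    email = authenticated_email(headers)
    if email is None:
        return 401, "missing identity header"
    return 200, f"hello {email}"


print(handle_request({"X-Authentik-Email": "csr@example.com"}))
print(handle_request({}))
```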
---
## 5. Network Intelligence & CPE Flow
**Device Diagnostics (`targo-hub → GenieACS / OLT`)**

When a CSR clicks "Diagnostiquer" in the Ops app:
1. Ops app calls `/devices/lookup?serial=X` on `targo-hub`.
2. `targo-hub` polls GenieACS NBI.
3. If deep data is needed, `targo-hub` queries `modem-bridge` (for TP-Link) or the OLT SNMP directly.
4. `targo-hub` returns a consolidated array of interface, mesh, wifi, and opticalStatus data to the UI.
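
The lookup steps above can be sketched as a small orchestration function. This is an illustration under assumptions, not the hub's code: the three `query_*` callables, the `vendor` field used to choose between `modem-bridge` and the OLT, and the sample values are all hypothetical.

```python
def diagnose_device(serial, query_acs, query_modem_bridge, query_olt, needs_deep_data=False):
    """Merge ACS data with an optional deep query into one report dict."""
    # Step 2: baseline data from the ACS.
    report = dict(query_acs(serial))
    if needs_deep_data:
        # Step 3: TP-Link modems go through modem-bridge, everything else via OLT SNMP.
        if report.get("vendor") == "TP-Link":
            report.update(query_modem_bridge(serial))
        else:
            report.update(query_olt(serial))
    # Step 4: consolidated result handed back to the UI.
    return report


# Stub data sources standing in for GenieACS, modem-bridge, and the OLT.
def fake_acs(serial):
    return {"serial": serial, "vendor": "TP-Link", "interface": "up", "wifi": {"clients": 3}}


def fake_modem_bridge(serial):
    return {"mesh": {"nodes": 2}}


def fake_olt(serial):
    return {"opticalStatus": {"rx_dbm": -18.5}}


report = diagnose_device("EXAMPLE-SERIAL", fake_acs, fake_modem_bridge, fake_olt, needs_deep_data=True)
print(sorted(report))
```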
**Future: QR Code Flow**
- Tech applies QR sticker to modem (`msg.gigafibre.ca/q/{mac}`).
- Client scans QR → `targo-hub` identifies customer via MAC matching in ERPNext.
- Triggers SMS OTP → Client views diagnostic portal.
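
Since this flow is still planned, the following is only a sketch of two building blocks it would likely need: canonicalizing the MAC taken from the `/q/{mac}` path before matching it in ERPNext, and generating a one-time code. The function names, the canonical `AA:BB:CC:DD:EE:FF` form, and the 6-digit OTP length are assumptions.

```python
import re
import secrets

_HEX12 = re.compile(r"^[0-9A-F]{12}$")


def normalize_mac(raw):
    """Canonicalize a MAC address to AA:BB:CC:DD:EE:FF, or raise ValueError."""
    # Accept the common separators (colon, dash, dot) or none at all.
    compact = re.sub(r"[:.\-]", "", raw.strip()).upper()
    if not _HEX12.match(compact):
        raise ValueError(f"not a MAC address: {raw!r}")
    return ":".join(compact[i:i + 2] for i in range(0, 12, 2))


def mac_from_qr_path(path):
    """Extract and normalize the MAC from a '/q/{mac}' URL path."""
    prefix = "/q/"
    if not path.startswith(prefix):
        raise ValueError(f"unexpected path: {path!r}")
    return normalize_mac(path[len(prefix):])


def new_otp(digits=6):
    """Generate a numeric one-time code from a cryptographically secure source."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


print(mac_from_qr_path("/q/aa-bb-cc-dd-ee-ff"))
```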
---
## 6. Development Gotchas
1. **Traefik v3** is incompatible with Docker 29 due to API changes. Stay on v2.11.
2. **MongoDB 5+** (Oktopus) requires AVX extensions. Proxmox CPU must be set to `host`.
3. Never click "Generate Keys" for the Administrator user in ERPNext: it regenerates the API secret and invalidates the token that `targo-hub` relies on.
4. **Traccar API** supports only one `deviceId` per request. Use parallel polling.
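
For the Traccar gotcha, parallel polling can be sketched with a thread pool that issues one request per device. `fetch_one` is a hypothetical callable that would wrap the actual single-device HTTP call; the stub below only simulates it.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_positions(device_ids, fetch_one, max_workers=8):
    """Fetch each device in its own request, concurrently; return {device_id: result}."""
    if not device_ids:
        return {}
    workers = min(max_workers, len(device_ids))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with device_ids.
        results = pool.map(fetch_one, device_ids)
        return dict(zip(device_ids, results))


def fake_fetch(device_id):
    # Stand-in for one single-device request to the tracking API.
    return {"deviceId": device_id, "lat": 45.0, "lon": -73.0}


positions = poll_positions([101, 102, 103], fake_fetch)
print(positions[102])
```

An exception raised by any single fetch propagates when the results are collected, so a production version would probably catch errors per device instead of failing the whole batch.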