# Vision & OCR Pipeline
All vision runs on Gemini 2.5 Flash via targo-hub — no local Ollama. The ops/ERPNext VM has no GPU, so every vision request (bills, barcodes, equipment labels) goes to Google's Gemini API through a single backend service and is normalized before it reaches the frontend.
Last refreshed: 2026-04-22 (cutover from Ollama → Gemini)
## 1. Architecture at a glance
```
┌──────────────────┐        ┌───────────────────────┐
│  apps/ops (PWA)  │        │  apps/field (PWA)     │
│  /ops/*          │        │  /field/* (retiring)  │
└────────┬─────────┘        └──────────┬────────────┘
         │                             │
         │ src/api/ocr.js              │ src/api/ocr.js
         │ {ocrBill, scanBarcodes,     │ {ocrBill, scanBarcodes,
         │  scanEquipmentLabel}        │  checkOllamaStatus}
         │                             │
         └──────────────┬──────────────┘
                        │ POST https://msg.gigafibre.ca/vision/*
                        ▼
            ┌───────────────────────┐
            │ targo-hub             │
            │ lib/vision.js         │
            │  ├─ /vision/barcodes  │
            │  ├─ /vision/equipment │
            │  └─ /vision/invoice   │
            └──────────┬────────────┘
                       │ generativelanguage.googleapis.com
                       ▼
            ┌───────────────────────┐
            │ Gemini 2.5 Flash      │
            │ (text + image, JSON   │
            │  responseSchema)      │
            └───────────────────────┘
```
Why route everything through the hub:
- No GPU on ops VM. The only machine with a local Ollama was retired in Phase 2.5. Centralizing on Gemini means the frontend stops caring where inference happens.
- Single AI_API_KEY rotation surface. Key lives in the hub env only.
- Schema guarantees. Gemini supports `responseSchema` in the v1beta API — the hub enforces it per endpoint, so the frontend can trust the JSON shape without defensive parsing.
- Observability. Every call is logged in the hub with image size, model, latency, and an output preview (first 300 chars).
## 2. Hub endpoints (`services/targo-hub/lib/vision.js`)
All three endpoints:
- are `POST` with JSON body `{ image: <base64 or data URI> }`,
- return structured JSON (see per-endpoint schemas below),
- require `AI_API_KEY` in the hub environment,
- are unauthenticated from the browser (rate-limiting is the hub's job).
### POST /vision/barcodes
Extracts up to 3 identifiers (serials, MACs, GPON SNs, barcodes).
```json
{
  "barcodes": ["1608K44D9E79FAFF5", "0418D6A1B2C3", "TPLG-A1B2C3D4"]
}
```
Used by: tech scan page, equipment link dialog, invoice scan (fallback).
### POST /vision/equipment
Structured equipment-label parse (ONT/ONU/router/modem).
```json
{
  "brand": "TP-Link",
  "model": "XX230v",
  "serial_number": "2234567890ABCD",
  "mac_address": "0418D6A1B2C3",
  "gpon_sn": "TPLGA1B2C3D4",
  "hw_version": "1.0",
  "equipment_type": "ont",
  "barcodes": ["..."]
}
```
Post-processing: `mac_address` is stripped of separators and uppercased; `serial_number` is trimmed of whitespace.

Used by: `useEquipmentActions` in the ops client detail page to pre-fill a "create Service Equipment" dialog.
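The post-processing above can be sketched as a small normalizer. A minimal sketch — the function name is illustrative, not the hub's actual symbol:

```javascript
// Sketch of the /vision/equipment post-processing described above.
// normalizeEquipment is an illustrative name, not the hub's real export.
function normalizeEquipment(parsed) {
  const out = { ...parsed };
  if (typeof out.mac_address === 'string') {
    // Strip separators (colons, dashes, dots, spaces) and uppercase.
    out.mac_address = out.mac_address.replace(/[^0-9a-fA-F]/g, '').toUpperCase();
  }
  if (typeof out.serial_number === 'string') {
    out.serial_number = out.serial_number.trim();
  }
  return out;
}
```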
### POST /vision/invoice
Structured invoice/bill OCR. Canadian-tax-aware (GST/TPS + QST/TVQ).
```json
{
  "vendor": "Acme Fibre Supplies",
  "vendor_address": "123 rue Somewhere, Montréal, QC",
  "invoice_number": "INV-2026-0042",
  "date": "2026-04-18",
  "due_date": "2026-05-18",
  "subtotal": 1000.00,
  "tax_gst": 50.00,
  "tax_qst": 99.75,
  "total": 1149.75,
  "currency": "CAD",
  "items": [
    { "description": "OLT SFP+ module", "qty": 4, "rate": 250.00, "amount": 1000.00 }
  ],
  "notes": "Payment terms: net 30"
}
```
Post-processing: string-shaped numbers (e.g. `"1,234.56"`) are coerced to floats, both at the invoice level and per line item.

Used by: `apps/ops/src/pages/OcrPage.vue` (invoice intake) and a future supplier-bill wizard.
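The coercion step can be sketched as follows. A minimal sketch under this doc's description — the field list and helper names are illustrative, not the hub's actual code:

```javascript
// Coerce string-shaped money values ("1,149.75") to floats, at the
// invoice level and per line item. Names are illustrative.
const MONEY_FIELDS = ['subtotal', 'tax_gst', 'tax_qst', 'total'];

function toFloat(v) {
  if (typeof v === 'number') return v;
  if (typeof v === 'string') {
    const n = parseFloat(v.replace(/[^0-9.\-]/g, '')); // drop "$", ",", spaces
    return Number.isNaN(n) ? null : n;
  }
  return null;
}

function coerceInvoice(inv) {
  const out = { ...inv };
  for (const f of MONEY_FIELDS) if (f in out) out[f] = toFloat(out[f]);
  out.items = (out.items || []).map((it) => ({
    ...it,
    qty: toFloat(it.qty),
    rate: toFloat(it.rate),
    amount: toFloat(it.amount),
  }));
  return out;
}
```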
## 3. Frontend surface (`apps/ops/src/api/ocr.js`)
Thin wrapper over the hub. Same signatures for ops and field during the migration window (see `apps/field/src/api/ocr.js` — same file, different `HUB_URL` source).
| Function | Endpoint | Error behavior |
|---|---|---|
| `ocrBill(image)` | `/vision/invoice` | Throws on non-2xx — caller shows Notify |
| `scanBarcodes(image)` | `/vision/barcodes` | Throws on non-2xx — `useScanner` catches + queues |
| `scanEquipmentLabel(image)` | `/vision/equipment` | Throws on non-2xx |
| `checkOllamaStatus()` | `/health` | Returns `{online, models, hasVision}`. Name kept for back-compat. |
The `checkOllamaStatus` name is a leftover from the Ollama era — it now pings the hub's health endpoint and reports `models: ['gemini-2.5-flash']` so existing callers (status chips, diagnostics panels) keep working. It will be renamed to `checkVisionStatus` once no page references the old symbol.
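The "thin wrapper" pattern above amounts to: POST a base64 image as JSON, throw on non-2xx. A minimal sketch — `postVision`, the injectable `fetchImpl` parameter, and the error message are illustrative; the real file uses the global `fetch` and its own wording:

```javascript
// Sketch of the thin wrapper pattern. fetchImpl is injectable for tests;
// the real client just uses global fetch. HUB_URL source differs per app.
const HUB_URL = 'https://msg.gigafibre.ca';

async function postVision(path, image, fetchImpl = fetch) {
  const res = await fetchImpl(`${HUB_URL}${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ image }),
  });
  if (!res.ok) throw new Error(`vision ${path} failed: ${res.status}`);
  return res.json();
}

const ocrBill = (image, f) => postVision('/vision/invoice', image, f);
const scanBarcodes = (image, f) => postVision('/vision/barcodes', image, f);
```

Throwing on non-2xx (rather than returning an error object) is what lets `useScanner` decide centrally whether a failure is retryable.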
## 4. Scanner composable (`apps/ops/src/composables/useScanner.js`)
Wraps the API with camera capture and resilience. Two modes on one composable:
### Mode A — `processPhoto(file)` (barcodes, resilient)
- Resize the `File` twice:
  - 400px thumbnail for on-screen preview
  - 1600px @ q=0.92 for Gemini (text must stay readable)
- Race `scanBarcodes(aiImage)` against an 8s timeout (`SCAN_TIMEOUT_MS`).
- On timeout / network error, if the error is retryable (`ScanTimeout` | `Failed to fetch` | `NetworkError` | `TypeError`):
  - persist `{ id, image, ts, status: 'queued' }` to IndexedDB via `useOfflineStore.enqueueVisionScan`,
  - flag `photos[idx].queued = true` for the UI chip,
  - show "Réseau faible — scan en attente. Reprise automatique au retour du signal." ("Weak signal — scan pending. Will retry automatically when signal returns.")
- Otherwise, show the raw error.
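The retryable-error check above can be sketched as a small classifier. The matched strings follow this doc's list; the function name is illustrative:

```javascript
// Sketch of the retryable check: timeout and network-shaped errors queue,
// everything else surfaces raw. isRetryable is an illustrative name.
const RETRYABLE = ['ScanTimeout', 'Failed to fetch', 'NetworkError', 'TypeError'];

function isRetryable(err) {
  const text = `${err && err.name}: ${err && err.message}`;
  return RETRYABLE.some((marker) => text.includes(marker));
}
```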
On success, newly found codes are merged into `barcodes.value` (capped at `MAX_BARCODES = 5`, dedup by value), and the optional `onNewCode(code)` callback fires for each one.
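The merge step can be sketched with plain arrays standing in for the reactive ref. A sketch under this doc's cap/dedup rules — the signature is illustrative:

```javascript
// Dedup by value, cap at MAX_BARCODES, fire onNewCode only for codes not
// already in the list. barcodes.value is a plain array here for illustration.
const MAX_BARCODES = 5;

function mergeCodes(existing, found, onNewCode = () => {}) {
  const merged = [...existing];
  for (const code of found) {
    if (merged.length >= MAX_BARCODES) break;
    if (!merged.includes(code)) {
      merged.push(code);
      onNewCode(code); // e.g. ERPNext lookup + Notify
    }
  }
  return merged;
}
```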
### Mode B — `scanEquipmentLabel(file)` (structured, synchronous)
No timeout, no queue. Returns the full Gemini response. Auto-merges any `serial_number` + `barcodes[]` into the same `barcodes.value` list so a page using both modes shares one visible list. Used in desktop/wifi flows where callers want a sync answer to pre-fill a form.
### Late-delivered results
The composable runs a `watch(() => offline.scanResults.length)` so that when the offline store later completes a queued scan (tech walks out of the basement, signal returns), the codes appear in the UI as if they had come back synchronously. `onNewCode` fires for queued codes too, so lookup-and-notify side-effects happen regardless of path.

It also drains `offline.scanResults` once at mount, to catch the case where a scan completed while the page was unmounted (phone locked, app backgrounded, queue sync ran, user reopens ScanPage).
## 5. Offline store (`apps/ops/src/stores/offline.js`)

Pinia store, two queues, IndexedDB (`idb-keyval`):
### Mutation queue

`{ type: 'create'|'update', doctype, name?, data, ts, id }` — ERPNext mutations. Flushed when `window` emits `online`. Failed items stay queued across reconnects. Keyed under `offline-queue`.
### Vision queue

`{ id, image (base64), ts, status }` — photos whose Gemini call timed out or failed. Keyed under `vision-queue`.
Retries are time-driven, not event-driven. We don't trust `navigator.onLine` because it reports `true` on 2-bar LTE that can't actually reach msg.gigafibre.ca. First retry at 5s, back off to 30s on repeated failure. A reconnect (`online` event) also triggers an opportunistic immediate sync.

Successful scans land in `scanResults` (keyed `vision-results`) and the scanner composable consumes them via watcher + `consumeScanResult(id)` to avoid duplicates.
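The retry policy above reduces to a tiny delay function. A sketch using this doc's numbers (5s first retry, 30s after repeated failure) — the function and its failure-count convention are illustrative:

```javascript
// Time-driven retry delay: first retry fast, then a flat 30s backoff.
// failureCount = how many times this queued item has already failed.
const FIRST_RETRY_MS = 5_000;
const BACKOFF_MS = 30_000;

function nextRetryDelay(failureCount) {
  return failureCount <= 1 ? FIRST_RETRY_MS : BACKOFF_MS;
}
```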
### Generic cache

`cacheData(key, data)` / `getCached(key)` — plain read cache used by list pages for offline browsing. Keyed under `cache-{key}`.
## 6. Data flow example (tech scans an ONT in a basement)
```
[1] Tech taps "Scan" in /j/ScanPage (camera opens)
[2] Tech takes photo (File → input.change)
[3] useScanner.processPhoto(file)
    → resizeImage(file, 400)        (thumbnail shown immediately)
    → resizeImage(file, 1600, 0.92)
    → Promise.race([scanBarcodes(ai), timeout(8s)])

CASE A — signal ok:
[4a] Gemini responds in 2s → barcodes[] merged → onNewCode fires
     → ERPNext lookup → Notify "ONT lié au client Untel"

CASE B — weak signal / timeout:
[4b] 8s timeout fires → isRetryable('ScanTimeout') → true
     → offline.enqueueVisionScan({ image: aiImage })
     → photos[idx].queued = true   (chip "scan en attente")
     → tech keeps scanning next device
[5b] Tech walks out of basement — window.online fires
     → syncVisionQueue() retries the queued photo
     → Gemini responds → scanResults.push({id, barcodes, ts})
[6b] useScanner watcher on scanResults.length fires
     → mergeCodes(barcodes, 'queued') → onNewCode fires (late)
     → Notify arrives while tech is walking back to the truck
     → consumeScanResult(id)       (removed from persistent queue)
```
## 7. Changes from the previous (Ollama) pipeline
| Aspect | Before (Phase 2) | After (Phase 2.5) |
|---|---|---|
| Invoice OCR | Ollama llama3.2-vision:11b on the serving VM | Gemini 2.5 Flash via `/vision/invoice` |
| Barcode scan | Hub `/vision/barcodes` (already Gemini) | Unchanged |
| Equipment label | Hub `/vision/equipment` (already Gemini) | Unchanged |
| GPU requirement | Yes (11GB VRAM for vision model) | None — all inference remote |
| Offline resilience | Only barcode mode, only in apps/field | Now in apps/ops too (ready for /j) |
| Schema validation | Hand-parsed from prompt-constrained JSON | Gemini `responseSchema` enforces shape |
| Frontend import path | `'src/api/ocr'` (both apps) | Unchanged — same symbols |
## 8. Where to look next
- Hub implementation: `services/targo-hub/lib/vision.js`, `services/targo-hub/server.js` (routes: `/vision/barcodes`, `/vision/equipment`, `/vision/invoice`).
- Frontend API client: `apps/ops/src/api/ocr.js` (+ `apps/field/src/api/ocr.js` kept in sync during migration).
- Scanner composable: `apps/ops/src/composables/useScanner.js`.
- Offline store: `apps/ops/src/stores/offline.js`.
### 8.1 Secrets, keys and rotation
The only secret this pipeline needs is the Gemini API key. Everything else (models, base URL, hub public URL) is non-sensitive config.
| Variable | Where it's read | Default | Notes |
|---|---|---|---|
| `AI_API_KEY` | `services/targo-hub/lib/config.js:38` | (none — required) | Google AI Studio key for generativelanguage.googleapis.com. Server-side only, never reaches the browser bundle. |
| `AI_MODEL` | `config.js:39` | `gemini-2.5-flash` | Primary vision model. |
| `AI_FALLBACK_MODEL` | `config.js:40` | `gemini-2.5-flash-lite-preview` | Used by text-only calls (not vision) when the primary rate-limits. |
| `AI_BASE_URL` | `config.js:41` | `https://generativelanguage.googleapis.com/v1beta/openai/` | OpenAI-compatible endpoint used by agent code. Vision bypasses this and talks to the native `/v1beta/models/...:generateContent` URL. |
**Storage policy.** The repo is private and follows the same posture as the ERPNext service token already hardcoded in `apps/ops/infra/nginx.conf:15` and `apps/field/infra/nginx.conf:13`. The Gemini key can live in any of three places, in increasing order of "checked into git":

1. Prod VM env only (status quo): the key is in the `environment:` block of the `targo-hub` service in `/opt/targo-hub/docker-compose.yml` on 96.125.196.67. `config.js:38` reads it via `process.env.AI_API_KEY`. Rotation = edit that one line + `docker compose restart targo-hub`.
2. In-repo fallback in `config.js`: change line 38 to `AI_API_KEY: env('AI_API_KEY', 'AIzaSy...')` — the env var still wins when set, so prod doesn't break, but a fresh clone Just Works. Same pattern as nginx's ERPNext token.
3. Hardcoded constant (not recommended): replace `env(...)` entirely. Loses the ability to override per environment (dev, staging).
If/when option 2 is chosen, the literal value should also be recorded in `MEMORY.md` (`reference_google_ai.md`) so that file is the rotation source of truth — not scattered across the codebase.
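The `env()` fallback pattern from option 2 is just "env var wins, else in-repo default". A sketch of such a helper — the real `config.js` helper may differ, and the actual key value is deliberately elided here:

```javascript
// env-with-fallback: the environment variable wins when set and non-empty;
// otherwise the in-repo default applies. Illustrative, not config.js itself.
function env(name, fallback) {
  const v = process.env[name];
  return v !== undefined && v !== '' ? v : fallback;
}

// Usage shape (key value elided on purpose):
// AI_API_KEY: env('AI_API_KEY', '<in-repo fallback>')
```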
**Browser exposure.** Zero. The ops nginx config proxies `/hub/*` to targo-hub on an internal Docker network; the hub injects the key before calling Google. `apps/ops/src/api/ocr.js` just does `fetch('/hub/vision/barcodes', ...)` — no key in the bundle, no key in DevTools, no key in the browser's Network tab.
## 9. Related
- `../architecture/overview.md` — the full service map this lives in.
- `cpe-management.md` — how scanned serials flow into the TR-069/TR-369 device management plane.
- `../architecture/app-design.md` — frontend conventions (Vue 3 Composition API, feature folders).
- `../roadmap.md` — Phase 2.7 tracks the ongoing /j tech unification this pipeline depends on.
## 10. Data-model relationships triggered by a scan
A scan is never just "identify a barcode." Every successful lookup fans
out into the ERPNext graph: the scanned Service Equipment is the entry
point, and the tech page (/j/device/:serial) surfaces everything tied
to the same Customer and Service Location. This section documents that
graph, the exact fields read per entity, and the write rules.
### 10.1 Graph (Service Equipment is the anchor)
```
┌─────────────────────────┐
│ Service Equipment       │
│ EQP-#####               │◀───── scanned serial / MAC / barcode
│                         │       (3-tier lookup in TechScanPage)
│ • serial_number         │
│ • mac_address           │
│ • barcode               │
│ • equipment_type (ONT)  │
│ • brand / model         │
│ • firmware / hw_version │
│ • status                │
│                         │
│ FK → customer ──────────┼───┐
│ FK → service_location ──┼─┐ │
│ FK → olt / port         │ │ │ (ONT-specific, TR-069 bind)
└─────────────────────────┘ │ │
                            │ │
  ┌─────────────────────────┘ │
  │                           │
  ▼                           ▼
┌───────────────────┐   ┌───────────────────┐
│ Service Location  │   │ Customer          │
│ LOC-#####         │   │ CUST-#####        │
│ • address         │   │ • customer_name   │
│ • city            │   │ • stripe_id       │
│ • postal_code     │   │ • ppa_enabled     │
│ • connection_type │   │ • legacy_*_id     │
│ • olt_port        │   └────────┬──────────┘
│ • gps lat/lng     │            │
└───┬──────────┬────┘            │
    │          │                 │
 inbound    inbound           inbound
    │          │                 │
    ▼          ▼                 ▼
┌────────────┐ ┌──────────────┐ ┌──────────────┐
│ Issue      │ │ Dispatch Job │ │ Subscription │
│ TCK-#####  │ │ DJ-#####     │ │ SUB-#####    │
│            │ │              │ │              │
│ open       │ │ upcoming     │ │ active plan  │
│ tickets    │ │ installs /   │ │ billing      │
│            │ │ repairs      │ │ RADIUS creds │
└────────────┘ └──────────────┘ └──────────────┘
 FK: service_location            FK: party_type='Customer', party=<cust>
```
**Two FK axes, not one.** Tickets and Dispatch Jobs pivot on *where* the problem is (Service Location); Subscriptions pivot on *who* owns the account (Customer). A customer can have multiple locations (duplex, rental, commercial); the scan page shows the customer's freshest subscription even if the scanned device is at a secondary address.
### 10.2 Exact reads issued from TechDevicePage.vue
| Step | Call | Filter | Fields read | Purpose |
|---|---|---|---|---|
| 1 | `listDocs('Service Equipment')` | `serial_number = :serial` | `name` | Exact-serial lookup |
| 1 | `listDocs('Service Equipment')` | `barcode = :serial` | `name` | Fallback if serial missed |
| 2 | `getDoc('Service Equipment', name)` | — | full doc | Device card: brand/model/MAC/firmware/customer/service_location/olt_* |
| 3 | `getDoc('Service Location', loc)` | — | full doc | Address, GPS, connection_type, olt_port |
| 4 | `listDocs('Subscription')` | `party_type='Customer', party=<cust>, status='Active'` | `name, status, start_date, current_invoice_end` | Active plan chip |
| 5 | `listDocs('Issue')` | `service_location=<loc>, status ∈ {Open, In Progress, On Hold}` | `name, subject, status, priority, opening_date` | Open tickets list |
| 6 | `listDocs('Dispatch Job')` | `service_location=<loc>, status ∈ {Planned, Scheduled, En Route, In Progress}` | `name, subject, job_type, status, scheduled_date, technician` | Upcoming interventions |
All five fan-out queries run in parallel via `Promise.allSettled`, so a permission error on any single doctype (e.g. tech role can't read Subscription in some envs) doesn't block the page render — just that card is omitted.
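The fan-out pattern can be sketched as follows. The query functions are stand-ins for the real `listDocs`/`getDoc` calls, and `loadDeviceCards` is an illustrative name:

```javascript
// Run all lookups in parallel; render whatever succeeded, silently omit
// cards whose query rejected (e.g. a 403 on one doctype).
async function loadDeviceCards(queries) {
  const names = Object.keys(queries);
  const results = await Promise.allSettled(names.map((n) => queries[n]()));
  const cards = {};
  results.forEach((r, i) => {
    if (r.status === 'fulfilled') cards[names[i]] = r.value;
    // rejected → card simply omitted; the page still renders
  });
  return cards;
}
```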
### 10.3 Writes issued from TechScanPage.vue
The scan page writes to exactly one doctype — Service Equipment —
never to Customer, Location, Subscription, Issue, or Dispatch Job. All
relationship changes happen via FK updates on the equipment row:
| Trigger | Write | Why |
|---|---|---|
| Auto-link from job context | `updateDoc('Service Equipment', name, { customer, service_location })` | Tech opened Scan from a Dispatch Job (`?job=&customer=&location=`) and the scanned equipment has no location yet — this "claims" the device for the install. |
| Manual link dialog | `updateDoc('Service Equipment', name, { customer, service_location })` | Tech searched customer + picked one of the customer's locations. |
| Create new device | `createDoc('Service Equipment', data)` | 3-tier lookup came up empty — create a stub and tie it to the current job if available. |
| Customer re-link (from TechDevicePage) | `updateDoc('Service Equipment', name, { customer })` | Tech realized the device is at the wrong account; re-linking the customer auto-reloads the subscription card. |
Subscription / Issue / Dispatch Job are read-only in the scan flow. The tech app never creates a ticket from a scan — that's the job of the ops dispatcher in `DispatchPage.vue` + `ClientDetailPage.vue`. The scan page's contribution is to make the FK (`service_location` on the equipment) accurate so those downstream cards light up correctly when the dispatcher or the next tech opens the page.
### 10.4 Auto-link rule (the one piece of scan-time "business logic")
When `TechScanPage` is opened from a Dispatch Job (`goScan` on `TechJobDetailPage` propagates `?job=<name>&customer=<id>&location=<loc>`), each successful lookup runs:
```js
if (result.found && jobContext.customer && !result.equipment.service_location) {
  await updateDoc('Service Equipment', result.equipment.name, {
    customer: jobContext.customer,
    service_location: jobContext.location, // only if the job has one
  })
}
```
Why gated on "no existing service_location": a device that's already tied to address A should never silently move to address B just because a tech scanned it on a job ticket. If the location is already set, the tech has to use the "Re-link" action in TechDevicePage, which is explicit and logged. This prevents swap-out scenarios (tech brings a tested spare ONT from another install and scans it to confirm serial) from corrupting address ownership.
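The gate condition can be extracted as a pure predicate, which makes the "never silently move a placed device" rule trivially unit-testable. A sketch — `shouldAutoLink` is an illustrative name; the `result`/`jobContext` shapes follow the snippet above:

```javascript
// Pure form of the auto-link gate: link only when the lookup found a
// device, the job carries a customer, and the device is still unplaced.
function shouldAutoLink(result, jobContext) {
  return Boolean(
    result.found &&
    jobContext.customer &&
    !result.equipment.service_location // never silently move a placed device
  );
}
```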
### 10.5 Why this matters for offline mode
The offline store (`stores/offline.js`) queues `updateDoc` calls under the mutation queue, not the vision queue. That means:
- Scan photo → offline → `vision-queue` → retries against Gemini when signal returns.
- Auto-link / create-equipment → offline → `offline-queue` → retries against ERPNext when signal returns.
Because both queues drain time-driven, a tech who scans 6 ONTs in a no-signal basement comes back to the truck and the phone silently:
1. Sends the 6 photos to Gemini (vision queue)
2. Receives the 6 barcode lists
3. Fans each one through `lookupInERPNext` (the scan page watcher)
4. For found + unlinked devices, enqueues 6 `updateDoc` calls
5. Drains the mutation queue → all 6 devices now carry `customer` + `service_location` FKs
6. Next time the dispatcher opens the Dispatch Job, all 6 equipment rows appear in the equipment list (via reverse FK query from the job page)
The FK write on Service Equipment is what "connects" the scan to every downstream card (ticket list, subscription chip, dispatch job list). Everything else is a read on those FKs.