gigafibre-fsm/docs/SETUP.md
louispaulb a6974e2443 chore(ops): install frappe_pg + version-control the post-install patch
Done what the docs suggested in c31a9e0 — actually installed
excel-azmin/frappe_pg on prod erp.gigafibre.ca (1.0.0, master pinned
at commit a237f5995b). Hit one compat bug along the way and fixed it.

The bug

  frappe_pg monkey-patches PostgresDatabase.commit() and .rollback()
  with wrappers that have the OLD `(self)` signature. Frappe 16.12+
  now calls `db.rollback(chain=True)` from app.py:sync_database(),
  which makes every HTTP request crash with:

    TypeError: patched_rollback() got an unexpected keyword argument 'chain'

  Symptom: HTTP 500 on /api/method/ping, Sales Invoice list, etc.
  Custom Server Scripts that don't return through sync_database
  (like our customer_balance) still worked, which is why the bug
  only surfaced after the post-install restart.

The fix

  Two-part: signatures take `*args, **kwargs`, and the wrapped call
  forwards them to the original. Idempotent sed.

    -def patched_rollback(self):
    +def patched_rollback(self, *args, **kwargs):
    -    return _original_rollback(self)
    +    return _original_rollback(self, *args, **kwargs)

  Both files in frappe_pg need it: postgres/database_patches.py and
  patches/postgres_fix.py. Same fix for patched_commit.

Saved as patches/fix_frappe_pg_signatures.sh so we can re-apply after
every `bench update` or fresh install. The comment block in the
script documents why it exists and links the upstream issue (TODO:
file PR at excel-azmin/frappe_pg). docs/SETUP.md §7 was updated to
mention the post-install patch step, the nginx-IP-cache gotcha that
will produce a confusing 502 if you only restart the backend, and
the correct repo (excel-azmin, not the-commit-company that I had
hallucinated in the previous commit).

Smoke test results post-install:
  ping, Customer list, Sales Invoice list, Service Location list,
  customer_balance Server Script, ops SPA, hub /health — all 200.
2026-05-21 15:15:31 -04:00


# Dev setup — gigafibre-fsm
Quick reference for getting the stack running on a new machine.
## 1. Clone
```bash
git clone https://git.targo.ca/louis/gigafibre-fsm.git
cd gigafibre-fsm
```
## 2. Env files
The actual `.env` files are gitignored (they hold secrets). Each component
ships a `.env.example` with placeholder values + comments. Copy and fill in:
```bash
cp apps/ops/.env.example apps/ops/.env
cp services/targo-hub/.env.example services/targo-hub/.env
```
Ask the team for the real values (or copy from `/opt/<service>/.env` on the
prod box if you have access). The hub `.env` is the long one — most fields
correspond to one external integration (Stripe, Twilio, Authentik, etc.).
Anything left blank disables that feature gracefully.
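
To see which integrations a freshly copied `.env` would leave disabled, one option is to list the keys that are still blank. The helper below is a hypothetical sketch, not part of the repo. It assumes plain `KEY=value` lines with `#` comments, and the key names in the sample are invented for illustration.

```python
def blank_env_keys(env_text: str) -> list[str]:
    """Return the keys whose value is empty in KEY=value formatted text."""
    blank = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip empty lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat empty and empty-quoted values ('' or "") as blank.
        if value.strip().strip("'\"") == "":
            blank.append(key.strip())
    return blank


# Hypothetical key names, for illustration only.
sample = "STRIPE_KEY=\n# a comment\nTWILIO_SID=abc\n"
print(blank_env_keys(sample))  # ['STRIPE_KEY']
```

Running it against the text of `services/targo-hub/.env` gives a quick view of which features that configuration would switch off.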
## 3. Run the apps
### `apps/ops` (Vue 3 + Quasar SPA)
```bash
cd apps/ops
npm install
npx quasar dev # dev server at http://localhost:9000
npx quasar build # production bundle in dist/spa/
```
Notes:
- The SPA expects to find ERPNext at the same origin in production
(`erp.gigafibre.ca/ops/` is served from `/opt/ops-app/` via the
ERPNext nginx). In dev, set `VITE_HUB_URL` to the local hub or the
prod hub for backend calls.
- Authentik SSO redirects only work behind a real domain — dev mode
uses the API token (`VITE_ERP_TOKEN`) for direct ERPNext calls.
### `services/targo-hub` (Node 20+)
```bash
cd services/targo-hub
npm install --omit=dev
node server.js # listens on :3300
```
In production this runs in a Docker container under `/opt/targo-hub/` with
the host's `.env` file mounted.
### Other services
The `services/` and `apps/` directories also contain Docker compose stacks
that run on the prod server (ERPNext, Authentik, Traccar proxy, Fonoster,
DocuSeal, …). Reproducing them locally is rarely needed — the hub talks
to ERPNext + Authentik over the network and that's enough for most
front-end work.
## 4. Common tasks
| Task | Command |
| --- | --- |
| Build + deploy ops SPA to prod | `cd apps/ops && npx quasar build && scp -r dist/spa/* root@96.125.196.67:/opt/ops-app/` |
| Push hub code change | `scp services/targo-hub/lib/<file>.js root@96.125.196.67:/opt/targo-hub/lib/` then `ssh root@... 'docker restart targo-hub'` |
| Tail prod logs | `ssh root@96.125.196.67 'docker logs -f targo-hub --tail 50'` |
| Re-build after changing daemon.json or compose | `docker compose up -d --force-recreate` from the relevant `/opt/<service>/` |
## 5. Where things live
```
apps/ops/ Quasar SPA — main internal tool (dispatch, clients, …)
apps/ops/src/pages/ Top-level pages (DispatchPage, ClientDetailPage, …)
apps/ops/src/composables/ Shared logic (useMap, useResourceFilter, …)
apps/ops/src/components/shared/detail-sections/ Per-doctype detail panels
services/targo-hub/ Node middleware between SPA / ERPNext / 3rd parties
services/targo-hub/lib/ One module per integration (auth, dispatch, ai, …)
services/targo-hub/server.js Top-level HTTP router
docs/ This file + future runbooks
```
## 6. Auth quirks (fyi)
- **Authentik staff instance** = `auth.targo.ca` (admin token in
`AUTHENTIK_TOKEN`). ERPNext uses it as an OAuth provider.
- **Authentik client instance** = `id.gigafibre.ca` (separate stack,
for customer portal — uses `/opt/authentik-client/`).
- Inviting a user via ops Settings → Utilisateurs hits
`POST /auth/users` on the hub, which (a) creates the Authentik user,
(b) sets a temp password, (c) emails it via Mailjet, (d) creates the
matching ERPNext System User.
- The Authentik recovery email flow isn't configured (no `flow_recovery`
on the brand) — the hub sends the credentials itself instead.
## 7. ERPNext on PostgreSQL — known incompatibilities
ERPNext was built for MariaDB. We run on PostgreSQL because the legacy
migration data was easier to handle there. Frappe & ERPNext generate
SQL that's lenient under MariaDB but strict under Postgres — symptoms
on the UI are "column X does not exist" errors or empty/blank reports
on certain accounting screens (Bank Clearance, Payment Reconciliation,
Gross Profit, etc.).
**Strongly suggested**: install the
[`frappe_pg`](https://github.com/excel-azmin/frappe_pg)
community app, which bundles a comprehensive set of PostgreSQL
compatibility patches as a Frappe app. It's the cleaner alternative
to maintaining our own per-file patches in `patches/fix_pg_groupby.py`,
which we have to re-apply after every ERPNext upgrade.
```bash
# On the prod box (correct repo is excel-azmin, not the-commit-company):
docker exec erpnext-backend-1 bench get-app https://github.com/excel-azmin/frappe_pg --branch master
docker exec erpnext-backend-1 bench --site erp.gigafibre.ca install-app frappe_pg
# REQUIRED before restart: patch the rollback/commit signatures so they
# accept Frappe 16.12+'s chain=True kwarg. Without this every HTTP
# request crashes with TypeError. The script is idempotent.
bash patches/fix_frappe_pg_signatures.sh erpnext-backend-1
docker restart erpnext-backend-1 erpnext-frontend-1
```
Notes from the actual install (2026-05-21):
- The repo is `excel-azmin/frappe_pg` (11 stars, 3 commits; small).
Pin a commit in your install script rather than tracking `master`
(prod runs 1.0.0 at commit `a237f5995b`).
- Their `fix_erpnext_trends.py` has a non-UTF-8 byte on line 39 that
Frappe defers automatically with "Will apply trends patch later" —
not fatal but worth knowing.
- After installing, **always** run `patches/fix_frappe_pg_signatures.sh`
to fix the `patched_rollback(self)` / `patched_commit(self)`
signatures to accept `*args, **kwargs`. Frappe 16.12+ calls
`db.rollback(chain=True)` from `app.py:sync_database()` and the
unpatched frappe_pg crashes every request with `TypeError`.
- The 502 you'll see right after `docker restart erpnext-backend-1`
is nginx caching the old container IP. Restart the frontend too:
`docker restart erpnext-frontend-1`.
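
The signature mismatch described in the notes above can be reproduced in isolation. This is a minimal sketch, not frappe_pg's actual code: `Database` is a simplified stand-in for Frappe's `PostgresDatabase`, and `patched_rollback_old` is a hypothetical name for the unpatched wrapper.

```python
class Database:
    """Simplified stand-in for Frappe's PostgresDatabase (assumption)."""

    def rollback(self, *, chain=False):
        return f"rollback(chain={chain})"


_original_rollback = Database.rollback


def patched_rollback_old(self):
    # Old-style wrapper: fixed (self) signature, so extra kwargs are rejected.
    return _original_rollback(self)


def patched_rollback(self, *args, **kwargs):
    # Fixed wrapper: accepts and forwards whatever the caller passes.
    return _original_rollback(self, *args, **kwargs)


db = Database()

# Monkey-patch with the old wrapper: the chain=True call fails.
Database.rollback = patched_rollback_old
try:
    db.rollback(chain=True)
except TypeError as exc:
    print(f"old wrapper: {exc}")

# Monkey-patch with the forwarding wrapper: the call goes through.
Database.rollback = patched_rollback
print(f"new wrapper: {db.rollback(chain=True)}")
```

Forwarding `*args, **kwargs` also keeps the wrapper working if the wrapped method gains further parameters later.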

Validate on staging before touching prod. The four known-broken
accounting UIs (Bank Clearance, Bank Reconciliation, Payment
Reconciliation, Gross Profit) are the regression targets.
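
A basic reachability check can run before the manual pass over those screens. The sketch below is hypothetical and only confirms that a few endpoints answer with HTTP 200. The list paths assume Frappe's `/api/resource/<DocType>` REST convention, the staging URL is a placeholder, and the accounting UIs themselves still need checking by hand.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional

# Assumed paths: a ping method plus two list endpoints in Frappe's REST style.
SMOKE_PATHS = [
    "/api/method/ping",
    "/api/resource/Customer",
    "/api/resource/Sales%20Invoice",
]


def http_status(url: str, token: Optional[str] = None) -> int:
    """Return the HTTP status of a GET; 0 if the host is unreachable."""
    req = urllib.request.Request(url)
    if token:
        # Frappe token auth uses "token <api_key>:<api_secret>".
        req.add_header("Authorization", f"token {token}")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except urllib.error.URLError:
        return 0


def failing_paths(base_url: str, fetch: Callable[[str], int] = http_status) -> list[str]:
    """Return the smoke-test paths that did not answer with HTTP 200."""
    base = base_url.rstrip("/")
    return [path for path in SMOKE_PATHS if fetch(base + path) != 200]
```

A call such as `failing_paths("https://staging.example.invalid")` returns an empty list when everything answers. The list endpoints require authentication, so pass a `fetch` that supplies a token, for example `lambda u: http_status(u, token="key:secret")`.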
Our own custom Server Scripts with raw SQL (e.g. `customer_balance`)
need the same vigilance regardless: PostgreSQL treats `"Customer"` as
a column identifier; use `'Customer'` (single quotes) for string
literals. Add a `bench export-fixtures` step to version-control any
Server Script we tweak so the fix isn't lost on re-deployment. See
`docs/architecture/overview.md` §6 item 8 for the full background.
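
One way to catch the quoting mistake before a script reaches prod is a small lint over the raw SQL. The sketch below is a heuristic, not a SQL parser: it flags a double-quoted token that directly follows a comparison operator, and it can misfire on legitimate comparisons between two quoted column names. The query and the `party_type` column are illustrative assumptions.

```python
import re

# Heuristic only: a double-quoted token right after a comparison operator is
# usually meant as a string literal; PostgreSQL parses it as an identifier.
_SUSPECT = re.compile(r'(?:=|<>|!=|\bLIKE\b)\s*"([^"]*)"', re.IGNORECASE)


def double_quoted_literals(sql: str) -> list[str]:
    """Return double-quoted tokens that look like intended string literals."""
    return _SUSPECT.findall(sql)


# Hypothetical query: the table name is quoted correctly, the value is not.
query = 'SELECT name FROM "tabSales Invoice" WHERE party_type = "Customer"'
print(double_quoted_literals(query))  # ['Customer']
```

The quoted table name is left alone because it does not follow an operator; only the value that should have been `'Customer'` is reported.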