diff --git a/README.md b/README.md
new file mode 100644
index 0000000..9eff2ec
--- /dev/null
+++ b/README.md
@@ -0,0 +1,88 @@
+# Gigafibre Infrastructure
+
+## Architecture
+
+```
+Internet -> Traefik (80/443) -> Docker containers
+ |-- oss.gigafibre.ca -> Oktopus (CPE management TR-069/TR-369)
+ |-- git.gigafibre.ca -> Gitea
+ |-- dispatch.gigafibre.ca -> Dispatch App
+ |-- hub.gigafibre.ca -> Traefik Hub (management UI)
+ |-- traefik.gigafibre.ca -> Traefik Dashboard
+```
+
+## Quick Setup
+
+```bash
+git clone https://git.targo.ca/louis/gigafibre-infra.git /opt/infra
+cd /opt/infra && bash setup.sh
+```
+
+## Services
+
+| Service | Compose | Domain | Notes |
+|---------|---------|--------|-------|
+| Traefik v2.11 | traefik/ | traefik.gigafibre.ca | Reverse proxy + SSL |
+| Traefik Hub | traefik-hub/ | hub.gigafibre.ca | Custom mgmt UI (DNS + routes + containers) |
+| Oktopus CE | oktopus/ | oss.gigafibre.ca | ACS port 9292, MQTT port 1883 |
+| Gitea | apps/ | git.gigafibre.ca | Container registry enabled |
+| Dispatch | apps/ | dispatch.gigafibre.ca | Field technician scheduling |
+| PostgreSQL x2 | apps/ | internal | targo-db + gitea-db |
+
+## Network
+
+- ens18: LAN 10.100.5.61/24 (VLAN 4000 native)
+- ens19: WAN 96.125.196.67/26 (VLAN 4001)
+- Default route: WAN metric 100, LAN metric 200
+
+## Security
+
+- UFW firewall: 22, 80, 443, 1883, 9292 only
+- Fail2ban: SSH (3 attempts = 1h ban, systemd backend)
+- Traefik Hub: session auth with HttpOnly cookies
+- SSH: key-based auth (ed25519)
+
+## DNS
+
+Managed via OpenSRS API (user: targo, domain: gigafibre.ca).
+Can be managed from Traefik Hub UI (DNS Records page).
+
+## SSL
+
+Let's Encrypt via Traefik HTTP-01 challenge.
+
+**Do NOT enable global HTTP-to-HTTPS redirect** - it breaks the Let's Encrypt challenge.
+Use per-router redirect middleware after certs are issued instead.
+
+## Known Issues and Gotchas
+
+1. **Traefik v3 vs Docker 29** - Traefik v3.x client uses API v1.24, Docker 29 minimum is v1.40. Stay on v2.11 until fixed.
+2. **HTTP redirect breaks SSL** - Global entrypoint redirect prevents Let's Encrypt HTTP-01 validation.
+3. **MongoDB needs AVX** - MongoDB 5+ requires AVX CPU. Proxmox CPU type must be "host", not emulated.
+4. **netplan overrides** - Debian cloud images have netplan generating runtime configs that override static ones. Fix: remove netplan.io package.
+5. **Multi-network routing** - Containers on multiple Docker networks need label `traefik.docker.network=proxy` so Traefik picks the correct IP.
+6. **NATS JetStream** - Oktopus controller requires `--jetstream` flag on NATS.
+7. **MQTT adapter config** - Needs explicit `MQTT_URL=tcp://mqtt:1883` environment variable.
+8. **Adapter service** - Oktopus needs the `adapter` service (not just mqtt-adapter) for device queries via NATS.
+9. **Frontend API URL** - Oktopus frontend `NEXT_PUBLIC_REST_ENDPOINT` must be empty so browser requests go through Traefik, not internal Docker DNS.
+10. **Cloudflare tunnels** - Quick tunnels change URL on restart. Use systemd service for persistence, or better: direct SSH via public IP.
+
+## Container Registry
+
+Push custom images to git.targo.ca:
+
+```bash
+docker login git.targo.ca -u louis
+docker tag myimage:latest git.targo.ca/louis/myimage:latest
+docker push git.targo.ca/louis/myimage:latest
+```
+
+## Rebuild from scratch
+
+```bash
+# On a fresh Debian 12 VM (Proxmox CPU type: host)
+apt-get update && apt-get install -y git
+git clone https://git.targo.ca/louis/gigafibre-infra.git /opt/infra
+cd /opt/infra && bash setup.sh
+# Then configure DNS via hub.gigafibre.ca
+```