Robi: production RAG assistant
Retrieval-augmented chatbot with hybrid search, guardrails, eval, and live monitoring.
Deployment
Robi runs as a small production service rather than a local demo. The deployment is designed to be boring: containerized services, HTTPS by default, health checks, and repeatable deploys.
Runtime
The backend runs on a VPS with Docker Compose. FastAPI serves the API, Postgres stores the retrieval corpus, Redis handles cache and rate limits, Prometheus scrapes metrics, and Grafana renders the dashboard.
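As an illustration of the rate-limit role Redis plays here, the following is a minimal sketch of a fixed-window limiter. It is not Robi's actual implementation: the class name, the limit, and the window length are hypothetical, and an in-memory dict stands in for Redis so the example runs on its own. In Redis, the same idea is usually expressed as an INCR on a per-client, per-window key that is given an expiry.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Fixed-window rate limiter (hypothetical sketch, not Robi's code).

    The dict below stands in for Redis counters. Old windows are never
    purged here; in Redis they would disappear when the key expires.
    """

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.counts = defaultdict(int)  # (client_id, window index) -> count

    def allow(self, client_id: str) -> bool:
        # Bucket requests by client and by the index of the current window.
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        self.counts[key] += 1
        # Allow the request only while the count is within the limit.
        return self.counts[key] <= self.limit
```

With `FixedWindowLimiter(limit=2, window_seconds=60)`, a third call to `allow` for the same client inside one minute returns False, and the count starts over when the next window begins.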
HTTPS and routing
Caddy sits in front of the service and handles automatic HTTPS. Public traffic goes through chat.robertjeanpierre.com, while the portfolio frontend talks to the /ask endpoint from the About page widget.
CI/CD
GitHub Actions runs checks and deploys on push to main. The deploy path rebuilds the service and restarts the stack so code, corpus updates, monitoring config, and API changes move together.
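The rebuild-and-restart step could look roughly like the sketch below. The actual workflow is not shown on this page, so the step list and the `deploy` helper are assumptions made for illustration; only `git pull --ff-only` and `docker compose up -d --build` are standard commands. The function defaults to a dry run so it can be inspected without touching a server.

```python
import subprocess

# Hypothetical deploy steps; the real workflow may differ.
DEPLOY_STEPS = (
    ("git", "pull", "--ff-only"),                  # fetch the pushed commit
    ("docker", "compose", "up", "-d", "--build"),  # rebuild images, restart stack
)


def deploy(steps=DEPLOY_STEPS, dry_run=True, run=subprocess.run):
    """Run each step in order and return the commands as strings.

    With dry_run=True nothing is executed. With dry_run=False each
    command runs with check=True, so a failing step raises and stops
    the deploy before later steps run.
    """
    executed = []
    for cmd in steps:
        executed.append(" ".join(cmd))
        if not dry_run:
            run(list(cmd), check=True)
    return executed
```

Rebuilding and restarting the whole stack in one step is what lets code, corpus, and monitoring config change together, at the cost of a brief restart on every deploy.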
Health checks
The public health endpoint confirms that the backend is reachable through the reverse proxy, which gives a simple operational signal that the service is up.
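A check like this can be probed from outside with a few lines of standard-library Python. This is a sketch under assumptions: the `is_healthy` helper is hypothetical, and the endpoint's path and response body are not stated here, so the caller supplies the full URL and only the status code is inspected.

```python
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout.

    Hypothetical helper: the real endpoint path and payload are not
    documented on this page, so only the HTTP status is checked.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False
```

Because the probe goes through the public URL, a True result says that DNS, Caddy, and the backend are all answering, not just the FastAPI process.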
Why not serverless
Robi uses Postgres with pgvector, Redis, Prometheus, Grafana, and model-side retrieval components. A VPS keeps those pieces close together and makes the full stack observable. For this project, operational control mattered more than serverless convenience.