Endpoints

The backend exposes exactly two business endpoints, POST /character/processing and POST /character/chat, plus two operational endpoints, GET /health and GET /metrics. The HTTP surface is intentionally small; anything more complex is composed from these.

Summary

| Method | Path | Purpose | Response |
|--------|------|---------|----------|
| GET | /health | Liveness probe | 200 text/plain "ok" |
| GET | /metrics | Prometheus scrape | Prometheus text format |
| POST | /character/processing | Submit a book for ingestion | 202 with job accepted, or 400/422/500 |
| POST | /character/chat | Stream a chat reply | SSE stream of token, done, or error events |

POST /character/processing

Submit a book for ingestion. The request is validated synchronously and a job is enqueued; the response returns 202 immediately.

Request body

{
  "book_id": "uuid",
  "file_url": "https://<supabase>/storage/...",
  "user_id": "uuid"
}

Behavior

  1. Confirm the book exists in Supabase.
  2. Download the file via httpx with a 120-second timeout.
  3. Reject anything that is not .txt or .md (returns 422).
  4. Push a job dict to the Redis list processing_queue.
  5. Return 202 with {"status": "accepted", "job_id": "..."}.
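
The extension check and enqueue steps above can be sketched roughly as follows. This is a minimal illustration, not the service's actual handler: the function names, the job dict fields beyond the request body, the 422 error body, the case-insensitive extension match, and the use of rpush (rather than lpush) are all assumptions. The Supabase lookup and the httpx download are left out.

```python
import json
import uuid
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".txt", ".md"}
QUEUE_NAME = "processing_queue"  # Redis list named in the steps above


def has_allowed_extension(file_url: str) -> bool:
    """Return True if the URL path ends in .txt or .md (case-insensitive)."""
    path = urlparse(file_url).path  # drops any query string or fragment
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS


def build_job(book_id: str, file_url: str, user_id: str) -> dict:
    """Assemble the job dict; the exact field names are assumptions."""
    return {
        "job_id": str(uuid.uuid4()),
        "book_id": book_id,
        "file_url": file_url,
        "user_id": user_id,
    }


def submit(queue, book_id: str, file_url: str, user_id: str) -> tuple[int, dict]:
    """Validate the extension, enqueue the job, and return (status, body)."""
    if not has_allowed_extension(file_url):
        # Hypothetical error body; the source only specifies the status code.
        return 422, {"error": "only .txt and .md files are supported"}
    job = build_job(book_id, file_url, user_id)
    # `queue` is any object with a Redis-style rpush(name, value) method.
    # Whether the service pushes with rpush or lpush is not stated.
    queue.rpush(QUEUE_NAME, json.dumps(job))
    return 202, {"status": "accepted", "job_id": job["job_id"]}
```

Passing the queue in as an argument keeps the sketch testable without a running Redis instance; a redis-py client could be supplied in its place.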

Error responses

  • 400 (malformed body)
  • 422 (invalid file extension or missing book)
  • 500 (internal failure, e.g., Supabase unreachable)

The actual BookNLP extraction, prompt generation, and indexing happen asynchronously in the JobManager. The frontend observes completion through the Supabase books.processing_status field.

POST /character/chat

Stream a chat reply for one message. Returns an SSE stream.

Request body

{
  "user_id": "uuid",
  "character_id": "uuid",
  "book_id": "uuid",
  "session_id": "uuid",
  "user_message": "What would you do if you were free?"
}

Behavior

The endpoint validates the session (it must exist in Supabase, and the user_id must match), acquires a per-session write lock in Redis, and then begins streaming. See Chat Flow for the full sequence.
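
One common way to build a per-session lock like the one described above is Redis SET with the NX and EX options. The sketch below assumes that approach; the source does not say how the lock is implemented. The key format, TTL, and function names are hypothetical, and InMemoryStore is only a stand-in that mimics the return values of redis-py's set, get, and delete so the example runs without a Redis server.

```python
import uuid


class InMemoryStore:
    """Tiny stand-in for a Redis client; supports only what the sketch needs."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        # Mirrors redis-py: with nx=True, returns None when the key exists.
        # The ex (expiry) argument is accepted but not enforced here.
        if nx and key in self._data:
            return None
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def acquire_session_lock(store, session_id, ttl_seconds=60):
    """Try to take the write lock; return an owner token, or None if held."""
    token = str(uuid.uuid4())
    key = f"chat_lock:{session_id}"  # hypothetical key format
    if store.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_session_lock(store, session_id, token):
    """Release the lock only if this request still owns it."""
    key = f"chat_lock:{session_id}"
    # Check-then-delete is not atomic; against real Redis this is
    # usually done in a single Lua script.
    if store.get(key) == token:
        store.delete(key)
        return True
    return False
```

A second request for the same session gets None from acquire_session_lock, which corresponds to the 429 error event. With a real redis-py client, get returns bytes unless the client is created with decode_responses=True, so the token comparison would need to account for that.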

Success events

  • data: {"token":"..."}, one per LLM token
  • data: {"done":true,"full_response":"..."}, the final event

SSE error events

  • data: {"error":"...","code":429} (another request is already streaming for this session)
  • data: {"error":"...","code":502} (LLM call failed mid-stream)

Session-not-found and user-mismatch are returned as HTTP 404 before the SSE stream starts. Once the stream begins, errors are emitted as SSE events.
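
To make the event shapes above concrete, here is a small sketch of how a frame might be serialized and how a client could fold the stream into a final reply. The function names are hypothetical, and the fallback to concatenated tokens when full_response is absent is an assumption, not documented behavior.

```python
import json


def format_event(payload: dict) -> str:
    """Serialize one payload as an SSE frame: a data line plus a blank line."""
    return f"data: {json.dumps(payload, separators=(',', ':'))}\n\n"


def consume(lines):
    """Fold a sequence of SSE lines into (text, error).

    Returns the assembled reply and None on success, or the partial
    text and the error payload if an error event arrives.
    """
    tokens = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        event = json.loads(line[len("data:"):].strip())
        if "error" in event:
            return "".join(tokens), event
        if event.get("done"):
            # Prefer the server's full_response; fall back to the tokens seen.
            return event.get("full_response", "".join(tokens)), None
        if "token" in event:
            tokens.append(event["token"])
    return "".join(tokens), None
```

Because a 502 can arrive after some tokens have already been streamed, the sketch returns the partial text alongside the error so the caller can decide whether to show or discard it.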

GET /health

Liveness probe. Returns 200 text/plain with the body "ok". It takes no request body, sets no caching headers, and checks no dependencies. Used by Traefik and Docker health checks.

GET /metrics

Prometheus scrape endpoint. Returns metrics in the Prometheus text format. No caching headers. The full list of metrics is in Observability.

Why So Few Endpoints

The backend is a focused service. The frontend owns auth, library, payments, comments, notifications, and the admin panel. Pushing more endpoints into the backend would mean duplicating the auth and ownership logic that already lives in the frontend. Keeping the surface to two business endpoints keeps the trust boundary narrow: the backend trusts the user_id and book_id from the frontend, and that is enough.
