Endpoints

The backend exposes exactly two business endpoints, POST /character/processing and POST /character/chat, plus GET /health and GET /metrics for operations.

The backend is intentionally small at the HTTP surface. Anything more complex is composed from these endpoints.

Summary

Method	Path	Purpose	Response
GET	`/health`	Liveness probe	`200 text/plain "ok"`
GET	`/metrics`	Prometheus scrape	Prometheus text format
POST	`/character/processing`	Submit a book for ingestion	`202` with `{"status": "accepted", "job_id": "..."}`, or `400`/`422`/`500`
POST	`/character/chat`	Stream a chat reply	SSE stream of `token`, `done`, or `error` events

POST /character/processing

Submit a book for ingestion. The request is validated synchronously and a job is enqueued; the response returns 202 immediately.

Request body

{
  "book_id": "uuid",
  "file_url": "https://<supabase>/storage/...",
  "user_id": "uuid"
}

Behavior

Confirm the book exists in Supabase.
Download the file via httpx with a 120-second timeout.
Reject anything that is not .txt or .md (returns 422).
Push a job dict to the Redis list processing_queue.
Return 202 with {"status": "accepted", "job_id": "..."}.
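
The validation and enqueue steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the backend's actual handler: the helper names (`has_allowed_extension`, `build_job`, `submit`), the exact job dict fields, the 422 body shape, the case-insensitive extension check on the URL path, and the plain Python list standing in for the Redis list are all hypothetical. The Supabase lookup and the httpx download are omitted.

```python
from __future__ import annotations

import json
import uuid
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".txt", ".md"}  # the two accepted file types
QUEUE_NAME = "processing_queue"       # Redis list name given in the docs


def has_allowed_extension(file_url: str) -> bool:
    """Return True if the URL path ends in .txt or .md (query string ignored).

    Assumption: the check is case-insensitive and runs on the URL path.
    """
    path = urlparse(file_url).path
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS


def build_job(book_id: str, file_url: str, user_id: str) -> dict:
    """Build a hypothetical job dict; the real field set is not documented here."""
    return {
        "job_id": str(uuid.uuid4()),
        "book_id": book_id,
        "file_url": file_url,
        "user_id": user_id,
    }


def submit(queue: list, book_id: str, file_url: str, user_id: str) -> tuple[int, dict]:
    """Return (status_code, body) the way the endpoint is described to respond."""
    if not has_allowed_extension(file_url):
        # Hypothetical error body; only the 422 status comes from the docs.
        return 422, {"detail": "only .txt and .md files are accepted"}
    job = build_job(book_id, file_url, user_id)
    # The list stands in for Redis; a real client would push onto QUEUE_NAME.
    queue.append(json.dumps(job))
    return 202, {"status": "accepted", "job_id": job["job_id"]}
```

Keeping the extension check and job construction as pure functions makes the 422 path easy to test without Supabase or Redis running.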

Error responses

400 (malformed body)
422 (invalid file extension or missing book)
500 (internal failure, e.g., Supabase unreachable)

The actual BookNLP extraction, prompt generation, and indexing happen asynchronously in the JobManager. The frontend observes completion through the Supabase books.processing_status field.

POST /character/chat

Stream a chat reply as Server-Sent Events. The endpoint validates the session (it must exist in Supabase, and its user_id must match the request), acquires a per-session write lock in Redis, and then begins streaming. See Chat Flow for the full sequence.
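
A per-session lock of this kind is commonly built on Redis `SET` with the `NX` and `EX` options. The sketch below shows that general pattern only; the key format, the TTL, and the function names are assumptions, and the page does not say how the backend actually implements its lock. `InMemoryClient` is a small stand-in that mimics the redis-py `set`/`get`/`delete` calls used here so the example runs without a server.

```python
import uuid

LOCK_TTL_SECONDS = 120  # hypothetical; the real TTL is not stated on this page


def lock_key(session_id: str) -> str:
    # Hypothetical key format, chosen for illustration.
    return f"chat_lock:{session_id}"


def acquire_session_lock(client, session_id: str, ttl: int = LOCK_TTL_SECONDS):
    """Try to take the lock; return an owner token on success, None if busy.

    `client` needs a redis-py style `set(name, value, nx=..., ex=...)`,
    which returns a truthy value only when the key did not already exist.
    """
    token = uuid.uuid4().hex
    if client.set(lock_key(session_id), token, nx=True, ex=ttl):
        return token
    return None


def release_session_lock(client, session_id: str, token: str) -> bool:
    """Release the lock only if this caller still owns it.

    Note: get-then-delete is not atomic; a real deployment would usually
    do the comparison and delete in a single server-side script.
    """
    key = lock_key(session_id)
    current = client.get(key)
    if isinstance(current, bytes):
        current = current.decode()
    if current == token:
        client.delete(key)
        return True
    return False


class InMemoryClient:
    """Stand-in for a Redis client (TTL is accepted but ignored)."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0
```

When `acquire_session_lock` returns `None`, the caller would emit the busy error event (code 429) instead of streaming. The TTL guards against a crashed request holding the lock forever.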

Success events

data: {"token":"..."}, one per LLM token
data: {"done":true,"full_response":"..."}, the final event

SSE error events

data: {"error":"...","code":429} (another request is already streaming for this session)
data: {"error":"...","code":502} (LLM call failed mid-stream)

Session-not-found and user-mismatch are returned as HTTP 404 before the SSE stream starts. Once the stream begins, errors are emitted as SSE events.
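
On the client side, the events listed above can be consumed with a small parser. The following is a sketch, assuming each event arrives as a single `data:` line carrying one JSON object, as in the examples; `iter_sse_payloads`, `collect_reply`, and `ChatStreamError` are hypothetical names, and the lines would come from whatever HTTP client reads the stream.

```python
import json
from typing import Iterable, Iterator


class ChatStreamError(Exception):
    """Raised when the stream carries an error event."""

    def __init__(self, message, code):
        super().__init__(message)
        self.code = code


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line, skipping the rest."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect_reply(lines: Iterable[str]) -> str:
    """Consume a chat stream and return the full reply text."""
    tokens = []
    for payload in iter_sse_payloads(lines):
        if "error" in payload:
            raise ChatStreamError(payload["error"], payload.get("code"))
        if payload.get("done"):
            # Prefer the server's full_response; fall back to joined tokens.
            return payload.get("full_response", "".join(tokens))
        if "token" in payload:
            tokens.append(payload["token"])
    # Stream ended without a done event: return what was received.
    return "".join(tokens)
```

A UI would typically render each token as it arrives rather than waiting for the final event. Because error events arrive over an already open stream, the client has to inspect every payload instead of relying on the HTTP status alone.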

GET /health

Liveness probe. Returns 200 with the text/plain body "ok". No caching headers are set and no dependencies are checked. Used by Traefik and Docker health checks.

GET /metrics

Prometheus scrape endpoint. Returns metrics in the Prometheus text format. No caching headers. The full list of metrics is in Observability.

Why So Few Endpoints

The backend is a focused service. The frontend owns auth, library, payments, comments, notifications, and the admin panel. Pushing more endpoints into the backend would mean duplicating the auth and ownership logic that already lives in the frontend. Keeping the surface to two business endpoints keeps the trust boundary narrow: the backend trusts the user_id and book_id from the frontend, and that is enough.