Backend Overview

The Falsafa backend is a focused Python ML/LLM service. It extracts characters from books and streams in-character chat replies. It does not own auth, payments, or user-facing data.

The backend is a single FastAPI application that runs inside a Docker container. It depends on Redis, Qdrant, TypeSense, Supabase, and at least one LLM provider, and it exposes exactly two business endpoints to the frontend.

What It Does

The backend handles the two things that are not feasible to do well in a Next.js app:

Book ingestion. When a reader submits a book, the backend runs BookNLP to extract entities, quotes, coreferences, supersenses, and events. It turns those into structured character profiles, asks an LLM to write an in-character system prompt for each profile in two passes, and persists the prompts. It then chunks the full text and indexes it into Qdrant (vector) and TypeSense (BM25) in parallel.
Chat streaming. When a reader sends a message to a character, the backend rewrites the query for retrieval, runs hybrid search across Qdrant and TypeSense, reranks the results, and streams an in-character reply. The reply is grounded in the indexed book text and bounded by the character system prompt.
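The text above does not say how the Qdrant and TypeSense result lists are combined before reranking. One common option is reciprocal rank fusion (RRF), sketched below. This is an illustration under assumptions, not the backend's actual code: the function name `rrf_merge`, the constant `k=60`, and the chunk IDs are all hypothetical.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60, top_n=5):
    """Fuse several ranked lists of chunk IDs with reciprocal rank fusion.

    Each list is ordered best-first. A chunk earns 1 / (k + rank) for every
    list it appears in, so chunks ranked well by both retrievers rise.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break on chunk ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [chunk_id for chunk_id, _ in fused[:top_n]]


# Hypothetical hits: vector search (Qdrant) and BM25 search (TypeSense).
vector_hits = ["c7", "c2", "c9"]
bm25_hits = ["c2", "c4", "c7"]
merged = rrf_merge([vector_hits, bm25_hits])
print(merged)
```

RRF needs only ranks, not scores, which is convenient here because vector similarity and BM25 scores are not on a comparable scale. The fused list would then go to the reranker.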

Everything else (login, library, payments, comments, notifications, the admin panel) lives in the frontend and Supabase. The backend never touches Stripe, never serves a UI, and never owns user accounts.

Place in the Platform

                Browser
                   │
                   │ HTTP + SSE
                   ▼
        ┌─────────────────────┐
        │  Next.js Frontend   │   Auth, library, payments, UI
        │  (Port 3000)        │
        └──────────┬──────────┘
                   │  2 endpoints
                   │  /character/processing
                   │  /character/chat
                   ▼
        ┌─────────────────────┐
        │  FastAPI Backend    │   BookNLP, LLM, retrieval
        │  (Port 8001)        │
        └────┬───┬───┬───┬────┘
             │   │   │   │
             ▼   ▼   ▼   ▼
         Redis  Qdrant  TypeSense  Supabase

The frontend is the system of record for user-facing data. The backend is a worker that the frontend calls. It is stateless across requests except for Redis caches and locks.
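Chat replies travel over SSE, as the diagram shows. The sketch below illustrates how reply tokens could be framed in the Server-Sent Events wire format (`event:` and `data:` lines terminated by a blank line). The helper names `sse_event` and `stream_reply`, the event names `token` and `done`, and the JSON payload shape are assumptions made for illustration.

```python
import json


def sse_event(data, event=None):
    """Format one Server-Sent Events frame.

    The payload is JSON-encoded so a newline inside a token cannot split
    the frame; a blank line terminates the event.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_reply(tokens):
    """Yield one frame per token, then a final 'done' frame."""
    for token in tokens:
        yield sse_event({"text": token}, event="token")
    yield sse_event({}, event="done")


frames = list(stream_reply(["Call me ", "Ishmael."]))
```

In FastAPI, a generator like this could be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`; whether the backend does exactly that is not stated here.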

What It Owns

The backend owns three things and only three things:

Character system prompts, the in-character LLM personas generated for every book. Stored in the Supabase characters table (and mirrored to disk for backup).
Book text indexes, one per-book Qdrant collection (vector) and one per-book TypeSense collection (BM25).
Async job queue, a Redis FIFO list of pending ingestion jobs, plus the polling loop that drains it.

Everything else in the system is read or written by the frontend directly. The backend accesses Supabase (with the service-role key) only to validate ownership and to write back the artifacts it produced.
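A Redis list behaves as a FIFO queue when producers push onto one end and the consumer pops from the other. The sketch below shows that pattern with an in-memory stand-in so it runs without a server; its `rpush` and `lpop` methods mirror the redis-py methods of the same names. The key `ingest:jobs`, the job payload, and the `enqueue` and `drain_once` helpers are hypothetical.

```python
import json
from collections import defaultdict, deque


class FakeRedis:
    """In-memory stand-in exposing the two list commands the sketch needs."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, *values):
        self._lists[name].extend(values)
        return len(self._lists[name])

    def lpop(self, name):
        queue = self._lists[name]
        return queue.popleft() if queue else None


QUEUE_KEY = "ingest:jobs"  # hypothetical key name


def enqueue(client, book_id, user_id):
    """Append a job to the tail of the list."""
    client.rpush(QUEUE_KEY, json.dumps({"book_id": book_id, "user_id": user_id}))


def drain_once(client, handler):
    """Pop jobs from the head until the list is empty; return how many ran."""
    processed = 0
    while (raw := client.lpop(QUEUE_KEY)) is not None:
        handler(json.loads(raw))
        processed += 1
    return processed


client = FakeRedis()
enqueue(client, "book-1", "user-a")
enqueue(client, "book-2", "user-b")
seen = []
count = drain_once(client, lambda job: seen.append(job["book_id"]))
```

A polling loop would call something like `drain_once` on an interval. With a real Redis client, a blocking pop (`BLPOP`) is an alternative that avoids busy polling.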

What It Does Not Handle

Auth. Supabase Auth is the source of truth. The backend trusts the user_id and book_id that the frontend sends.
Payments, library, wishlist, comments, ratings, notifications, admin. All frontend.
User-uploaded assets. Book files, covers, and avatars live in Supabase Storage. The backend downloads book files by signed URL only.
UI. The backend is a JSON and SSE API. There is no web frontend.

Quick Facts


Language	Python 3.11
Web framework	FastAPI on Uvicorn
Container	`python:3.11-slim`, port 8001
Endpoints	2 business + `/health` + `/metrics`
Heavy work	BookNLP (sync, runs in thread), LLM calls (async, streaming)
State	Stateless across requests; relies on Redis, Qdrant, TypeSense, Supabase
Auth	None of its own; trusts the frontend
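The "Heavy work" row notes that BookNLP is synchronous and runs in a thread. A minimal sketch of that idea uses `asyncio.to_thread` (Python 3.9+) so the blocking call does not stall the event loop. `run_booknlp` here is a hypothetical stand-in that just sleeps, and the real backend may use a different executor or thread pool.

```python
import asyncio
import time


def run_booknlp(text):
    """Hypothetical stand-in for the blocking BookNLP call."""
    time.sleep(0.2)  # simulate slow synchronous work
    return {"tokens": len(text.split())}


async def heartbeat(ticks):
    """Record ticks while the blocking call runs in another thread."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def ingest(text):
    ticks = []
    # to_thread moves the blocking call off the event loop thread, so the
    # heartbeat coroutine keeps running concurrently.
    result, _ = await asyncio.gather(
        asyncio.to_thread(run_booknlp, text),
        heartbeat(ticks),
    )
    return result, len(ticks)


result, tick_count = asyncio.run(ingest("Call me Ishmael"))
```

Calling `run_booknlp` directly inside `ingest` would block the loop for the full duration, which would also stall any chat stream being served by the same process.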

Where to Go Next

High-Level Design, covering tech stack, module layout, ingestion and chat flows, concurrency, failure isolation, and observability. Start here if you are working on or debugging the backend.
Architecture Overview, the system-wide picture of how the backend, frontend, Supabase, Stripe, and ML providers fit together.