Ingestion Flow
End-to-end walkthrough of POST /character/processing. Synchronous validation, async job queue, BookNLP extraction, two-pass prompt generation, parallel indexing into Qdrant and TypeSense.
The ingestion endpoint turns a book file into indexed chunks and one system prompt per character. The synchronous part is small; the heavy work happens in a background worker.
Sequence
Synchronous Phase
request_validator.verify_request runs in the request handler. It:
- Looks up the book in Supabase by book_id. If it does not exist, returns 422.
- Downloads the file from the signed URL using httpx with a 120-second timeout.
- Checks the extension. Anything other than .txt or .md is rejected with 422.
- Pushes a job dict to the Redis list processing_queue via RPUSH.
- Returns 202 with the job id.
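The extension check and the enqueue step can be sketched as follows. This is a minimal illustration, not the project's code: the `ValidationFailure` exception, the job dict fields, and the case-insensitive comparison are assumptions, and a `collections.deque` stands in for the Redis list so the example runs without a server (`append` plays the role of RPUSH).

```python
import uuid
from collections import deque
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".txt", ".md"}


class ValidationFailure(Exception):
    """Hypothetical error carrying the HTTP status the handler would return."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def check_extension(file_url: str) -> str:
    """Return the lowercased file extension, or raise a 422-style failure."""
    # Signed URLs carry a query string, so only the path part is inspected.
    suffix = PurePosixPath(urlparse(file_url).path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValidationFailure(422, f"unsupported extension: {suffix or '(none)'}")
    return suffix


def enqueue_job(queue: deque, book_id: str, file_url: str) -> str:
    """Validate, append a job dict to the queue tail, and return the job id."""
    check_extension(file_url)
    job_id = str(uuid.uuid4())
    # Field names are illustrative; the real job dict is not shown on this page.
    queue.append({"job_id": job_id, "book_id": book_id, "file_url": file_url})
    return job_id
```

Keeping the check in a small pure function means it can be unit-tested without Supabase, httpx, or Redis being available.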
The file is held in memory by the request handler. If the handler crashes after enqueue, the worker will redownload the file from the signed URL on its own.
Async Phase
The JobManager is a long-running coroutine started in the FastAPI lifespan. It:
- Polls Redis with LPOP every 10 seconds.
- Acquires an asyncio.Lock to atomically check capacity (at most MAX_PROCESSING_WORKERS concurrent jobs) and reserve a slot.
- Dispatches the job to process_job via asyncio.to_thread because BookNLP is synchronous.
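A rough sketch of that loop is shown below. It is an assumption-laden illustration rather than the actual JobManager: a `deque` replaces the Redis list (`popleft` stands in for LPOP), the method names are invented, `MAX_PROCESSING_WORKERS` is given an arbitrary value, and the slot is reserved before the pop so that a job is never taken off the queue when no capacity is available.

```python
import asyncio
from collections import deque
from typing import Any, Callable

MAX_PROCESSING_WORKERS = 2  # assumed value; the real limit comes from configuration


class JobManager:
    def __init__(self, queue: deque, process_job: Callable[[Any], None],
                 max_workers: int = MAX_PROCESSING_WORKERS,
                 poll_interval: float = 10.0) -> None:
        self._queue = queue
        self._process_job = process_job
        self._max_workers = max_workers
        self._poll_interval = poll_interval
        self._lock = asyncio.Lock()
        self._active = 0
        self._tasks: set = set()

    async def _try_reserve(self) -> bool:
        # Check and increment under the lock so two polls cannot both take the last slot.
        async with self._lock:
            if self._active >= self._max_workers:
                return False
            self._active += 1
            return True

    async def _release(self) -> None:
        async with self._lock:
            self._active -= 1

    async def _run(self, job: Any) -> None:
        try:
            # The blocking job runs in a worker thread so the event loop stays responsive.
            await asyncio.to_thread(self._process_job, job)
        finally:
            await self._release()

    async def poll_once(self) -> bool:
        """Dispatch at most one job; return True if a job was started."""
        if not await self._try_reserve():
            return False
        if not self._queue:
            await self._release()
            return False
        job = self._queue.popleft()
        task = asyncio.create_task(self._run(job))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return True

    async def drain(self) -> None:
        """Wait for every job that is currently running."""
        if self._tasks:
            await asyncio.gather(*list(self._tasks))

    async def run_forever(self) -> None:
        while True:
            await self.poll_once()
            await asyncio.sleep(self._poll_interval)
```

In a FastAPI application, `run_forever` would typically be started as a task inside the lifespan context and cancelled on shutdown.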
Per-Job Steps
process_job runs in a worker thread and operates on temp files in /tmp/{book_id}/. The steps are:
- BookNLP extraction. Run the entity, quote, coref, supersense, and event pipelines on the book file. The output is a set of JSON files in the temp directory.
- Character extraction. character_extraction.py parses the BookNLP output into a list of structured character profiles (name, aliases, descriptors, quote counts, top relations).
- System prompt generation. For each character, system_prompt_builder.py makes a two-pass LLM call:
  - Pass 1 at temperature=0.3 produces a structured CharacterAnalysis JSON (psychology, voice, motivations, relationships, signature themes).
  - Pass 2 at temperature=0.7 transforms that JSON into a free-text markdown persona prompt.
- Disk write. The final prompt is written to SYSTEM_PROMPT_STORAGE_PATH/{book_id}/{safe_name}.md.
- Supabase insert. A row is inserted into the characters table with the prompt, the BookNLP profile, and references back to the book.
- Index pipeline. The book's full text is cleaned, chunked by token count, and indexed. See below.
- Status update. The book row's processing_status is set to completed (or failed if anything threw and was caught).
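The two-pass prompt generation step can be illustrated with a small sketch. Everything here is hypothetical except the two temperatures: the `llm` callable, the prompt wording, and the JSON keys are placeholders, since the real system_prompt_builder.py and its provider client are not shown on this page.

```python
import json
from typing import Callable

# (prompt, temperature) -> completion text; a placeholder for the real LLM client.
LLMCall = Callable[[str, float], str]

ANALYSIS_TEMPERATURE = 0.3  # pass 1: low temperature for structured output
PERSONA_TEMPERATURE = 0.7   # pass 2: higher temperature for free-text prose


def build_system_prompt(profile: dict, llm: LLMCall) -> str:
    """Run the analysis pass, then turn its JSON into a markdown persona prompt."""
    analysis_prompt = (
        "Return a JSON object with the keys psychology, voice, motivations, "
        "relationships, signature_themes for this character profile:\n"
        + json.dumps(profile)
    )
    raw_analysis = llm(analysis_prompt, ANALYSIS_TEMPERATURE)
    # Parsing here fails early if pass 1 did not return valid JSON.
    analysis = json.loads(raw_analysis)

    persona_prompt = (
        "Write a markdown persona prompt for a character based on this analysis:\n"
        + json.dumps(analysis, indent=2)
    )
    return llm(persona_prompt, PERSONA_TEMPERATURE)
```

Splitting the work this way keeps the structured analysis inspectable on its own, while the second pass only has to handle tone and formatting.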
Index Pipeline
index_pipeline.py runs after the per-character steps:
- Chunk the cleaned book text using tiktoken with o200k_base. The chunk size is configurable; each chunk gets a stable chunk_index based on its position in the book.
- Create a Qdrant collection named after the book_id (or recreate it if rerunning).
- Create a TypeSense collection named after the book_id (or recreate it if rerunning).
- Embed each chunk through the configured embedding provider. Embedding calls retry up to three times on BadRequestError with delays of 30, 120, and 200 seconds; other errors propagate.
- Upsert into Qdrant and TypeSense in parallel.
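The chunking and retry behaviour described above might look roughly like the sketch below. Several things are assumptions: the tokenizer is passed in as `encode`/`decode` callables (in the real pipeline these would presumably come from `tiktoken.get_encoding("o200k_base")`), `BadRequestError` is a local stand-in for the provider's exception, and "retry up to three times" is read as one initial attempt plus three retries.

```python
import time
from typing import Callable, Dict, List, Sequence

RETRY_DELAYS_SECONDS = (30, 120, 200)


class BadRequestError(Exception):
    """Stand-in for the embedding provider's bad-request exception."""


def chunk_by_tokens(text: str,
                    encode: Callable[[str], Sequence],
                    decode: Callable[[Sequence], str],
                    chunk_size: int) -> List[Dict]:
    """Split text into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = encode(text)
    chunks = []
    # chunk_index depends only on position, so reruns with the same
    # text and chunk_size produce the same indices.
    for chunk_index, start in enumerate(range(0, len(tokens), chunk_size)):
        piece = tokens[start:start + chunk_size]
        chunks.append({"chunk_index": chunk_index, "text": decode(piece)})
    return chunks


def embed_with_retry(embed: Callable[[str], List[float]],
                     text: str,
                     sleep: Callable[[float], None] = time.sleep) -> List[float]:
    """Call embed, retrying only on BadRequestError with the fixed delay schedule."""
    for delay in RETRY_DELAYS_SECONDS:
        try:
            return embed(text)
        except BadRequestError:
            sleep(delay)
    # Final attempt after the last delay; any error now propagates to the caller.
    return embed(text)
```

Passing `sleep` as a parameter is only a testing convenience: it lets the schedule be checked without actually waiting several minutes.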
Failure Handling
- Per-character failures are caught individually. A failed system prompt or Supabase insert for one character is logged, but the rest of the book's characters continue.
- Per-chunk failures are counted. If any chunk fails to index, the book's processing_status is set to failed, but successfully indexed chunks remain in Qdrant and TypeSense.
- Temp files. On success the entire /tmp/{book_id}/ directory is cleaned up. On failure the BookNLP output and character profiles are removed regardless (in a finally block), while the original input file and chunks directory are preserved for debugging.
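The per-character isolation and the temp-file cleanup rules can be sketched as below. This is an illustration under assumptions, not the project's implementation: the function names and the subdirectory names in `ALWAYS_REMOVED` are hypothetical, and the job itself is reduced to a callable.

```python
import logging
import shutil
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger("ingestion_sketch")

# Hypothetical names for the BookNLP output and character profile directories.
ALWAYS_REMOVED = ("booknlp_output", "character_profiles")


def build_prompts(characters: Iterable[str],
                  build_prompt: Callable[[str], str]) -> Tuple[Dict[str, str], List[str]]:
    """Build one prompt per character; a failure for one does not stop the others."""
    prompts: Dict[str, str] = {}
    failed: List[str] = []
    for name in characters:
        try:
            prompts[name] = build_prompt(name)
        except Exception:
            logger.exception("prompt generation failed for %s", name)
            failed.append(name)
    return prompts, failed


def run_job(work_dir: Path, job: Callable[[Path], None]) -> str:
    """Run the job and apply the cleanup rules; return the resulting status."""
    status = "failed"
    try:
        job(work_dir)
        status = "completed"
    except Exception:
        logger.exception("job failed in %s", work_dir)
    finally:
        # Intermediate artifacts are removed whether or not the job succeeded.
        for sub in ALWAYS_REMOVED:
            shutil.rmtree(work_dir / sub, ignore_errors=True)
        # Only a successful run removes the whole directory; after a failure
        # the input file and chunks stay on disk for debugging.
        if status == "completed":
            shutil.rmtree(work_dir, ignore_errors=True)
    return status
```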
The full failure matrix is in Failure Isolation.
Endpoints
The backend exposes exactly two business endpoints: POST /character/processing and POST /character/chat. It also exposes /health and /metrics for operations.
Chat Flow
End-to-end walkthrough of POST /character/chat. Per-session lock, query rewrite, hybrid retrieval across Qdrant and TypeSense, rerank, stream, persist.