Book Ingestion Flow

End-to-end trace of uploading a book: form upload to indexed chunks in Qdrant and TypeSense, with payloads at every hop.

Payloads at Each Hop

User to Frontend - upload form:

POST /api/books/upload
Content-Type: multipart/form-data

Fields:
  title: "The Republic"
  author: "Plato"
  description: "A philosophical dialogue about justice..."
  pageCount: 380
  categoryId: 1
  visibility: "public"
  mainCharacter: "Socrates"
  coverImage: (binary, image/jpeg, up to 5 MB)
  bookFiles: (binary, .txt or .md, up to 100 MB)
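The size and type limits in the form above can be checked before anything is sent to storage. The following is a minimal sketch, not the project's actual validator: `UploadedFile` and `validate_upload` are hypothetical names, 1 MB is assumed to mean 1024 * 1024 bytes, and accepting any `image/*` cover type is an assumption (the form example only shows `image/jpeg`).

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

MAX_COVER_BYTES = 5 * 1024 * 1024    # "up to 5 MB" in the form spec
MAX_BOOK_BYTES = 100 * 1024 * 1024   # "up to 100 MB" in the form spec
ALLOWED_BOOK_EXTENSIONS = {".txt", ".md"}


@dataclass
class UploadedFile:
    """Hypothetical stand-in for one multipart file part."""
    filename: str
    content_type: str
    size_bytes: int


def validate_upload(cover: UploadedFile, book: UploadedFile) -> list[str]:
    """Return a list of problems; an empty list means the files are acceptable."""
    problems: list[str] = []
    # Assumption: any image/* type is accepted; the form example shows image/jpeg.
    if not cover.content_type.startswith("image/"):
        problems.append(f"cover must be an image, got {cover.content_type}")
    if cover.size_bytes > MAX_COVER_BYTES:
        problems.append("cover exceeds 5 MB")
    extension = PurePosixPath(book.filename).suffix.lower()
    if extension not in ALLOWED_BOOK_EXTENSIONS:
        problems.append(f"book file must be .txt or .md, got {extension or 'no extension'}")
    if book.size_bytes > MAX_BOOK_BYTES:
        problems.append("book file exceeds 100 MB")
    return problems
```

Returning every problem at once, rather than raising on the first, lets the form show all errors in a single round trip.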

Frontend to Supabase Storage - cover upload:

PUT /storage/v1/object/falsafa_public/covers/{bookId}/the-republic.jpg
Content-Type: image/jpeg
Authorization: Bearer <anon_key>

(binary body)

Response: { "Key": "covers/{bookId}/the-republic.jpg" }

Frontend to Supabase Storage - file upload (getAdminClient() uses the service-role key):

PUT /storage/v1/object/falsafa_private/{bookId}/the-republic.txt
Content-Type: text/plain
Authorization: Bearer <service_role_key>

(binary body)

Frontend to Supabase DB - book row:

INSERT INTO books(id, title, slug, author, description, page_count, cover_image_url, file_path, category_id, uploader_id, status, processing_status, is_public, is_free)
VALUES (
  '{book_id}',
  'The Republic',
  'the-republic',
  'Plato',
  'A philosophical dialogue about justice...',
  380,
  'https://<project>.supabase.co/storage/v1/object/public/falsafa_public/covers/{bookId}/the-republic.jpg',
  '{bookId}/the-republic.txt',
  1,
  '{user_id}',
  'approved',
  'pending',
  true,
  true
)
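The slug and the two storage paths in this row follow a visible pattern: the title lowercased and hyphenated, then placed under the book ID. A minimal sketch of that derivation follows. The source does not confirm that the slug is computed from the title, so `slugify` and `storage_paths` are hypothetical helpers, and the default extensions are taken from the example only.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lowercase the title and collapse every non-alphanumeric run into one hyphen."""
    normalized = unicodedata.normalize("NFKD", title)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def storage_paths(book_id: str, title: str,
                  cover_ext: str = "jpg", book_ext: str = "txt") -> dict[str, str]:
    """Build the object keys used in the cover upload, file upload, and book row."""
    slug = slugify(title)
    return {
        "slug": slug,
        # Key inside the falsafa_public bucket.
        "cover_key": f"covers/{book_id}/{slug}.{cover_ext}",
        # Key inside the falsafa_private bucket; stored as books.file_path.
        "file_path": f"{book_id}/{slug}.{book_ext}",
    }
```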

Frontend to Supabase DB - user_library row (inserted alongside the book row):

INSERT INTO user_library(user_id, book_id, reading_status, current_page)
VALUES ('{user_id}', '{book_id}', 'unread', 0)

Frontend to Backend - processing request:

POST http://backend:8001/character/processing
Content-Type: application/json

{
  "book_id": "{book_id}",
  "file_url": "https://<project>.supabase.co/storage/v1/object/sign/falsafa_private/{bookId}/the-republic.txt?token=<signed_url_token>",
  "user_id": "{user_id}"
}

Response: 202 { "status": "accepted", "job_id": "{uuid}" }

Backend to Redis - job queue:

RPUSH processing_queue {
  "book_id": "{book_id}",
  "file_url": "<signed_url>",
  "user_id": "{user_id}",
  "enqueued_at": "2026-06-07T12:00:00Z"
}
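The enqueue step can be sketched as a function that serializes the job and pushes it onto the list. This is an illustration under assumptions: `enqueue_processing_job` is a hypothetical name, and the client is only required to expose an `rpush(name, *values)` method, which is the shape redis-py's `Redis.rpush` has. A naive `now` value is assumed to already be in UTC.

```python
import json
from datetime import datetime, timezone
from typing import Optional, Protocol


class QueueClient(Protocol):
    """Anything with a Redis-style rpush; redis.Redis satisfies this."""
    def rpush(self, name: str, *values: str) -> int: ...


def enqueue_processing_job(client: QueueClient, book_id: str, file_url: str,
                           user_id: str, now: Optional[datetime] = None) -> str:
    """Serialize the job as JSON, append it to processing_queue, and return the JSON."""
    now = now or datetime.now(timezone.utc)
    job = {
        "book_id": book_id,
        "file_url": file_url,
        "user_id": user_id,
        # Same timestamp shape as the example payload: 2026-06-07T12:00:00Z
        "enqueued_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    payload = json.dumps(job)
    client.rpush("processing_queue", payload)
    return payload
```

Because the producer uses RPUSH, a worker that pops from the left of the list would process jobs in FIFO order.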

Backend to Supabase - verify book (during job dispatch):

SELECT id, title, language, file_path FROM books WHERE id = '{book_id}'

Backend to filesystem - download:

/tmp/{book_id}/original.txt  (from signed URL, 120s timeout)
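A minimal sketch of the download step using only the standard library. `download_book` is a hypothetical name, and the real backend may use a different HTTP client. Note that `urlopen`'s `timeout` bounds each blocking socket operation rather than the whole transfer, so the 120s figure is interpreted that way here.

```python
import shutil
import urllib.request
from pathlib import Path

DOWNLOAD_TIMEOUT_SECONDS = 120  # from the flow description


def download_book(file_url: str, book_id: str, tmp_root: Path = Path("/tmp")) -> Path:
    """Stream the signed URL to <tmp_root>/<book_id>/original.txt and return that path."""
    target_dir = tmp_root / book_id
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "original.txt"
    with urllib.request.urlopen(file_url, timeout=DOWNLOAD_TIMEOUT_SECONDS) as response, \
            open(target, "wb") as out:
        # Copy in chunks so a 100 MB book is never held fully in memory.
        shutil.copyfileobj(response, out)
    return target
```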

Backend to BookNLP - character extraction:

/tmp/{book_id}/:
  {book_id}.entity
  {book_id}.quote
  {book_id}.coref
  {book_id}.supersense
  {book_id}.event

Backend to LLM - character analysis pass 1:

POST https://gateway.truefoundry.ai/chat/completions
Authorization: Bearer <openai_api_key>

{
  "model": "falsafa-temp/chat",
  "temperature": 0.3,
  "messages": [
    {"role": "system", "content": "Analyze this character profile..."},
    {"role": "user", "content": "Name: Socrates\nAliases: ...\nQuotes: ..."}
  ],
  "response_format": { "type": "json_object" }
}

Response:

{
  "psychology": "Socratic questioning, relentless pursuit of definitions...",
  "voice": "Dialectical, ironic, probing...",
  "motivations": ["truth", "virtue", "refutation of sophistry"],
  "relationships": [{"name": "Glaucon", "type": "interlocutor"}],
  "signatureThemes": ["justice", "forms", "philosopher-king"]
}
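The pass 1 exchange can be sketched as two pure functions: one that builds the request body and one that checks the model's JSON before it is stored. This is an illustration, not the backend's actual code. The function names, the way aliases and quotes are joined, and the rule that all five keys are required are assumptions; the elided system prompt is kept exactly as it appears above. `parse_analysis` expects the assistant message content, not the full gateway response envelope.

```python
import json

REQUIRED_KEYS = ("psychology", "voice", "motivations", "relationships", "signatureThemes")


def build_analysis_request(name: str, aliases: list[str], quotes: list[str]) -> dict:
    """Build the chat-completions body for the character analysis pass."""
    user_content = (
        f"Name: {name}\n"
        f"Aliases: {', '.join(aliases)}\n"
        f"Quotes: {' | '.join(quotes)}"
    )
    return {
        "model": "falsafa-temp/chat",
        "temperature": 0.3,
        "messages": [
            {"role": "system", "content": "Analyze this character profile..."},
            {"role": "user", "content": user_content},
        ],
        "response_format": {"type": "json_object"},
    }


def parse_analysis(message_content: str) -> dict:
    """Parse the model's JSON and fail loudly if an expected key is absent."""
    data = json.loads(message_content)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"analysis is missing keys: {missing}")
    return data
```

Validating here keeps a malformed analysis from reaching pass 2 or the `profile_json` column.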

Backend to LLM - persona prompt pass 2:

POST https://gateway.truefoundry.ai/chat/completions
Authorization: Bearer <openai_api_key>

{
  "model": "falsafa-temp/chat",
  "temperature": 0.7,
  "messages": [
    {"role": "system", "content": "Write a first-person persona prompt..."},
    {"role": "user", "content": "<CharacterAnalysis JSON from pass 1>"}
  ]
}

Response:

I am Socrates, a philosopher in the Athenian agora. I question everything...

Backend to filesystem - prompt backup:

SYSTEM_PROMPT_STORAGE_PATH/{book_id}/socrates.md
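A minimal sketch of the backup write, assuming the file name is the character name lowercased and hyphenated (the example shows `socrates.md`, but the naming rule itself is not stated). `backup_prompt` is a hypothetical helper, and `storage_root` stands in for the value of SYSTEM_PROMPT_STORAGE_PATH.

```python
import re
from pathlib import Path


def backup_prompt(storage_root: Path, book_id: str, character_name: str, prompt: str) -> Path:
    """Write the persona prompt to <storage_root>/<book_id>/<character>.md and return the path."""
    # Assumed naming rule: lowercase, non-alphanumeric runs become single hyphens.
    stem = re.sub(r"[^a-z0-9]+", "-", character_name.lower()).strip("-")
    path = storage_root / book_id / f"{stem}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(prompt, encoding="utf-8")
    return path
```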

Backend to Supabase - character row:

INSERT INTO characters(id, book_id, name, description, system_prompt, profile_json, avatar_url, color, emoji)
VALUES (
  '{character_id}',
  '{book_id}',
  'Socrates',
  'Athenian philosopher...',
  'I am Socrates, a philosopher in the Athenian agora...',
  '{ "psychology": "...", "voice": "...", ... }',
  NULL, '#4A90D9', '🏛️'
)

Backend to chunker - text chunking:

Input: full book text, ~100k tokens
Output: ~200 chunks of ~512 tokens each, tokenized with tiktoken o200k_base
chunks[0] = { chunk_index: 0, text: "Book I. I went down to the Piraeus..." }
chunks[1] = { chunk_index: 1, text: "Socrates: Tell me, Cephalus..." }
...
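Fixed-size token chunking can be sketched as slicing the token list into windows of 512 and decoding each window back to text. The sketch below takes the tokenizer as a pair of callables so it runs without extra dependencies. With tiktoken installed, `tiktoken.get_encoding("o200k_base")` provides `encode` and `decode` methods that fit these parameters. The whitespace tokenizer is only a stand-in for illustration, and the absence of overlap between chunks is an assumption.

```python
from typing import Callable, Sequence


def chunk_tokens(text: str,
                 encode: Callable[[str], Sequence],
                 decode: Callable[[Sequence], str],
                 max_tokens: int = 512) -> list[dict]:
    """Split text into consecutive, non-overlapping windows of at most max_tokens tokens."""
    tokens = encode(text)
    chunks = []
    for start in range(0, len(tokens), max_tokens):
        window = tokens[start:start + max_tokens]
        chunks.append({"chunk_index": len(chunks), "text": decode(window)})
    return chunks


# Stand-in tokenizer for the sketch: one token per whitespace-separated word.
def whitespace_encode(text: str) -> list[str]:
    return text.split()


def whitespace_decode(tokens: Sequence) -> str:
    return " ".join(tokens)
```

At 512 tokens per chunk, a 100,000-token book yields ceil(100000 / 512) = 196 chunks, which matches the roughly 200 quoted above.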

Backend to Embedding API - per chunk:

POST https://gateway.truefoundry.ai/embeddings
Authorization: Bearer <openai_api_key>

{
  "model": "falsafa-temp/embed",
  "input": "Book I. I went down to the Piraeus..."
}

Response: {"data": [{"embedding": [0.001, ..., 0.532]}], "model": "falsafa-temp/embed", "usage": {"prompt_tokens": 12}}
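The request and response shapes above can be wrapped in two small helpers, sketched here without any network call. `build_embedding_request` and `extract_embedding` are hypothetical names. The sketch assumes one input string per request, as in the example.

```python
def build_embedding_request(chunk_text: str) -> dict:
    """Build the embeddings body for a single chunk."""
    return {"model": "falsafa-temp/embed", "input": chunk_text}


def extract_embedding(response: dict) -> list[float]:
    """Pull the vector out of a response shaped like the example above."""
    data = response.get("data") or []
    if not data or "embedding" not in data[0]:
        raise ValueError("embedding response contains no vector")
    return data[0]["embedding"]
```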

Backend to Qdrant - batch upsert:

PUT https://qdrant:6333/collections/{book_id}/points?wait=true
Content-Type: application/json
api-key: <qdrant_api_key>

{
  "points": [
    {
      "id": 0,
      "vector": [0.001, ..., 0.532],
      "payload": {
        "chunk_index": 0,
        "text": "Book I. I went down to the Piraeus...",
        "book_id": "{book_id}"
      }
    }
  ]
}
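Assembling the upsert bodies from chunks and their vectors can be sketched as follows. `build_points` and `batched_bodies` are hypothetical helpers, and the batch size of 64 is an assumption; the source only says the upsert is batched. Using `chunk_index` as the point ID mirrors the example, where chunk 0 has `"id": 0`.

```python
from typing import Iterator


def build_points(book_id: str, chunks: list[dict], vectors: list[list[float]]) -> list[dict]:
    """Pair each chunk with its vector in the point shape shown above."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    return [
        {
            "id": chunk["chunk_index"],
            "vector": vector,
            "payload": {
                "chunk_index": chunk["chunk_index"],
                "text": chunk["text"],
                "book_id": book_id,
            },
        }
        for chunk, vector in zip(chunks, vectors)
    ]


def batched_bodies(points: list[dict], batch_size: int = 64) -> Iterator[dict]:
    """Yield one request body per batch of points."""
    for start in range(0, len(points), batch_size):
        yield {"points": points[start:start + batch_size]}
```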

Backend to TypeSense - index document:

POST https://typesense:8108/collections/{book_id}/documents
X-TYPESENSE-API-KEY: <typesense_api_key>

{
  "id": "{book_id}:0",
  "chunk_index": 0,
  "text": "Book I. I went down to the Piraeus...",
  "book_id": "{book_id}"
}
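The document shape above can be produced from a chunk with a small helper, sketched here. The request shown indexes one document at a time; for roughly 200 chunks, Typesense's bulk import endpoint (`POST /collections/{name}/documents/import`, newline-delimited JSON) is an alternative, and `to_import_jsonl` prepares a body for it. Whether the backend uses the bulk endpoint is not stated in the source, and both function names are hypothetical.

```python
import json


def to_typesense_document(book_id: str, chunk: dict) -> dict:
    """Build one search document; the ID is '<book_id>:<chunk_index>' as in the example."""
    return {
        "id": f"{book_id}:{chunk['chunk_index']}",
        "chunk_index": chunk["chunk_index"],
        "text": chunk["text"],
        "book_id": book_id,
    }


def to_import_jsonl(book_id: str, chunks: list[dict]) -> str:
    """Serialize all chunks as one JSON document per line for a bulk import."""
    # json.dumps escapes newlines inside the text, so each document stays on one line.
    return "\n".join(
        json.dumps(to_typesense_document(book_id, chunk), ensure_ascii=False)
        for chunk in chunks
    )
```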

Backend to Supabase - completion:

UPDATE books SET processing_status = 'completed' WHERE id = '{book_id}'

Flow Diagram