Chat Message Flow
End-to-end trace of a single chat message: browser to SSE stream, including caching, locking, rewrite, hybrid retrieval, rerank, and persist.
End-to-End Flow: Chat Message
This flow traces a single user message from the chat input in the browser, through the frontend and backend, to the SSE stream returned to the browser.
Payloads at Each Hop
Browser to Frontend - chat message:
POST /api/chat
Content-Type: application/json
Cookie: sb-{ref}-auth-token=<token>
{
"session_id": "{session_id}",
"message": "What would you do if you were free?"
}
Frontend to Supabase - session & preference fetch:
-- Validate session ownership
SELECT id, user_id, character_id, book_id FROM chat_sessions WHERE id = '{session_id}'
-- Fetch user preferences for this character
SELECT relationship_mode, custom_relationship, speech_modifiers, behavioral_modifiers, preference_summary
FROM user_preferences WHERE user_id = '{user_id}' AND character_id = '{character_id}'
Frontend compiles preferences into a modifier block:
USER PREFERENCE MODIFIERS
- Relationship mode: partner
- Speech modifiers: be_informal
- Behavioral modifiers: be_flirty
- Preference summary: Socrates should challenge my assumptions
Frontend to Backend - proxied chat request:
POST http://backend:8001/character/chat
Content-Type: application/json
Accept: text/event-stream
X-User-Id: {user_id}
X-API-Key: {process.env.BACKEND_API_KEY || ''}
{
"user_id": "{user_id}",
"character_id": "{character_id}",
"book_id": "{book_id}",
"session_id": "{session_id}",
"user_message": "What would you do if you were free?"
}
Note: X-API-Key is sent by the frontend but the backend does not validate it. The X-User-Id header is advisory only - the backend reads user_id from the JSON body.
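The preference-to-modifier-block step shown before the proxied request can be sketched as follows. This is a minimal illustration written in Python for consistency with the other sketches on this page, not the frontend's actual code. The function name `compile_modifier_block` is hypothetical, the sketch assumes list-valued modifier columns are joined with commas, and it leaves out `custom_relationship` because the trace does not show how that column is rendered.

```python
from typing import Mapping


def _as_text(value: object) -> str:
    """Render a column value as text; lists are comma-joined (assumption)."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value.strip()
    return ", ".join(str(item) for item in value)


def compile_modifier_block(prefs: Mapping[str, object]) -> str:
    """Build the plain-text modifier block from a user_preferences row."""
    fields = [
        ("Relationship mode", prefs.get("relationship_mode")),
        ("Speech modifiers", prefs.get("speech_modifiers")),
        ("Behavioral modifiers", prefs.get("behavioral_modifiers")),
        ("Preference summary", prefs.get("preference_summary")),
    ]
    lines = ["USER PREFERENCE MODIFIERS"]
    for label, value in fields:
        text = _as_text(value)
        if text:  # skip unset preferences rather than emit empty bullets
            lines.append(f"- {label}: {text}")
    return "\n".join(lines)
```

Skipping empty fields keeps the block short for users who have set only one or two preferences; whether the real frontend does the same is not stated in the trace.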
Backend to Redis - cache check + lock:
# Cache check (individual GET calls, no MGET)
GET chat:char:{character_id}
GET chat:sess:{session_id}
GET chat:book:{book_id}
# On any miss, fetch from Supabase and SETEX with 3600s TTL
SELECT id, user_id, character_id, message_count FROM chat_sessions WHERE id = '{session_id}'
SELECT id, name, system_prompt, profile_json FROM characters WHERE id = '{character_id}'
SELECT id, title, author, description FROM books WHERE id = '{book_id}'
# Acquire lock
SET chat:lock:{session_id} {hex_token} NX EX 30
Backend to Supabase - session validation (if cache miss):
SELECT id, user_id FROM chat_sessions WHERE id = '{session_id}'
-- user_id must match request.user_id
Backend to LLM - query rewrite:
POST https://gateway.truefoundry.ai/chat/completions
Authorization: Bearer <openai_api_key>
{
"model": "falsafa-temp/chat",
"temperature": 0,
"max_tokens": 200,
"messages": [
{"role": "system", "content": "Rewrite the user's question as a narrative query and a keyword query. Character psychology: \"Socratic questioning, relentless pursuit of definitions...\""},
{"role": "user", "content": "What would you do if you were free?\n\nLast 4 messages: [\"Justice is harmony of the soul\", \"Tell me more about the forms\", \"The cave represents ignorance\", \"Education is the art of turning the soul towards the light\"]"}
]
}
Response:
{
"narrative_query": "What actions would a philosopher who values truth and virtue take if they were liberated from the constraints of Athenian society and could pursue wisdom without restriction?",
"keyword_query": "freedom philosopher truth virtue action escape prison"
}
Cache key: chat:qr:{character_id}:{sha256hex[:16]}, where the hash input is user_message + last_4_messages_content.
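The cache key derivation can be sketched as below. The function name `query_rewrite_cache_key` is hypothetical, and the sketch assumes the message contents are concatenated with no separator and encoded as UTF-8 before hashing; the trace only states that the hash input is the user message plus the content of the last four messages.

```python
import hashlib
from typing import Sequence


def query_rewrite_cache_key(
    character_id: str, user_message: str, history: Sequence[str]
) -> str:
    """Build the chat:qr:* key for a rewritten-query cache entry."""
    last_four = history[-4:]  # only the most recent four messages contribute
    # Assumption: plain concatenation, no separator between messages.
    hash_input = user_message + "".join(last_four)
    digest = hashlib.sha256(hash_input.encode("utf-8")).hexdigest()
    return f"chat:qr:{character_id}:{digest[:16]}"
```

Because the history is part of the hash input, the same question asked at a different point in the conversation produces a different key and triggers a fresh rewrite.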
Backend to Qdrant - vector search:
POST https://qdrant:6333/collections/{book_id}/points/search
api-key: <qdrant_api_key>
{
"vector": [0.002, ..., 0.412], // embedded narrative_query
"limit": 10,
"with_payload": true
}
Response:
{
"result": [
{
"id": 42,
"score": 0.89,
"payload": {
"chunk_index": 42,
"text": "Socrates: The unexamined life is not worth living. If I were free of these chains, I would spend every hour in the agora, questioning those who claim to know...",
"book_id": "{book_id}"
}
},
...
]
}
Backend to TypeSense - BM25 search:
GET https://typesense:8108/collections/{book_id}/documents/search?q=freedom+philosopher+truth+virtue+action+escape+prison&query_by=text&per_page=10
X-TYPESENSE-API-KEY: <typesense_api_key>
Response:
{
"hits": [
{
"document": {
"id": "{book_id}:42",
"chunk_index": 42,
"text": "Socrates: The unexamined life is not worth living...",
"book_id": "{book_id}"
}
},
...
]
}
Backend to Jina Reranker - rerank:
POST https://api.jina.ai/v1/rerank
Authorization: Bearer <reranker_api_key>
{
"model": "jina-reranker-v3",
"query": "What would you do if you were free?",
"documents": [
"Socrates: The unexamined life is not worth living...",
"Glaucon: Would you not escape the cave...",
...
],
"top_n": 5,
"return_documents": false
}
Response:
{
"results": [
{"index": 0, "relevance_score": 0.97},
{"index": 1, "relevance_score": 0.82}
]
}
The backend uses the index field (0-based position in the input) to look up the original chunk, then attaches the relevance_score to it. If RERANKER_API_KEY is empty, the backend falls back to sorting chunks by their raw similarity scores from Qdrant and TypeSense.
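The index lookup and the fallback described above can be sketched as follows. The function names are hypothetical, `results=None` stands in for the case where RERANKER_API_KEY is empty, and the sketch assumes every candidate chunk carries a numeric `score` field from its retriever, which the TypeSense response shown earlier does not include.

```python
from typing import Dict, List, Optional

Chunk = Dict[str, object]


def apply_rerank(chunks: List[Chunk], results: List[Dict[str, float]]) -> List[Chunk]:
    """Map reranker results back to chunks via their 0-based input index."""
    ranked = []
    for result in results:
        chunk = dict(chunks[int(result["index"])])  # copy; leave the input untouched
        chunk["relevance_score"] = result["relevance_score"]
        ranked.append(chunk)
    return ranked


def fallback_sort(chunks: List[Chunk], top_n: int) -> List[Chunk]:
    """No reranker available: order by raw retriever score (assumed field)."""
    return sorted(chunks, key=lambda c: c.get("score", 0.0), reverse=True)[:top_n]


def rank_chunks(
    chunks: List[Chunk], results: Optional[List[Dict[str, float]]], top_n: int = 5
) -> List[Chunk]:
    """Use reranker output when present, otherwise fall back to raw scores."""
    if results is None:
        return fallback_sort(chunks, top_n)
    return apply_rerank(chunks, results)[:top_n]
```

Note that raw scores from a vector store and a keyword index are generally on different scales, so the fallback ordering is only a rough approximation of relevance.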
Backend to LLM - stream completion:
POST https://gateway.truefoundry.ai/chat/completions
Authorization: Bearer <openai_api_key>
Accept: text/event-stream
{
"model": "falsafa-temp/chat",
"stream": true,
"messages": [
{"role": "system", "content": "I am Socrates...\n\nRelevant Passages:\n[1] Socrates: The unexamined life is not worth living...\n[2] Glaucon: Would you not escape the cave...\n\nConversation Summary: The user and Socrates have discussed justice, the forms, and the allegory of the cave."},
{"role": "user", "content": "What would you do if you were free?"}
]
}
Backend to Frontend - SSE stream:
data: {"token":"If "}
data: {"token":"I "}
data: {"token":"were "}
data: {"token":"free "}
data: {"token":"from "}
data: {"token":"the "}
data: {"token":"chains "}
data: {"token":"of "}
data: {"token":"this "}
data: {"token":"body"}
data: {"token":","}
data: {"token":" I "}
data: {"token":"would "}
data: {"token":"devote "}
data: {"token":"every "}
data: {"token":"day "}
data: {"token":"to "}
data: {"token":"the "}
data: {"token":"pursuit "}
data: {"token":"of "}
data: {"token":"wisdom"}
data: {"token":"."}
data: {"done":true,"full_response":"If I were free from the chains of this body, I would devote every day to the pursuit of wisdom."}
Backend to Supabase - persist after stream:
INSERT INTO messages(session_id, role, content)
VALUES ('{session_id}', 'user', 'What would you do if you were free?');
INSERT INTO messages(session_id, role, content)
VALUES ('{session_id}', 'assistant', 'If I were free from the chains of this body, I would devote every day to the pursuit of wisdom.');
UPDATE chat_sessions
SET message_count = message_count + 1,
preview = 'If I were free from the chains of this body...',
updated_at = NOW()
WHERE id = '{session_id}';
Backend to Redis - session cache update:
SETEX chat:sess:{session_id} 3600 "<updated session JSON with last 10 messages + summary>"
# Release lock (Lua CAS - script compares stored token before deleting)
EVAL "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end" 1 chat:lock:{session_id} {hex_token}
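The lock acquired at the start of the flow and released here can be sketched as a pair of helpers. The function names are hypothetical, the 16-byte token length is an assumption (the trace only shows {hex_token}), and `client` is assumed to be a redis-py compatible object exposing `set(name, value, nx=..., ex=...)` and `eval(script, numkeys, *keys_and_args)`.

```python
import secrets
from typing import Optional

# Same compare-and-delete script as in the trace: delete only if the token matches.
RELEASE_SCRIPT = (
    "if redis.call('get', KEYS[1]) == ARGV[1] then "
    "return redis.call('del', KEYS[1]) else return 0 end"
)


def acquire_lock(client, session_id: str, ttl_seconds: int = 30) -> Optional[str]:
    """Try SET key token NX EX ttl; return the token on success, else None."""
    token = secrets.token_hex(16)  # assumption: 32 hex characters
    acquired = client.set(
        f"chat:lock:{session_id}", token, nx=True, ex=ttl_seconds
    )
    return token if acquired else None


def release_lock(client, session_id: str, token: str) -> bool:
    """Release the lock only if this caller still owns it."""
    deleted = client.eval(RELEASE_SCRIPT, 1, f"chat:lock:{session_id}", token)
    return deleted == 1
```

Comparing the token before deleting matters when a request runs longer than the 30-second expiry: by then another request may hold the lock, and an unconditional DEL would release a lock the caller no longer owns.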