Deployment Overview

Everything you need to run the MEMORISE AI stack, on your own machine or on cloud infrastructure. This section covers the active components, their configuration, and step-by-step guides for local and production deployment.

What you deploy

flowchart TB
    app["In-memorial app"]
    rs["RudderStack<br/>(event source)"]
    subgraph apps["Application services (you run these)"]
        api["AI Engine API<br/>FastAPI · :8000"]
        ce["Content Engine<br/>FastAPI · :8002"]
    end
    subgraph data["Data stores"]
        qd[("Qdrant<br/>:6333")]
        rd[("Redis<br/>:6379")]
    end
    llm["Ollama / OpenRouter<br/>(LLM)"]
    omeka["Omeka CMS<br/>(content source)"]

    app -->|events| rs -->|webhook| api
    app -->|search / recommend| api
    api --> qd & rd & llm
    ce --> qd
    ce --> omeka

    classDef store fill:#EFEAE0,stroke:#A8895B,color:#423D34;
    class qd,rd store;

Active components and ports

In production the stack runs on a Kubernetes cluster. Each component we operate has its own walkthrough, covering local Docker for development and the Kubernetes manifests. Keycloak, the LLM endpoint, and Omeka are run by other teams; you only set their URLs and keys in Configuration.

| Component | Role | Port | Deploy guide |
| --- | --- | --- | --- |
| AI Engine API | serves the REST API (search, recommend, narrative) and the ingest webhook | 8000 | Walkthrough |
| Content Engine | ingests content into Qdrant | 8002 | Walkthrough |
| Qdrant | vector + tag store | 6333, 6334 | Walkthrough |
| Redis | event buffer + materialized user model | 6379 | Walkthrough |
| RudderStack | behavioral event source; sends events to `POST /api/ingest` | n/a | Walkthrough |
| Ollama / OpenRouter | LLM for narratives | 11434 / API | run by another team; set its URL/key |
| Keycloak | auth for the LLM gateway | 8080 | run by another team |
| Omeka | content source of truth | n/a | already deployed by the content team |

User and event capture

Visitor behavior flows through the recsys path: RudderStack sends events to `POST /api/ingest`, the API writes them to the Redis event buffer, and from there they are folded into the materialized user model. There is no separate relational database in the active stack.
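
To exercise this path without a RudderStack source, you can post a hand-built event to the webhook. The sketch below only constructs the request. The payload schema that `/api/ingest` accepts is not specified here, so the field names (`type`, `userId`, `event`, `properties`, `timestamp`) are an assumption modelled on RudderStack's track-call convention, and `build_ingest_request` is a hypothetical helper.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_ingest_request(
    base_url: str, user_id: str, event: str, properties: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the ingest webhook."""
    # Assumed payload shape, following RudderStack's "track" call fields.
    payload = {
        "type": "track",
        "userId": user_id,
        "event": event,
        "properties": properties,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the AI Engine API running locally, `urllib.request.urlopen(build_ingest_request("http://localhost:8000", "visitor-1", "item_viewed", {"item_id": "abc"}))` would send the event; adjust the fields to whatever schema your deployment actually validates.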

Minimum vs full stack

Minimum to serve recommendations: AI Engine API, Qdrant, Redis, plus a populated Qdrant collection (run the Content Engine once).
Add narratives: an LLM endpoint (Ollama or OpenRouter), and Keycloak if you use the MEMORISE LLM gateway.
Add behavioral capture: a RudderStack source pointed at POST /api/ingest.

The recsys core degrades gracefully: if REDIS_URL or QDRANT_API_URL is unset, it falls back to in-memory fakes, so the API boots even before the infrastructure exists (useful for smoke tests).
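
The fallback pattern can be sketched as follows. This is an illustration of the idea for the Redis side only, not the project's actual code: `InMemoryEventBuffer`, `RedisEventBuffer`, and `make_event_buffer` are hypothetical names, and the Redis class is a placeholder that records the URL without opening a connection.

```python
import os
from collections import deque
from collections.abc import Mapping
from dataclasses import dataclass, field


@dataclass
class InMemoryEventBuffer:
    """Hypothetical in-memory fake; its contents are lost on restart."""

    events: deque = field(default_factory=deque)

    def push(self, event: dict) -> None:
        self.events.append(event)

    def drain(self) -> list[dict]:
        # Return all buffered events in arrival order and empty the buffer.
        drained = list(self.events)
        self.events.clear()
        return drained


@dataclass
class RedisEventBuffer:
    """Placeholder for a Redis-backed buffer; a real one would connect to url."""

    url: str


def make_event_buffer(env: Mapping[str, str] = os.environ):
    """Pick the Redis-backed buffer when REDIS_URL is set, else the fake."""
    url = env.get("REDIS_URL")
    if url:
        return RedisEventBuffer(url)
    return InMemoryEventBuffer()
```

Passing the environment as a parameter keeps the choice easy to test: `make_event_buffer({})` yields the in-memory fake, while `make_event_buffer({"REDIS_URL": "redis://localhost:6379/0"})` yields the Redis placeholder. Because the fake holds state only in process memory, it suits smoke tests rather than real traffic.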

Prerequisites

Docker and Docker Compose (local), or a Kubernetes cluster (cloud).
Python 3.12+ if running services outside containers.
Network access to your chosen LLM endpoint.

Pick a path

Configuration: every environment variable, grouped by service.
Local with Docker Compose: one machine, from zero to a working API.
Cloud and production: managed data stores, Kubernetes, security, and scaling.