Deployment Overview

Everything you need to run the MEMORISE AI stack, on your own machine or on cloud infrastructure. This section covers the active components, their configuration, and step-by-step guides for local and production deployment.

What you deploy

flowchart TB
    app["In-memorial app"]
    rs["RudderStack<br/>(event source)"]
    subgraph apps["Application services (you run these)"]
        api["AI Engine API<br/>FastAPI · :8000"]
        ce["Content Engine<br/>FastAPI · :8002"]
    end
    subgraph data["Data stores"]
        qd[("Qdrant<br/>:6333")]
        rd[("Redis<br/>:6379")]
    end
    llm["Ollama / OpenRouter<br/>(LLM)"]
    omeka["Omeka CMS<br/>(content source)"]

    app -->|events| rs -->|webhook| api
    app -->|search / recommend| api
    api --> qd & rd & llm
    ce --> qd
    ce --> omeka

    classDef store fill:#EFEAE0,stroke:#A8895B,color:#423D34;
    class qd,rd store;

Active components and ports

The stack runs on a Kubernetes cluster. Each component we operate has its own walkthrough, covering local Docker for development and Kubernetes manifests for the cluster. Keycloak, the LLM endpoint, and Omeka are run by other teams; you only set their URLs and keys in Configuration.

| Component | Role | Port | Deploy guide |
| --- | --- | --- | --- |
| AI Engine API | Serves the REST API (search, recommend, narrative) and the ingest webhook | 8000 | Walkthrough |
| Content Engine | Ingests content into Qdrant | 8002 | Walkthrough |
| Qdrant | Vector + tag store | 6333, 6334 | Walkthrough |
| Redis | Event buffer + materialized user model | 6379 | Walkthrough |
| RudderStack | Behavioral event source, delivers into POST /api/ingest | n/a | Walkthrough |
| Ollama / OpenRouter | LLM for narratives | 11434 / API | Run by another team; set its URL/key |
| Keycloak | Auth for the LLM gateway | 8080 | Run by another team |
| Omeka | Content source of truth | n/a | Already deployed by the content team |
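
For a quick check that the components listed above are listening, a small TCP probe can help. This is a minimal sketch, not part of the stack: the ports come from the table, while the `localhost` host and the `port_open` / `check_stack` helper names are assumptions for a local setup.

```python
import socket

# Ports taken from the component table; the host is an assumption (local dev).
DEFAULT_PORTS = [
    ("AI Engine API", 8000),
    ("Content Engine", 8002),
    ("Qdrant", 6333),
    ("Qdrant", 6334),
    ("Redis", 6379),
]


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_stack(host: str = "localhost", ports=DEFAULT_PORTS) -> dict[str, bool]:
    """Probe every (name, port) pair and map 'name:port' to reachability."""
    return {f"{name}:{port}": port_open(host, port) for name, port in ports}


if __name__ == "__main__":
    for label, ok in check_stack().items():
        print(f"{label:<22} {'open' if ok else 'closed'}")
```

An open port only shows that something is listening; it does not confirm that the service behind it is healthy or correctly configured.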

User and event capture

Visitor behavior flows through the recsys path: RudderStack to POST /api/ingest to the Redis event buffer, then into the materialized user model. There is no separate relational database in the active stack.
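
To exercise the ingest path by hand, without a RudderStack source in place, you can build a request to the ingest endpoint yourself. The sketch below is illustrative only: the endpoint path and port come from this page, but the event fields (`type`, `event`, `userId`, `properties`) are an assumed RudderStack-style payload, and the actual schema accepted by the API may differ.

```python
import json
import urllib.request


def build_ingest_request(base_url: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request to the ingest webhook."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event; field names and values are assumptions for illustration.
sample_event = {
    "type": "track",
    "event": "item_viewed",
    "userId": "visitor-123",
    "properties": {"item_id": "example-item"},
}

req = build_ingest_request("http://localhost:8000", sample_event)

# To send it against a running AI Engine API, uncomment:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.status)
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything reaches the event buffer.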

Minimum vs full stack

  • Minimum to serve recommendations: AI Engine API, Qdrant, Redis, plus a populated Qdrant collection (run the Content Engine once).
  • Add narratives: an LLM endpoint (Ollama or OpenRouter), and Keycloak if you use the MEMORISE LLM gateway.
  • Add behavioral capture: a RudderStack source pointed at POST /api/ingest.

The recsys core degrades gracefully: if REDIS_URL or QDRANT_API_URL is unset, it falls back to in-memory fakes, so the API boots even before the infrastructure exists (useful for smoke tests).
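
The fallback described above can be pictured as a simple environment check at startup. This is a sketch of the idea, not the recsys core's actual code: only the variable names `REDIS_URL` and `QDRANT_API_URL` come from this page. The `resolve_backends` helper, the per-store decision, and treating an empty value as unset are all assumptions.

```python
import os


def resolve_backends(env=None) -> dict[str, str]:
    """Report which backend each store would use, given the environment.

    Assumption: each store is decided independently, and an empty value
    counts as unset.
    """
    env = os.environ if env is None else env
    return {
        "redis": "real" if env.get("REDIS_URL") else "in-memory",
        "qdrant": "real" if env.get("QDRANT_API_URL") else "in-memory",
    }


if __name__ == "__main__":
    # Prints the backend choice for the current shell environment.
    for store, backend in resolve_backends().items():
        print(f"{store}: {backend}")
```

Running a check like this before a smoke test makes it obvious whether results come from real infrastructure or from the in-memory fakes, which do not persist data across restarts.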

Prerequisites

  • Docker and Docker Compose (local), or a Kubernetes cluster (cloud).
  • Python 3.12+ if running services outside containers.
  • Network access to your chosen LLM endpoint.

Pick a path