Architecture Overview
Principle: ports & adapters (hexagonal)
The redesigned recommender (ai_engine.recsys) follows one rule: the pure, typed core knows only
Protocols. All IO (Qdrant, Redis, RudderStack, the embedding model) lives behind
adapters. Every boundary is typed, so every stage can be tested with fakes and no network.
flowchart TB
subgraph core["Pure core (no IO imports)"]
contracts["contracts/<br/>models · enums · config · ports"]
signals["signals/<br/>engagement · signal_builder"]
ranking["ranking/<br/>scorers · fusion"]
end
subgraph orch["Orchestration"]
updater["updater.py<br/>UserModelUpdater"]
recommender["recommender.py<br/>Recommender"]
composition["composition.py<br/>build_components"]
apilayer["api.py<br/>FastAPI router"]
end
subgraph adapters["adapters/ (IO)"]
qdrant["QdrantContentStore"]
redis["RedisEventBuffer<br/>RedisUserModelStore"]
rudder["rudderstack normalizer"]
fastembed["FastEmbedModel"]
end
subgraph fakes["testing/ (deterministic fakes)"]
fk["FakeContentStore<br/>FakeEventSource<br/>InMemoryUserModelStore"]
end
contracts --> signals --> ranking
updater --> signals
recommender --> ranking
updater -.uses ports.-> contracts
recommender -.uses ports.-> contracts
composition --> updater & recommender
composition --> adapters
composition -. fallback .-> fakes
apilayer --> composition
classDef pure fill:#e8f5e9,stroke:#2e7d32,color:#1b5e20;
classDef io fill:#ede7f6,stroke:#7c4dff,color:#311b92;
class contracts,signals,ranking pure;
class qdrant,redis,rudder,fastembed io;
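A minimal sketch of the rule described in this section, assuming a single-method port. The ContentStore and FakeContentStore names come from this page, but the `search` signature, the `Candidate` model, and the `recall` function are hypothetical and only illustrate the shape; the real contracts may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical typed model; the real contracts/ models may differ."""

    item_id: str
    score: float


class ContentStore(Protocol):
    """Port: the core sees only this shape, never a Qdrant client."""

    def search(self, vector: Sequence[float], limit: int) -> list[Candidate]: ...


class FakeContentStore:
    """Deterministic fake: ranks a fixed catalogue by dot product."""

    def __init__(self, catalogue: dict[str, Sequence[float]]) -> None:
        self._catalogue = catalogue

    def search(self, vector: Sequence[float], limit: int) -> list[Candidate]:
        scored = [
            Candidate(item_id, sum(a * b for a, b in zip(vector, item_vec)))
            for item_id, item_vec in self._catalogue.items()
        ]
        # Sort by score descending, then by id so ties resolve the same way every run.
        scored.sort(key=lambda c: (-c.score, c.item_id))
        return scored[:limit]


def recall(store: ContentStore, user_vector: Sequence[float], limit: int = 2) -> list[str]:
    """Core-side function: it depends on the port only, so no network is needed."""
    return [c.item_id for c in store.search(user_vector, limit)]
```

Because `recall` is typed against the Protocol, a test can pass the fake and a production composition can pass a Qdrant-backed adapter without the function changing.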
Scope: pragmatic, not full hexagonal
A Protocol is introduced only where there are ≥2 real implementations or a deterministic test fake is needed:
- EventSource: Postgres + RudderStack + PostHog incoming → pays off now.
- ContentStore: real Qdrant + FakeContentStore for offline tests.
- EmbeddingModel: real fastembed + InMemoryEmbeddingModel (deterministic).
- UserModelStore: Redis for prod + in-memory fake.
Everything else (engagement scoring, signal building, the scorers, fusion + MMR) stays as plain pure functions over typed models. These are already testable, and a Protocol there would add indirection with no second implementation.
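To show what "plain pure functions over typed models" can look like, here is a hedged sketch of an engagement scorer. The `Reading` model, its field names, and the clamped dwell/estimate ratio are assumptions made for illustration; this page does not specify the actual engagement formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical typed model for one paired dwell observation."""

    dwell_seconds: float
    estimated_seconds: float


def engagement(reading: Reading) -> float:
    """Map dwell time to a continuous score in [0, 1].

    Assumed formula: the dwell/estimate ratio, clamped to [0, 1].
    """
    if reading.estimated_seconds <= 0:
        # No usable reading-time estimate: treat as no evidence of engagement.
        return 0.0
    ratio = reading.dwell_seconds / reading.estimated_seconds
    return max(0.0, min(1.0, ratio))
```

A function like this needs no port and no fake: a test constructs a `Reading` and asserts on the returned float.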
The package map
ai_engine/
├── recsys/ # the redesign (typed, hexagonal)
│ ├── contracts/ # models · enums · config · ports (NO io imports)
│ ├── signals/ # engagement (pure) · signal_builder (the user model)
│ ├── ranking/ # scorers (→[0,1]) · fusion (weighted + MMR)
│ ├── adapters/ # qdrant · redis · fastembed · rudderstack
│ ├── testing/ # fakes + domain fixtures
│ ├── recommender.py # SERVE side
│ ├── updater.py # INGEST side
│ ├── composition.py # wiring root (env → real or fake)
│ └── api.py # FastAPI router (/ingest /recommend /usermodel)
│
├── search/ # legacy serving stack (Qdrant-coupled)
│ ├── global_searcher.py # umbrella: vector | geo | hybrid | similar | random
│ ├── vector_searcher.py # fastembed encode + Qdrant search
│ ├── geo_searcher.py # GeoRadius filter + hybrid geo+vector
│ ├── user_searcher.py # Qdrant recommend (avg pos/neg vectors)
│ ├── help_searcher.py # CommonSearch, low-level point/vector fetch
│ └── ingest_content.py # Omeka → embed → upsert pipeline
│
├── common.py # Item · User · Event · Session · SearchResult · narrative models
├── config.py # all env-driven settings
├── db_interface.py # PostgreSQL: users, events, sessions
├── projection_builder.py # user events → reading-time projection + profile text
├── user_state.py # reading-time estimate + interaction-success check
└── narrative.py # LLM narrative generation (Ollama/OpenRouter + Keycloak auth)
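The "wiring root (env → real or fake)" role of composition.py can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code: the `RECSYS_REDIS_URL` variable, the `Components` container, the `get`/`put` methods, and the stubbed Redis adapter are all hypothetical.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping, Protocol


class UserModelStore(Protocol):
    """Port with an assumed get/put shape."""

    def get(self, user_id: str) -> dict | None: ...
    def put(self, user_id: str, model: dict) -> None: ...


class InMemoryUserModelStore:
    """Deterministic fake backed by a plain dict."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def get(self, user_id: str) -> dict | None:
        return self._data.get(user_id)

    def put(self, user_id: str, model: dict) -> None:
        self._data[user_id] = model


class RedisUserModelStore:
    """Stub standing in for the real adapter, which would wrap a Redis client."""

    def __init__(self, url: str) -> None:
        self.url = url

    def get(self, user_id: str) -> dict | None:
        raise NotImplementedError("stub: the real adapter talks to Redis")

    def put(self, user_id: str, model: dict) -> None:
        raise NotImplementedError("stub: the real adapter talks to Redis")


@dataclass(frozen=True)
class Components:
    user_models: UserModelStore


def build_components(env: Mapping[str, str] = os.environ) -> Components:
    """Pick the real adapter when configured, otherwise fall back to the fake."""
    redis_url = env.get("RECSYS_REDIS_URL")  # hypothetical variable name
    if redis_url:
        return Components(user_models=RedisUserModelStore(redis_url))
    return Components(user_models=InMemoryUserModelStore())
```

Passing the environment in as a mapping keeps the wiring itself testable: `build_components({})` yields the in-memory fallback without touching the process environment.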
How the two stacks relate
flowchart LR
subgraph today["Serving today"]
gs["GlobalSearch"]
ur["UserRecommender"]
end
subgraph tomorrow["Redesign (core done, wiring pending)"]
rec["Recommender"]
upd["UserModelUpdater"]
end
api["ai-engine-api"]
api --> gs
api --> ur
api -. optional /recsys router .-> rec
ur -.replaced by.-> rec
gs -. semantic recall feeds .-> rec
The legacy search/ stack is what the FastAPI service calls today (/api/search/*).
The recsys redesign hardens the recommender: continuous engagement, expert-tag fusion,
diversity-aware MMR, and an online materialized user model. The API mounts the recsys
router conditionally at /api/* when the module + infra are present.
Key design decisions (locked)
- Tags live in Qdrant payload + payload index. Flat tag_labels (facet:label, KEYWORD index) for recall via Filter(should=...); graded tag ranking happens in pure Python score_tag, not in Qdrant.
- Rule-based weighted fusion + MMR. No learned ranker yet: collect impression/click data first. Hard contract: every scorer returns [0,1] so the weighted sum is valid.
- Engagement is continuous, not binary dwell >= estimate.
- Dwell pairing moves out of SQL into the EventSource normalizer, shared across sources.
- Cold/warm routing on positive count; cold path uses survey + demographics → tag affinity.
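The fusion decision above can be illustrated with a short sketch of weighted fusion followed by greedy Maximal Marginal Relevance (MMR). The function names, the weight normalisation, and the default `lam` value are assumptions for illustration; the actual ranking/fusion module may differ.

```python
from __future__ import annotations

from typing import Callable, Mapping


def fuse(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of scorer outputs; stays in [0, 1] if every input does."""
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"scorer {name!r} broke the [0, 1] contract: {value}")
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Dividing by the weight total is an assumption; it keeps the result in [0, 1].
    return sum(weights[name] * value for name, value in scores.items()) / total


def mmr(
    relevance: Mapping[str, float],
    similarity: Callable[[str, str], float],
    k: int,
    lam: float = 0.7,  # hypothetical default trade-off between relevance and diversity
) -> list[str]:
    """Greedy MMR: repeatedly pick the item with the best relevance-minus-redundancy."""
    selected: list[str] = []
    remaining = sorted(relevance)  # sorted ids give deterministic tie-breaks

    def marginal(item: str) -> float:
        # Redundancy is the highest similarity to anything already selected.
        redundancy = max((similarity(item, s) for s in selected), default=0.0)
        return lam * relevance[item] - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The `[0, 1]` check in `fuse` is where the hard contract pays off: if any scorer returned an unbounded value, it could dominate the sum regardless of its weight. In `mmr`, a near-duplicate of an already selected item is penalised, so a less relevant but different item can outrank it.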