Ranking

recsys/ranking/ scores candidates and fuses them into a diverse, explainable list. Pure functions throughout. The hard contract: every scorer returns a value in [0,1], so the weighted sum is valid without rescaling.

scorers.py

| Function | Signature | Range | Formula |
|---|---|---|---|
| cosine | (a, b) -> float | [-1,1] | standard; 0 if either side missing/zero |
| score_semantic | (signals, candidate_vector) -> float | [0,1] | (cosine(taste_vec, cand_vec)+1)/2 |
| score_tag | (signals, content) -> float | [0,1] | (Σ_l aff[l]·w[l]) / (Σ_l aff[l]) |

score_tag is the graded counterpart to Qdrant's coarse tag recall: affinity-weighted overlap of the user's tag_affinity with the candidate's tag weights, normalized by total affinity. Geo and popularity scorers are planned/off by default.
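The formulas above can be sketched as follows. This is a minimal illustration, not the package's code: the real scorers take `signals` and `content` objects, whereas this sketch passes the taste vector, tag affinities, and tag weights directly as plain lists and dicts (an assumption made to keep it self-contained). It also assumes affinities are non-negative and tag weights lie in [0,1], which is what keeps `score_tag` inside [0,1].

```python
import math
from typing import Mapping, Optional, Sequence


def cosine(a: Optional[Sequence[float]], b: Optional[Sequence[float]]) -> float:
    """Cosine similarity in [-1, 1]; 0.0 if either vector is missing or all-zero."""
    if not a or not b:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    # Clamp to guard against floating-point drift just outside [-1, 1].
    return max(-1.0, min(1.0, dot / (norm_a * norm_b)))


def score_semantic(taste_vec: Optional[Sequence[float]],
                   cand_vec: Optional[Sequence[float]]) -> float:
    """Shift cosine from [-1, 1] into [0, 1]."""
    return (cosine(taste_vec, cand_vec) + 1.0) / 2.0


def score_tag(tag_affinity: Mapping[str, float],
              tag_weights: Mapping[str, float]) -> float:
    """Affinity-weighted overlap, normalized by total affinity.

    Assumes affinities >= 0 and tag weights in [0, 1] (illustrative assumption).
    """
    total = sum(tag_affinity.values())
    if total <= 0.0:
        return 0.0  # no usable affinity signal
    overlap = sum(aff * tag_weights.get(tag, 0.0)
                  for tag, aff in tag_affinity.items())
    return overlap / total
```

Under these formulas a missing vector gives cosine 0 and therefore a neutral semantic score of 0.5 rather than 0.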

fusion.py

Weighted fusion, explainable

weighted_fuse(per_scorer: dict[str, float], weights: FusionWeights)
    -> tuple[float, dict[str, float]]
fused = Σ_s fusion_weight[s] · score_s

Returns the fused score and a {scorer: weight·score} breakdown that rides along in each ScoredCandidate, so every recommendation can answer "why this item?".
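A minimal sketch of the fusion step, assuming `FusionWeights` is a simple dataclass with one field per scorer. The field names and default values here are hypothetical; the real class may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionWeights:
    # Hypothetical fields and defaults, for illustration only.
    semantic: float = 0.6
    tag: float = 0.4


def weighted_fuse(per_scorer: dict[str, float],
                  weights: FusionWeights) -> tuple[float, dict[str, float]]:
    """Return (fused score, per-scorer contribution breakdown)."""
    # Each contribution is weight * score; scorers without a weight contribute 0.
    breakdown = {
        name: getattr(weights, name, 0.0) * score
        for name, score in per_scorer.items()
    }
    return sum(breakdown.values()), breakdown


fused, why = weighted_fuse({"semantic": 0.5, "tag": 1.0},
                           FusionWeights(semantic=0.5, tag=0.5))
# fused == 0.75, why == {"semantic": 0.25, "tag": 0.5}
```

Because every scorer stays in [0,1], the fused score also stays in [0,1] as long as the weights are non-negative and sum to at most 1.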

MMR rerank, diversity

mmr_rerank(candidates, vectors, *, lambda_, limit) -> list[ScoredCandidate]

Greedy Maximal Marginal Relevance:

select  argmax_i [ λ · fused_i − (1−λ) · max_{j∈selected} cosine(vec_i, vec_j) ]
flowchart LR
    sc["scored candidates<br/>(fused score each)"] --> init["pick top-1 by fused"]
    init --> step{"selected < limit?"}
    step -->|yes| pick["pick argmax<br/>λ·relevance − (1−λ)·max sim to selected"]
    pick --> step
    step -->|no| out["ranked list (≤ limit)"]

λ = mmr_lambda (default 0.7) trades relevance against diversity: higher λ favors raw score, lower λ pushes variety. This avoids returning ten near-duplicate stories on the same theme.
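The greedy loop described above might look like the sketch below. The `ScoredCandidate` shape (an `item_id` plus a `fused` score), the `vectors` mapping keyed by `item_id`, and the `_cosine` helper are assumptions made for illustration; the real types carry more fields.

```python
import math
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class ScoredCandidate:
    # Hypothetical minimal shape; the real class also carries the breakdown.
    item_id: str
    fused: float


def _cosine(a: Optional[Sequence[float]], b: Optional[Sequence[float]]) -> float:
    """Cosine similarity; 0.0 if either vector is missing or all-zero."""
    if not a or not b:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_rerank(candidates: Sequence[ScoredCandidate],
               vectors: Mapping[str, Sequence[float]],
               *, lambda_: float, limit: int) -> list[ScoredCandidate]:
    """Greedy Maximal Marginal Relevance over already-fused candidates."""
    remaining = list(candidates)
    selected: list[ScoredCandidate] = []

    def mmr_value(c: ScoredCandidate) -> float:
        if not selected:
            return c.fused  # first pick: top-1 by fused score
        # Redundancy = similarity to the closest already-selected item.
        redundancy = max(
            _cosine(vectors.get(c.item_id), vectors.get(s.item_id))
            for s in selected
        )
        return lambda_ * c.fused - (1.0 - lambda_) * redundancy

    while remaining and len(selected) < limit:
        best = max(remaining, key=mmr_value)
        remaining.remove(best)
        selected.append(best)
    return selected


cands = [ScoredCandidate("a", 0.9), ScoredCandidate("b", 0.85), ScoredCandidate("c", 0.6)]
vecs = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}  # b duplicates a
diverse = mmr_rerank(cands, vecs, lambda_=0.5, limit=2)   # picks a, then c
greedy = mmr_rerank(cands, vecs, lambda_=1.0, limit=2)    # picks a, then b
```

In this toy input, `b` has a higher fused score than `c` but is a duplicate of `a`, so with λ = 0.5 its penalty outweighs its relevance. With λ = 1.0 the penalty vanishes and the ranking falls back to fused score alone.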

Property tested

MMR keeps the top-1 relevant item; raising λ → more relevance, lowering λ → more spread; output length ≤ limit.

Putting it together

flowchart TD
    cand["Candidates (semantic ∪ tag)"] --> ss["score_semantic"]
    cand --> st["score_tag"]
    ss & st --> wf["weighted_fuse → fused + breakdown"]
    wf --> mmr["mmr_rerank(λ, limit)"]
    mmr --> rec["Recommendation.items"]

This is the body of Recommender.recommend_for_signals; see Orchestration.


Full auto-generated reference

Code reference → Recsys package.