Evaluation
This section describes how the engine will be assessed against its goals for engagement and learning. Performance and engagement are measured quantitatively; user experience and satisfaction are assessed qualitatively.
Building the engine is not enough; it has to demonstrably help visitors learn and engage, and do so respectfully. The evaluation plan groups checks into three modules, each noting whether it is run internally (by the team) or externally (with users), and each with example metrics so assessment is concrete rather than vague.
User modelling and adaptation (internal) assesses how well the system builds profiles and adapts over time.
| Aspect | Metric |
|---|---|
| User persona alignment | correlation between survey/persona data and generated recommendations (qualitative) |
| Adaptivity over time | improvement in relevant content after repeated interactions; simulations of extreme interest, frustration, disinterest (quantitative) |
| Cold-start handling | success rate for new or anonymous profiles, measured as sustained engagement in the chosen topic (quantitative) |
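The two quantitative rows above could be computed along the following lines. This is a minimal sketch, not the project's implementation: the `ColdStartSession` record, the `min_items` and `min_share` thresholds that define "sustained engagement", and the per-round relevance shares are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColdStartSession:
    """Hypothetical log record for one new or anonymous visitor."""
    chosen_topic: str          # topic the visitor picked at the start
    viewed_topics: list[str]   # topic of each item viewed, in order


def is_sustained(session, min_items=3, min_share=0.5):
    """Assumed rule: enough items viewed, and enough of them on the chosen topic."""
    viewed = session.viewed_topics
    if len(viewed) < min_items:
        return False
    on_topic = sum(1 for topic in viewed if topic == session.chosen_topic)
    return on_topic / len(viewed) >= min_share


def cold_start_success_rate(sessions, min_items=3, min_share=0.5):
    """Fraction of cold-start sessions that count as sustained engagement."""
    if not sessions:
        raise ValueError("no cold-start sessions to evaluate")
    hits = sum(is_sustained(s, min_items, min_share) for s in sessions)
    return hits / len(sessions)


def adaptivity_gain(relevant_share_by_round):
    """Change in the share of relevant content between the first and last round."""
    if len(relevant_share_by_round) < 2:
        raise ValueError("need at least two interaction rounds")
    return relevant_share_by_round[-1] - relevant_share_by_round[0]


# Illustrative data only; topic names and shares are placeholders.
sessions = [
    ColdStartSession("topic_a", ["topic_a", "topic_a", "topic_b", "topic_a"]),
    ColdStartSession("topic_b", ["topic_b", "topic_a"]),
]
print(cold_start_success_rate(sessions))
print(adaptivity_gain([0.4, 0.55, 0.7]))
```

The same `adaptivity_gain` function could be applied to simulated profiles (extreme interest, frustration, disinterest) by feeding it the relevance shares produced in each simulated round.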
User experience (external) measures engagement, learning, and perceived personalisation, coordinated with WP7.
| Aspect | Metric |
|---|---|
| Knowledge gain | pre- and post-surveys and user interviews |
| Engagement and immersion | average session duration and navigation depth from implicit indicators |
| Personalisation experience | survey results, target mean at least 4 on a 5-point Likert scale, plus interviews |
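As an illustration of the engagement and personalisation rows above, the sketch below averages session duration and navigation depth and checks survey responses against the target mean of 4 on a 5-point scale. The input shapes (a list of `(duration_seconds, navigation_depth)` pairs and a list of integer Likert responses) and the function names are assumptions, not the project's logging or survey format.

```python
from statistics import mean


def engagement_summary(sessions):
    """Average duration and navigation depth over (duration_seconds, depth) pairs."""
    if not sessions:
        raise ValueError("no sessions to summarise")
    return {
        "mean_duration_s": mean(duration for duration, _ in sessions),
        "mean_depth": mean(depth for _, depth in sessions),
    }


def meets_likert_target(responses, target=4.0):
    """True if the mean of 1-5 Likert responses reaches the target mean."""
    if not responses:
        raise ValueError("no survey responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    return mean(responses) >= target


# Illustrative values only.
print(engagement_summary([(120, 4), (300, 8)]))
print(meets_likert_target([4, 5, 4, 3, 5]))
```

A mean on its own can hide a split between very satisfied and very dissatisfied respondents, which is one reason the plan pairs the survey figure with interviews.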
Ethical (internal) evaluates sensitivity and balance, per the MEMORISE ethical guidelines.
| Aspect | Metric |
|---|---|
| Diversity and sensitivity | distributional analysis of recommended content across themes, communities, and narratives, attending to similarity and dissimilarity so the nuance of HNP is communicated |
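One possible way to start a distributional analysis is to look at the share of recommendations per theme and summarise its evenness with normalised Shannon entropy. The source does not name a specific measure, so this choice, the function names, and the optional `n_themes` parameter are assumptions; the same functions could be applied to communities or narratives instead of themes.

```python
import math
from collections import Counter


def theme_distribution(recommended_themes):
    """Share of recommendations per theme, from one theme label per recommended item."""
    if not recommended_themes:
        raise ValueError("no recommendations to analyse")
    counts = Counter(recommended_themes)
    total = sum(counts.values())
    return {theme: count / total for theme, count in counts.items()}


def normalised_entropy(recommended_themes, n_themes=None):
    """Shannon entropy scaled to [0, 1]; 1 means recommendations are spread evenly.

    n_themes is the number of themes in the catalogue (assumed known). If it is
    omitted, only themes that were actually recommended are counted, which can
    overstate diversity when some themes are never shown.
    """
    dist = theme_distribution(recommended_themes)
    k = n_themes if n_themes is not None else len(dist)
    if k < len(dist):
        raise ValueError("n_themes is smaller than the number of observed themes")
    if k <= 1:
        return 0.0
    entropy = -sum(p * math.log(p) for p in dist.values())
    return entropy / math.log(k)


# Illustrative labels only.
recs = ["theme_a", "theme_b", "theme_a", "theme_c"]
print(theme_distribution(recs))
print(normalised_entropy(recs, n_themes=4))
```

A single evenness score does not capture sensitivity or nuance; it would only flag skewed distributions for the closer qualitative review that the ethical guidelines call for.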
These modules will be put into practice and reported in the next deliverable, D5.4 (performance evaluation of the Individual Experience framework on final applications).