Setting Goals and Choosing Metrics for Recommender System Evaluations
- Counts of items that are relevant or irrelevant, and that are either contained in a user's recommendation set or not
- How many of the top-k recommendations are relevant items
- If the recommendation list contains only relevant items, then the area under the limited curve (the ROC curve drawn over the top-k list only) is in fact zero
- Relevant items that are retrieved at the end of the list with no irrelevant items following do not add to the area under the limited curve.
- A top-k list that contains more relevant items will yield a higher score than a list with fewer relevant items
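To make the notes above concrete, here is a minimal sketch in Python. The function names `precision_at_k` and `limited_roc_area` are my own, not from the paper, and the area is left unnormalized (raw step counts rather than rates), which is an assumption made to keep the example short. It shows why an all-relevant list gets an area of zero and why trailing relevant items add nothing.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def limited_roc_area(recommended, relevant, k):
    """Unnormalized area under the ROC curve drawn over the top-k list only.

    Walking down the list, a relevant item moves the curve one step up and
    an irrelevant item moves it one step right. Area is only gained on a
    step to the right, and the gain equals the current height.
    (Illustrative sketch; not the paper's exact formulation.)
    """
    height = 0  # relevant items seen so far
    area = 0
    for item in recommended[:k]:
        if item in relevant:
            height += 1
        else:
            area += height
    return area


relevant = {"a", "b", "c"}
print(precision_at_k(["a", "x", "b", "y"], relevant, 4))  # 0.5
print(limited_roc_area(["a", "b", "c"], relevant, 3))      # 0
print(limited_roc_area(["a", "x", "b"], relevant, 3))      # 1
```

In the last call the trailing relevant item "b" raises the curve, but no irrelevant item follows it, so the area is the same as for the shorter list `["a", "x"]`.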
Common metrics to evaluate recommendation systems
A ROC curve plots recall (true positive rate) against fallout (false positive rate) for increasing recommendation set size
- True positives are the items you showed in your Top-N list that match what the user preferred in her held-out testing set.
- False positives are the items in your Top-N list that don't match her preferred items in her held-out testing set.
- True negatives are the items you didn't include in your Top-N recommendations and that the user didn't have among her preferred items in her held-out testing set.
- False negatives are the items you didn't include in your Top-N recommendations but that do match what the user preferred in her held-out testing set.
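The four definitions above translate directly into set operations. The sketch below is only an illustration: `confusion_counts`, `roc_points`, and the idea of passing the full item `catalog` explicitly are my assumptions, not something the linked article prescribes. It computes one (fallout, recall) point per recommendation set size, which is what the ROC curve plots.

```python
def confusion_counts(top_n, held_out, catalog):
    """Return (tp, fp, tn, fn) for one user's Top-N list."""
    top_n, held_out, catalog = set(top_n), set(held_out), set(catalog)
    tp = len(top_n & held_out)            # recommended and preferred
    fp = len(top_n - held_out)            # recommended, not preferred
    fn = len(held_out - top_n)            # preferred, not recommended
    tn = len(catalog - top_n - held_out)  # neither recommended nor preferred
    return tp, fp, tn, fn


def roc_points(ranking, held_out, catalog):
    """(fallout, recall) for every recommendation set size 0..len(ranking)."""
    points = []
    for n in range(len(ranking) + 1):
        tp, fp, tn, fn = confusion_counts(ranking[:n], held_out, catalog)
        recall = tp / (tp + fn) if tp + fn else 0.0
        fallout = fp / (fp + tn) if fp + tn else 0.0
        points.append((fallout, recall))
    return points


catalog = {"a", "b", "c", "d", "e", "f"}
held_out = {"a", "c"}
ranking = ["a", "b", "c", "d"]
print(confusion_counts(ranking[:2], held_out, catalog))  # (1, 1, 3, 1)
print(roc_points(ranking, held_out, catalog))
```

Each relevant item in the ranking moves the next point up (higher recall) and each irrelevant item moves it right (higher fallout).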
Classification: ROC Curve and AUC
On Sampled Metrics for Item Recommendation
Keep Exploring!!!