"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;
Showing posts with label recommendations. Show all posts

October 01, 2023

Google Dialogflow Notes

Pointers / References

  • Intent - Entity mapping / Options
  • Intent - Supplied from Training Text
  • Entity - Grab these values from User inputs


Ref - Link

GCP App Builder Notes

App type - Search / Chat / Recommendations



Data Source - BigQuery / API / Site information




Architecture for Content Generation

Architecture for Chatbot



Google Retail 

Infuse your digital properties with Google-quality recommendations and search results that enhance user engagement, deliver personalized experiences

Using Google Online Analytics to personalize



Keep Exploring!!!

March 09, 2023

Recommendation - Distance Measures

 




Ref - Link

Keep Exploring!!!

Notes - Recommendation Metrics

Setting Goals and Choosing Metrics for Recommender System Evaluations

  • Count the items that are relevant or irrelevant, and whether each is contained in a user's recommendation set or not
  • Count how many of the top-K recommendations are relevant items
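
The two counts above lead to precision and recall at K. A minimal sketch, assuming the recommendations are an ordered list and the relevant items are a set; the function names and sample data are illustrative and do not come from the referenced paper:

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical data: an ordered recommendation list and a set of relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f"}

p = precision_at_k(recommended, relevant, 3)  # 2 of the top 3 are relevant
r = recall_at_k(recommended, relevant, 3)     # 2 of the 3 relevant items were found
```

Precision divides by the list length and recall by the number of relevant items, so the two can diverge sharply when K is much larger or smaller than the user's relevant set.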



  • For a ROC curve limited to the top-k list: if the recommendation list contains only relevant items, the area under the curve is in fact zero, because the curve never moves along the fallout axis
  • Relevant items that are retrieved at the end of the list, with no irrelevant items following, do not add to the area under the limited curve

  • A top-k list that contains more relevant items will yield a higher score than a list with fewer relevant items

Common metrics to evaluate recommendation systems

ROC Curve

A ROC curve plots recall (true positive rate) against fallout (false positive rate) for increasing recommendation set size

  • True Positives are the items in your Top-N list that match what the user preferred in her held-out testing set
  • False Positives are the items in your Top-N list that don't match her preferred items in her held-out testing set
  • True Negatives are the items you didn't include in your Top-N recommendations and that the user didn't have among her preferred items in her held-out testing set
  • False Negatives are the items you didn't include in your Top-N recommendations but that do match what the user preferred in her held-out testing set
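
The four definitions above can be turned into one ROC point per recommendation set size. A minimal sketch, assuming every recommended and held-out item belongs to a known catalog; the function names and sample data are hypothetical:

```python
def confusion_counts(top_n, held_out, catalog):
    """Return (TP, FP, TN, FN) for one user's Top-N list."""
    top_n, held_out, catalog = set(top_n), set(held_out), set(catalog)
    tp = len(top_n & held_out)            # recommended and preferred
    fp = len(top_n - held_out)            # recommended but not preferred
    fn = len(held_out - top_n)            # preferred but not recommended
    tn = len(catalog - top_n - held_out)  # neither recommended nor preferred
    return tp, fp, tn, fn


def roc_point(top_n, held_out, catalog):
    """Return (fallout, recall) for one recommendation set size."""
    tp, fp, tn, fn = confusion_counts(top_n, held_out, catalog)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fallout = fp / (fp + tn) if (fp + tn) else 0.0
    return fallout, recall


# Hypothetical catalog, ranked recommendations and held-out preferences.
catalog = [f"item{i}" for i in range(10)]
ranked = ["item3", "item7", "item1", "item0", "item9"]
held_out = {"item3", "item1", "item8"}

# One ROC point for each increasing recommendation set size N = 1..5.
points = [roc_point(ranked[:n], held_out, catalog) for n in range(1, len(ranked) + 1)]
```

Plotting `points` with fallout on the x-axis and recall on the y-axis gives the curve described above for this single user.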

Classification: ROC Curve and AUC

On Sampled Metrics for Item Recommendation

Keep Exploring!!!

January 11, 2023

Tiktok Algo Analysis

Ref  - Real-time Short Video Recommendation on Mobile Devices

  • Point-wise ranking generally models the ranking problem as a regression (e.g., predict the user’s rating of a video) or classification (e.g., predict whether the user will like a video) task
  • Pair-wise ranking uses pair-wise loss functions to learn the semantic distance between a pair of items, i.e., the distance between their embeddings
  • Model personalization: combine each user's local data set with similar samples retrieved from the cloud to train the ranking model on device
  • The client maintains a watched video list, and all the features and user feedback of each video in the list are collected and stored. Every time a video is consumed it is appended to the list, so real-time signals can be extracted from this list with almost no latency
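
To make the pair-wise idea concrete, here is a minimal sketch of one common pair-wise loss, the logistic loss on the score difference between a preferred and a non-preferred item. The choice of this loss, the dot-product scoring and the sample embeddings are assumptions for illustration; the referenced paper may use a different formulation:

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_logistic_loss(user, pos_item, neg_item):
    """-log(sigmoid(score_pos - score_neg)), written as log(1 + exp(-margin))."""
    margin = dot(user, pos_item) - dot(user, neg_item)
    return math.log1p(math.exp(-margin))


# Hypothetical 2-d embeddings for a user, a liked video and a skipped video.
user = [1.0, 0.5]
liked = [0.9, 0.4]
skipped = [-0.2, 0.1]

good_order = pairwise_logistic_loss(user, liked, skipped)  # liked ranked above skipped
bad_order = pairwise_logistic_loss(user, skipped, liked)   # pair ranked the wrong way
```

The loss is small when the preferred item scores higher than the other item and grows when the pair is ranked the wrong way round, which is what pushes the embeddings apart during training.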

Ref - Link

User Features

  • Which accounts you follow
  • Creators you’ve hidden
  • Comments you’ve posted
  • Videos you’ve liked or shared on the app
  • Videos you’ve added to your favorites
  • Videos you’ve marked as “Not Interested”
  • Videos you’ve reported as inappropriate
  • Longer videos you watch all the way to the end (aka video completion rate)
  • Content you create on your own account
  • Interests you’ve expressed by interacting with organic content and ads

Signals from content

  • Captions
  • Sounds
  • Hashtags*
  • Effects
  • Trending topics

Info collected from device

TikTok’s data processing practices

In total, to reach a FYP stream that recommended one questionable video out of every four videos, it took:

  • An estimated 4 hours and 41 minutes, viewing 650 videos in total (41 in the Search stream, 609 in the FYP stream)
  • Making 200 likes: liking 25 videos in the Search stream (for clarity, these are excluded from the harms tally) and liking 175 videos in the FYP stream (of which 146 were questionable videos and 29 were borderline)
  • Making four searches for problematic hashtags
  • Swiping to ‘skip’ 352 videos before they were finished (an indication to the algorithm that you are not interested in this content)

Keep Exploring!!!

Music Streaming Service Recommendations - Ideas

  • Custom embeddings based on actors, movie choices, music director choices
  • User-user / item-item based on age, gender, persona
  • Weekday, and weekend patterns of groups of listeners
  • Location-based / time based - on the cab, commute to work, weekend
  • Custom recommendations based on search keywords
  • Recency/relevance vs existing playlists
  • User personalization/customization for new/old / English / regional language / Spiritual
  • Balance between precompute / recent ranking
  • Custom embeddings to find similar songs with lyrics, text, music
  • A/B testing between paid / free users
  • Conversions / increase time / auto-populate with more context info
  • User affinity towards seasons / late-night sleep patterns/jogging music 
  • Millennials, Boomers, GenX, and Retired get some customization at each level
More Reads

For each segment understand the preferences


  • Seasonal Song
  • Emotional Song
  • Danceability
  • Loudness 







Popularity ranker: The first ranking algorithm is popularity ranking, where we simply rank songs by popularity in descending order and select the k most popular songs to show to the user.

Relevance ranker: The second ranking algorithm is relevance ranking, where we simply rank songs by their relevance to the user. Given a large pool of songs and a user u, we select the k most relevant songs.

Learned ranker: The third ranking algorithm is a model learned from user preferences. We train a neural regression model that scores each song for a given user based on user-level, song-level, and interaction-level features.
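
The first two rankers can be sketched in a few lines. This is a minimal illustration under assumptions: each song is a dict with a hypothetical `popularity` count and `embedding` vector, and relevance is taken to be the dot product between a user vector and the song embedding, which the source does not specify:

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def popularity_ranker(songs, k):
    """Return the k most popular songs, most popular first."""
    return sorted(songs, key=lambda s: s["popularity"], reverse=True)[:k]


def relevance_ranker(songs, user_vec, k):
    """Return the k songs with the highest user-song relevance score."""
    return sorted(songs, key=lambda s: dot(user_vec, s["embedding"]), reverse=True)[:k]


# Hypothetical song pool and user vector.
songs = [
    {"id": "s1", "popularity": 900, "embedding": [0.1, 0.9]},
    {"id": "s2", "popularity": 500, "embedding": [0.8, 0.2]},
    {"id": "s3", "popularity": 100, "embedding": [0.9, 0.1]},
]
user_vec = [1.0, 0.0]

top_popular = [s["id"] for s in popularity_ranker(songs, 2)]
top_relevant = [s["id"] for s in relevance_ranker(songs, user_vec, 2)]
```

The learned ranker would replace the scoring function with the output of a trained regression model; it is left out here because it depends on the training setup.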





  • Next-item suggestion: predict the user’s next action based on their historical sequence.
  • Next-basket recommendation: predict what a user will add to their cart, e.g., on e-commerce sites or at a fast-food drive-thru.
  • Session-based recommendation: recommend items within short-term sessions, e.g., on a music streaming platform or social media.
  • Getting explicit inputs from the user, as Stitch Fix does
Session-Level Information
  • Time context. We use day of the week Dt and time of the day Ht. Note that even though, in Section 4, we partition sessions by using Dt only, both features are used for the model described in Section 5.
  • Device context. In addition, we consider the device Yt used by the user to access the service at the beginning of a session. We restrict ourselves to the major devices: Y = {mobile, desktop, speaker, web, tablet}.
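
The session-level context above can be encoded as a feature vector. A minimal sketch, assuming one-hot encoding of day of the week, hour of the day and device; the encoding and the function names are assumptions, since the notes only name the features:

```python
from datetime import datetime

# Device set Y from the notes above.
DEVICES = ["mobile", "desktop", "speaker", "web", "tablet"]


def one_hot(index, size):
    """Return a list of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec


def session_context_features(session_start, device):
    """Concatenate one-hot day of week (7), hour of day (24) and device (5)."""
    day = session_start.weekday()  # 0 = Monday ... 6 = Sunday
    hour = session_start.hour      # 0 ... 23
    return one_hot(day, 7) + one_hot(hour, 24) + one_hot(DEVICES.index(device), len(DEVICES))


# Hypothetical session that starts at 08:30 on a mobile device.
features = session_context_features(datetime(2023, 1, 11, 8, 30), "mobile")
```

The device is read once at the start of the session, matching the description above, so the vector stays fixed for the whole session.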

Current approaches have difficulty combining the emotional features of the music with the listener’s personality, because people perceive music genres differently





  • Extroversion
  • Agreeableness
  • Conscientiousness
  • Neuroticism
  • Openness



  • We can precompute and cache item representations for items when they become available in the catalog or for all items at once, as they only depend on item features
  • Twitter is leveraging the Two Tower to combine earlier heuristics under a single umbrella model.
  • Pinterest begins their post by exploiting a characteristic of the Two Tower architecture, the in-batch negative sampling, to generate a high number of negative examples which in return leads to better performance on several key metrics.
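
Two points from the list above, precomputing item representations and in-batch negative sampling, can be sketched without any framework. This is a toy illustration under assumptions: each tower is a single linear layer with hand-picked weights, and all names and data are hypothetical rather than taken from Twitter, Pinterest or Merlin:

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def tower(features, weights):
    """A one-layer linear tower: maps a feature vector to an embedding."""
    return [dot(row, features) for row in weights]


def in_batch_softmax_loss(user_embs, item_embs):
    """Softmax cross-entropy where row i's positive is item i and the
    other items in the batch serve as negatives."""
    total = 0.0
    for i, u in enumerate(user_embs):
        logits = [dot(u, v) for v in item_embs]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(user_embs)


# Hypothetical tower weights and item features.
ITEM_W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
USER_W = [[1.0, 0.0], [0.0, 1.0]]
catalog = {"i1": [1.0, 0.0, 0.0], "i2": [0.0, 1.0, 1.0]}

# Item embeddings depend only on item features, so they can be cached once.
item_cache = {item_id: tower(f, ITEM_W) for item_id, f in catalog.items()}

# A batch of (user, clicked item) pairs: user 0 clicked i1, user 1 clicked i2.
user_embs = [tower([1.0, 0.0], USER_W), tower([0.0, 1.0], USER_W)]
item_embs = [item_cache["i1"], item_cache["i2"]]
loss = in_batch_softmax_loss(user_embs, item_embs)
```

Because every other item in the batch acts as a negative, a batch of size B yields B - 1 negatives per positive pair at no extra sampling cost.
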
Merlin Models relies on the schema object to automatically build all necessary input and output layers.

Rather than relying on the scoring or retrieval models to infer this business logic and to recommend items appropriately, it’s necessary to add a Filtering stage to your recommender system.
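
A filtering stage can be as simple as a function applied to the candidates before they are shown. A minimal sketch, assuming two hypothetical business rules (drop items the user has already seen, keep only items in stock); this is plain Python and does not use the Merlin API:

```python
def filter_candidates(candidates, already_seen, in_stock):
    """Keep candidates that pass the business rules, preserving their order."""
    return [
        item
        for item in candidates
        if item not in already_seen and item in in_stock
    ]


# Hypothetical ranked candidates from the retrieval and scoring stages.
candidates = ["i4", "i2", "i9", "i1"]
already_seen = {"i2"}
in_stock = {"i1", "i2", "i4"}

filtered = filter_candidates(candidates, already_seen, in_stock)
```

Keeping the rules outside the model means they can change without retraining.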

NVIDIA-Merlin
Merlin is a framework providing end-to-end GPU-accelerated recommender systems, from feature engineering to deep learning training and deploying to production


Keep Exploring!!!