AWS re:Invent 2019 – Highlights from Machine Learning Workshops on Day One
This is the first of five articles in the AWS re:Invent 2019 recap series. If you missed any of the posts, you can find them here:
- AWS re:Invent 2019 – Highlights from AWS Product Releases on Day Two
- AWS re:Invent 2019 – Highlights from Machine Learning Workshops on Day Three
- AWS re:Invent 2019 – Highlights from Day Four
Panoramic’s Senior Data Scientist Nik Buenning recaps notable moments from the following machine learning workshops at AWS re:Invent 2019:
- Build a Content-recommendation Engine with Amazon Personalize
- Amazon SageMaker RL: Solving business problems with RL and bandits
Workshop Review I:
Build a Content-recommendation Engine with Amazon Personalize
The first workshop we’ll review focused on Amazon Personalize, a tool for building and training a recommender system that ingests user data and provides relevant product recommendations based on the data. This is similar to the recommendations Amazon makes for consumer products.
Amazon Personalize can personalize the user experience (e.g., “these are the TV shows we think you’ll like”), find related items (e.g., “customers who viewed this item also viewed these items”), and rank recommendations (a smart rank).
The workshop provided Jupyter Notebook instances within Amazon SageMaker that import the data, select the recommender algorithm, train the model, deploy it (ideally to an app or endpoint), and get recommendations (a slide showing a schematic of the workflow appears above). For this workshop, my team and I used a movie rating dataset to make recommendations to each user based on their ratings and the ratings of other users.
Specifically, we explored the Hierarchical Recurrent Neural Network (HRNN) algorithm (or recipe) for the recommendation, which incorporates user ratings (or any other feature) and how they change over time. There are other recipes that a developer can use with Amazon Personalize. HRNN-Metadata lets the developer supply additional metadata about users, though we were warned that it often leads to worse results. HRNN-Coldstart is the preferable recipe when new items are continually being added and need to be included in the recommendations. Popularity-Count is a simple recipe that returns the most popular items in the dataset as a whole and is independent of the user. Personalized-Ranking returns a list of ranked items. And finally, SIMS uses a method (akin to a collaborative filter) to recommend items similar to a given item.
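To make the difference between the two simplest recipes concrete, here is a minimal Python sketch of a popularity count and of an item-to-item co-occurrence ranking. It is only an illustration of the general ideas and not how Amazon Personalize implements these recipes; the toy interaction data and the function names `popularity_count` and `similar_items` are made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy interactions: (user_id, item_id) pairs.
interactions = [
    ("u1", "A"), ("u1", "B"), ("u1", "C"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "B"), ("u3", "C"),
    ("u4", "A"),
]


def _top_k(counts, k):
    # Sort by count (descending), then by item id so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


def popularity_count(interactions, k=2):
    """Return the k most interacted-with items, ignoring who the user is."""
    counts = Counter(item for _, item in interactions)
    return _top_k(counts, k)


def similar_items(interactions, target, k=2):
    """Rank items by how many users interacted with both them and `target`."""
    items_by_user = defaultdict(set)
    for user, item in interactions:
        items_by_user[user].add(item)

    co_counts = Counter()
    for items in items_by_user.values():
        if target in items:
            for other in items - {target}:
                co_counts[other] += 1
    return _top_k(co_counts, k)


print(popularity_count(interactions))    # ['A', 'B']
print(similar_items(interactions, "A"))  # ['B', 'C']
```

The popularity result is the same for every user, while the similarity result depends on the item passed in, which is the distinction the recipe descriptions draw.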
The notebook was designed to get our feet wet so that we could build upon it for our own use cases.
How can Panoramic use Amazon Personalize? Panoramic could use Personalize to make the boards and charts “smart.” We can see which features, charts, or boards users engage with most, and make recommendations based on a combination of their personal platform use and overall engagement data. We could also use Personalize to tailor our email notifications based on how users interact with certain features within the email.
Workshop Review II:
Amazon SageMaker RL: Solving business problems with RL and bandits
This workshop provided real-world use cases for Reinforcement Learning (RL) algorithms. When most people hear the term RL, they think of a machine learning model that is designed to play Go or Atari games. However, RL algorithms can also be used to solve real-world problems.
This workshop focused on using a Bandits model to find the best action for a given user context. This is an alternative to a more traditional A/B Test and is in many ways preferable, as it efficiently weeds out actions that give sub-optimal results (see image to the left).
We worked on another Jupyter Notebook within Amazon SageMaker at this workshop. This notebook uses toy data to find the best action for a given set of feature values (the “state” in RL). The notebook uses the “Contextual Bandits” RL algorithm for this particular example; however, it’s worth noting that more sophisticated RL algorithms are also available.
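The idea of learning the best action per context can be sketched in a few lines of Python. This is a simple tabular epsilon-greedy bandit, which I am swapping in for illustration; it is not the algorithm used in the workshop notebook. The contexts, actions, reward rates, and class name below are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical environment: probability of a reward of 1 per (context, action).
TRUE_REWARD_RATE = {
    "mobile": {"short": 0.8, "long": 0.2},
    "desktop": {"short": 0.3, "long": 0.7},
}
ACTIONS = ["short", "long"]


class EpsilonGreedyContextualBandit:
    """Tabular bandit that keeps a running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, action) -> times chosen
        self.values = defaultdict(float)  # (context, action) -> mean reward

    def best_action(self, context):
        # Greedy choice: highest estimated mean reward for this context.
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(context)

    def update(self, context, action, reward):
        # Incremental update of the running mean.
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def simulate(rounds=10000, seed=42):
    env = random.Random(seed)
    bandit = EpsilonGreedyContextualBandit(ACTIONS, epsilon=0.2, seed=seed)
    for _ in range(rounds):
        context = env.choice(list(TRUE_REWARD_RATE))
        action = bandit.choose(context)
        hit = env.random() < TRUE_REWARD_RATE[context][action]
        bandit.update(context, action, 1.0 if hit else 0.0)
    return bandit


bandit = simulate()
for context in TRUE_REWARD_RATE:
    print(context, "->", bandit.best_action(context))
```

Unlike a fixed 50/50 A/B split, most traffic here shifts toward the action that currently looks best for each context, with a small share reserved for exploration.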
The second half of the workshop shifted focus from the machine learning aspect of the notebook to ETL pipeline workflows that use the built-in Vowpal Wabbit (VW) container to train and deploy the RL models. This pipeline uses a variety of AWS services (e.g., S3, SageMaker, and Athena). In general, the user interacts with the SageMaker endpoint, and the sets of states, actions, and rewards are stored in S3. After enough data is collected, a new model is trained. This new model is evaluated (based on the negative of a cost function) and compared to the previous model. If the new model has a better score, it replaces the previous version. A more detailed schematic of the workflow is shown below.
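The train, evaluate, and replace loop described here can be sketched in plain Python. This is a rough illustration under stated assumptions, not the SageMaker RL pipeline itself: there are no AWS calls, the "model" is just a lookup table from state to action, rewards are assumed to lie in [0, 1], and the cost is a crude estimate computed only on logged records where the model agrees with the logged action. All names and data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Experience = Tuple[str, str, float]  # (state, action, reward)


@dataclass
class Model:
    version: int
    policy: Dict[str, str]  # state -> action


def train(experiences: List[Experience], version: int) -> Model:
    """For each state, pick the action with the highest mean logged reward."""
    totals: Dict[Tuple[str, str], Tuple[float, int]] = {}
    for state, action, reward in experiences:
        total, n = totals.get((state, action), (0.0, 0))
        totals[(state, action)] = (total + reward, n + 1)

    best: Dict[str, Tuple[str, float]] = {}
    for (state, action), (total, n) in totals.items():
        mean = total / n
        if state not in best or mean > best[state][1]:
            best[state] = (action, mean)
    return Model(version, {s: a for s, (a, _) in best.items()})


def score(model: Model, eval_set: List[Experience]) -> float:
    """Negative of the mean cost (1 - reward) on records matching the policy."""
    matched = [r for s, a, r in eval_set if model.policy.get(s) == a]
    if not matched:
        return float("-inf")  # nothing to judge the model on
    cost = sum(1.0 - r for r in matched) / len(matched)
    return -cost


def maybe_promote(current: Model, candidate: Model,
                  eval_set: List[Experience]) -> Model:
    """Keep the current model unless the candidate scores strictly better."""
    if score(candidate, eval_set) > score(current, eval_set):
        return candidate
    return current


# Hypothetical logged data; in the workshop this would come from S3.
train_log = [
    ("mobile", "short", 1.0), ("mobile", "short", 1.0), ("mobile", "long", 0.0),
    ("desktop", "long", 1.0), ("desktop", "short", 0.0),
]
eval_log = [
    ("mobile", "short", 1.0), ("mobile", "long", 0.0), ("desktop", "long", 1.0),
]

old = Model(version=1, policy={"mobile": "long", "desktop": "long"})
new = train(train_log, version=2)
deployed = maybe_promote(old, new, eval_log)
print(deployed.version, deployed.policy)
```

Keeping the evaluation records separate from the training records is a design choice in this sketch, so the comparison between model versions is not made on the same data the new model was fit to.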
How can Panoramic use Amazon SageMaker RL? We could use Amazon SageMaker RL anytime we want to perform an A/B Test within our platform. For example, we have recently implemented several sets of email notifications that include interactive links. We may want to try different templates to test which format or text of the email leads to the best engagement rate. Amazon SageMaker RL could help us personalize custom emails for users who receive our email alerts. Similar testing could also be applied to features within the Panoramic platform.
Stay tuned for more workshop reviews from AWS re:Invent 2019 in the days to come.