AWS re:Invent 2019 – Highlights from AWS product releases on Day Two
This is the second of five articles in the AWS re:Invent 2019 recap series. If you missed any of the posts, you can find them here:
- AWS re:Invent 2019 – Highlights from Machine Learning Workshops on Day One
- AWS re:Invent 2019 – Highlights from AWS Product Releases on Day Three
- AWS re:Invent 2019 – Highlights from Day Four
Panoramic’s Senior Data Scientist Nik Buenning provides a glimpse into day two of AWS re:Invent 2019 highlighting:
- Amazon Web Services (AWS) CEO Andy Jassy’s keynote presentation on notable product releases
- TensorFlow Workshop: Distributed training, tuning, and inference with TensorFlow in Amazon SageMaker
Amazon Web Services (AWS) CEO Andy Jassy’s Keynote Review
Day two of AWS re:Invent started with the first keynote address by AWS CEO Andy Jassy. During the presentation, there were a number of new services that were announced. Below is a list of some of the services that caught my attention:
- Graviton2 instances, specifically M6g, R6g, and C6g, offer up to 40% better price/performance than comparable x86-based instances.
- Inf1 instances, which are designed to handle high volumes of machine learning inference computations.
- The announcement of Amazon Fargate for Amazon EKS seemed to get the most cheers from the crowd. Companies will no longer need to create and manage EC2 instances for their Amazon EKS clusters. Simply put, customers no longer need to be Kubernetes experts.
- Amazon Managed Apache Cassandra Service (MCS) allows customers to use Cassandra clusters without the usual headaches of setting up, configuring, and maintaining the nodes.
- Amazon SageMaker Studio appears to be an impressive IDE for the development of machine learning models.
- Amazon SageMaker Autopilot automatically finds the best machine learning algorithm and hyperparameters for a given dataset. The service also produces Python code if the customer wants to perform additional tuning.
- Amazon SageMaker Debugger, as the name suggests, helps users debug their machine learning model code.
- Amazon CodeGuru automatically reviews code and flags lines that are computationally expensive.
- Amazon Fraud Detector creates a fraud detection model for users with just a few clicks and without any machine learning knowledge.
- Amazon Kendra uses natural language processing to search a company’s website and applications to help find the right data and information across all of the content.
As previously mentioned, this is only a snippet of the full list of AWS services announced during Jassy’s keynote, with the SageMaker family being my personal favorite. I am excited to implement some of these new services and components while developing new data science tools within the Panoramic platform.
Distributed training, tuning, and inference with TensorFlow in Amazon SageMaker
This workshop explored three different labs to better understand how to use TensorFlow, one of the most widely used machine learning libraries, within SageMaker. With SageMaker, you can train, tune, and deploy TensorFlow models on one or many compute nodes (instances).
Lab 1. Sentiment Analysis
This lab built a sentiment analysis model for IMDB movie reviews using a convolutional neural network. Along the way, we learned:
- How to use Script Mode along with a prebuilt TensorFlow container for model training.
- How to use Local Mode for model training, which allows code testing before starting a larger, full-scale training job.
- Offline and asynchronous machine learning prediction/inference of large batches of data.
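Script Mode means handing SageMaker an ordinary Python training script, which it runs inside the prebuilt TensorFlow container. A minimal sketch of such an entry point is below; the `SM_*` environment variables and the `--model_dir` argument follow the SageMaker container convention, while the hyperparameter names (`--epochs`, `--batch-size`) are illustrative choices, and the actual model-building code is omitted.

```python
# train.py -- sketch of a SageMaker Script Mode entry point.
import argparse
import os

def parse_args(argv=None):
    p = argparse.ArgumentParser()
    # Hyperparameters set on the SageMaker estimator arrive as CLI args
    # (names here are illustrative).
    p.add_argument("--epochs", type=int, default=10)
    p.add_argument("--batch-size", type=int, default=128)
    # Paths the SageMaker container supplies via arguments/env vars;
    # the defaults mirror the container's standard locations.
    p.add_argument("--model_dir",
                   default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    p.add_argument("--train",
                   default=os.environ.get("SM_CHANNEL_TRAIN",
                                          "/opt/ml/input/data/train"))
    return p.parse_args(argv)

if __name__ == "__main__":
    args = parse_args()
    # ... build and train the Keras/TensorFlow model here, then save it
    # under args.model_dir so SageMaker can package and deploy it ...
```

The same script runs unchanged in Local Mode (training on the notebook instance itself) and in a full hosted training job, which is what makes the quick-iteration workflow from this lab possible.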
Lab 2. Boston Housing Market
This lab uses the publicly available Boston Housing dataset to predict housing prices from 13 features, such as the average number of rooms, accessibility to radial highways, and adjacency to the Charles River. The machine learning workflow for this lab involves both local and hosted training in SageMaker, as well as both local and hosted inference with a real-time endpoint. The lab also uses eager execution, the default in TensorFlow 2, which executes each operation immediately rather than building a static computational/symbolic graph. Finally, the lab demonstrates hyperparameter tuning via Bayesian methods, though other methods are available within SageMaker.
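To see what a tuning job is doing conceptually, here is a toy, pure-Python random-search sketch (no SageMaker involved); the objective function and hyperparameter ranges are invented for illustration. SageMaker's Bayesian tuning goes a step further by fitting a surrogate model to past trials to choose the next candidate, rather than sampling blindly.

```python
import random

# Hypothetical objective: pretend this is the validation RMSE of a
# housing-price model as a function of two hyperparameters, with an
# invented sweet spot at learning_rate=0.01, batch_size=64.
def validation_rmse(learning_rate, batch_size):
    return abs(learning_rate - 0.01) * 100 + abs(batch_size - 64) / 64

def random_search(n_trials=50, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)           # log-uniform over [1e-4, 1e-1]
        bs = rng.choice([16, 32, 64, 128, 256])  # categorical choice
        score = validation_rmse(lr, bs)
        if best is None or score < best[0]:
            best = (score, {"learning_rate": lr, "batch_size": bs})
    return best

score, params = random_search()
print(params)  # best hyperparameters found across the 50 trials
```

In SageMaker the loop above is replaced by a managed tuning job: you declare the ranges, and each trial runs as its own training job.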
Lab 3. Image Classifier and Distributed Training
The purpose of lab 3 was to perform distributed training on a cluster of multiple machines. Amazon SageMaker makes it easy to manage a cluster for model training, taking care of the cluster setup and tear down.
In this lab we used the CIFAR-10 dataset to build a convolutional neural network to classify images. This dataset consists of 60,000 images that fall into ten different classes.
We explored two options for the distributed training:
- Parameter Servers: processes that receive asynchronous updates from worker nodes and distribute updated gradients to all workers.
- Horovod: a framework based on Ring-AllReduce, wherein worker nodes synchronously exchange gradient updates, each communicating only with its two neighbors in the ring.
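The Ring-AllReduce communication pattern can be simulated in a few lines of pure Python. In this sketch (my own illustration, not Horovod code), each worker's gradient vector is pre-split into one chunk per worker; chunks circulate around the ring in two phases, and each worker only ever talks to its neighbors, which is what keeps per-worker bandwidth constant as the cluster grows.

```python
def ring_allreduce(grads):
    """Sum equal-length gradient vectors across workers using a ring.

    grads: one gradient list per worker, each pre-split into n chunks
    (here a single scalar per chunk for simplicity).
    """
    n = len(grads)                   # number of workers == number of chunks
    data = [list(g) for g in grads]  # each worker's local copy
    # Phase 1: scatter-reduce. Each step, every worker sends one chunk to
    # its clockwise neighbor, which adds it to its own copy. After n-1
    # steps, worker w holds the fully summed chunk (w + 1) % n.
    for step in range(n - 1):
        sends = [((w + 1) % n, (w - step) % n, data[w][(w - step) % n])
                 for w in range(n)]
        for dest, chunk, value in sends:
            data[dest][chunk] += value
    # Phase 2: all-gather. The completed chunks circulate around the ring,
    # overwriting stale copies, until every worker has the full sum.
    for step in range(n - 1):
        sends = [((w + 1) % n, (w + 1 - step) % n, data[w][(w + 1 - step) % n])
                 for w in range(n)]
        for dest, chunk, value in sends:
            data[dest][chunk] = value
    return data

# Three workers, each holding a 3-element gradient:
print(ring_allreduce([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
# every worker ends up with the elementwise sum [6, 6, 6]
```

Dividing the final sums by the number of workers gives the averaged gradient. Real Horovod performs these exchanges with MPI/NCCL on GPU tensors; the simulation only shows the data movement.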
Which approach works best for distributed training is an ongoing debate, which the workshop discussed briefly. The image classifier’s bias and accuracy are shown above for both training and testing data when training with Parameter Servers. Note that the training curves (blue) are much smoother than the validation curves (orange). This is due to the relatively small size of the validation dataset (i.e., it is not as representative as the training dataset).
How could Panoramic use TensorFlow in Amazon SageMaker? TensorFlow is one of the most popular (Python) libraries for building neural networks, and we have already used TensorFlow in the past. However, we don’t typically use it in tandem with SageMaker. I particularly liked how easy it was to deploy a TensorFlow model and get endpoints that can be used for API calls. We could use the tools and methods from these labs to deploy some of our TensorFlow neural networks (e.g., our chatbot).
Stay tuned for more reviews from AWS re:Invent 2019 in the days to come.