As part of the flurry of announcements coming this week out of AWS re:Invent, Amazon announced the release of Amazon SageMaker Operators for Kubernetes, a way for data scientists and developers to simplify training, tuning and deploying containerized machine learning models.
Packaging machine learning models in containers can help organizations put them to work faster, but getting there often requires a lot of extra management. Amazon SageMaker Operators for Kubernetes is supposed to make it easier to run and manage those containers, the underlying infrastructure needed to run the models, and the workflows associated with all of it.
“While Kubernetes gives customers control and portability, running ML workloads on a Kubernetes cluster brings unique challenges. For example, the underlying infrastructure requires additional management such as optimizing for utilization, cost and performance; complying with appropriate security and regulatory requirements; and ensuring high availability and reliability,” AWS’ Aditya Bindal wrote in a blog post introducing the new feature.
When you combine that with the workflows involved in delivering a machine learning model at scale inside an organization, it becomes part of a much bigger delivery pipeline, one that is challenging to manage across departments with widely varying resource requirements.
This is precisely what Amazon SageMaker Operators for Kubernetes has been designed to help DevOps teams do. “Amazon SageMaker Operators for Kubernetes bridges this gap, and customers are now spared all the heavy lifting of integrating their Amazon SageMaker and Kubernetes workflows. Starting today, customers using Kubernetes can make a simple call to Amazon SageMaker, a modular and fully-managed service that makes it easier to build, train, and deploy machine learning (ML) models at scale,” Bindal wrote.
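To make that "simple call" concrete, here is a rough sketch of what submitting a training job through the operator might look like from the Kubernetes side, using the Python Kubernetes client. This is an illustrative example only: the API group, resource kind, and spec field names shown here are assumptions based on how Kubernetes operators typically work, and may not match the operator's published CRD schema exactly.

```python
# Illustrative sketch: declaring a SageMaker training job as a Kubernetes
# custom resource, which the operator then runs on SageMaker-managed compute.
# The API group/version, kind, and spec fields below are assumptions, not the
# operator's confirmed schema; <...> values are placeholders.
from kubernetes import client, config

config.load_kube_config()  # use the current cluster credentials

training_job = {
    "apiVersion": "sagemaker.aws.amazon.com/v1",  # assumed API group/version
    "kind": "TrainingJob",                        # assumed resource kind
    "metadata": {"name": "xgboost-example"},
    "spec": {
        "region": "us-east-1",
        "roleArn": "<sagemaker-execution-role-arn>",      # placeholder
        "algorithmSpecification": {
            "trainingImage": "<ecr-training-image-uri>",  # placeholder
            "trainingInputMode": "File",
        },
        "resourceConfig": {
            "instanceType": "ml.m5.xlarge",
            "instanceCount": 1,
            "volumeSizeInGB": 10,
        },
        "stoppingCondition": {"maxRuntimeInSeconds": 3600},
        "outputDataConfig": {"s3OutputPath": "s3://<bucket>/output"},
    },
}

# Creating the custom resource hands the job to the operator, which calls
# SageMaker on the cluster's behalf; training compute lives in SageMaker,
# not on the Kubernetes nodes.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="sagemaker.aws.amazon.com",
    version="v1",
    namespace="default",
    plural="trainingjobs",
    body=training_job,
)
```

The idea is that once the custom resource is created, the operator takes over: it calls SageMaker to run the job on managed infrastructure and reflects the job's status back into the cluster, so teams can keep using kubectl and their existing Kubernetes tooling rather than wiring up a separate SageMaker workflow.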
The promise of Kubernetes is that it can orchestrate the delivery of containers at the right moment, but if you haven't automated delivery of the underlying infrastructure, you can over- or under-provision and fail to give a job the resources it actually needs. That's where this new tool, combined with SageMaker, can help.
“With workflows in Amazon SageMaker, compute resources are pre-configured and optimized, only provisioned when requested, scaled as needed, and shut down automatically when jobs complete, offering near 100% utilization,” Bindal wrote.
Amazon SageMaker Operators for Kubernetes are available today in select AWS regions.