AWS Machine Learning: Part 2: SageMaker | Nitor Infotech
Artificial intelligence | 20 Jul 2022 |   11 min

AWS Machine Learning: Part 2: SageMaker


In my previous blog, we broadly understood what AWS is and how it provides machine learning as a service. These services give us a lot of flexibility to scale computational resources up or down whenever required. One of the most important of these services is AWS SageMaker. It gives developers and data scientists the ability to build, train, test, and deploy machine learning models into production with minimal effort and at lower cost.

In today’s blog, allow me to acquaint you with AWS SageMaker.

Introduction to SageMaker

It is a fully managed, cloud-based machine learning service, and it is made up of three different capabilities: Build, Train, and Deploy.

Let’s go through each of them one by one in brief.

Build

In SageMaker, we don’t have to worry about installing Python or Anaconda, as everything is taken care of by AWS. We can use all our favorite algorithms in this environment. SageMaker also supports preconfigured, optimized versions of popular frameworks like TensorFlow and PyTorch. We can also bring custom algorithms and train them on SageMaker.

Train

One of the biggest advantages of SageMaker is that it supports distributed training across one or more instances. It manages all the compute resources, which, if needed, can scale to petabytes of data. After training, SageMaker saves our model artifacts in an S3 bucket.
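Under the hood, a training run is described by a single request to the `CreateTrainingJob` API. Here is a minimal sketch of that request built as a plain Python dict; every ARN, image URI, and bucket name below is a placeholder, not a real resource:

```python
# Sketch of a CreateTrainingJob request body, as it would be passed to
# boto3's SageMaker client. All names, ARNs, and URIs are placeholders.
training_request = {
    "TrainingJobName": "demo-training-job",
    "AlgorithmSpecification": {
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/demo:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/DemoSageMakerRole",
    "InputDataConfig": [{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://demo-bucket/train/",
        }},
    }],
    # After training, SageMaker writes the model artifacts here.
    "OutputDataConfig": {"S3OutputPath": "s3://demo-bucket/output/"},
    # Distributed training: raise InstanceCount to spread work
    # across multiple instances.
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 2,
        "VolumeSizeInGB": 50,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
```

With boto3, this dict would be unpacked into `client("sagemaker").create_training_job(**training_request)`.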

Deploy

To get predictions, we can use real-time inference or batch inference. Real-time inference is ideal for workloads with interactive, low-latency requirements; the endpoints are fully managed and support autoscaling. For non-interactive use cases, we can use batch transform. With batch predictions, we are not concerned with a real-time endpoint or latency, and SageMaker manages all the resources required for batch inference.
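The two inference modes map to different API request shapes. A hedged sketch of both, with model, endpoint, and bucket names as placeholders:

```python
# Real-time inference: an endpoint config with a fully managed
# production variant behind a persistent endpoint (placeholder names).
endpoint_config = {
    "EndpointConfigName": "demo-endpoint-config",
    "ProductionVariants": [{
        "VariantName": "primary",
        "ModelName": "demo-model",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
    }],
}

# Batch inference: a transform job that reads input from S3, scores it
# offline, and writes results back to S3 -- no persistent endpoint.
transform_job = {
    "TransformJobName": "demo-batch-job",
    "ModelName": "demo-model",
    "TransformInput": {"DataSource": {"S3DataSource": {
        "S3DataType": "S3Prefix",
        "S3Uri": "s3://demo-bucket/batch-input/",
    }}},
    "TransformOutput": {"S3OutputPath": "s3://demo-bucket/batch-output/"},
    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
}
```

Both shapes reference the same trained model; only the serving pattern differs.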

Instance Type and Pricing

An instance is basically a virtual server in the cloud. An Amazon SageMaker notebook instance is an ML compute instance running the Jupyter Notebook app. SageMaker offers instances in four families:

  • Standard
  • Compute Optimized
  • Accelerated Computing
  • Inference Acceleration

The Standard family offers the lowest-cost instances with balanced CPU and memory performance. Examples are T2, T3, and M5.

T-type instances are good for notebooks and development systems. M-type instances are larger and suitable for CPU-intensive model training and hosting.

The Compute Optimized family uses latest-generation CPUs with higher performance. Examples are C4 and C5. These are suitable for CPU-intensive model training and hosting.

The Accelerated Computing family features powerful GPUs. Examples are P2 and P3. These instances are priced higher than the other families; however, algorithms tuned for GPUs train much faster on them.

Inference Acceleration differs from the three families above: these accelerators are attached to instances from other families to speed up inference.

It is recommended to choose an instance type based on the algorithms we are using. For example, scikit-learn machine learning algorithms are CPU-bound, so we can use instances optimized for CPUs. The best approach is to choose a family first and then experiment with various instance sizes.
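That advice can be written down as a small heuristic. The mapping below is purely illustrative, a starting point for experimentation rather than an official recommendation:

```python
def suggest_instance(workload: str) -> str:
    """Illustrative heuristic: map a workload type to a starting
    instance. The specific sizes are examples to iterate from."""
    table = {
        "notebook": "ml.t3.medium",       # Standard: cheap dev boxes
        "cpu-training": "ml.c5.xlarge",   # Compute Optimized
        "gpu-training": "ml.p3.2xlarge",  # Accelerated Computing
    }
    # Balanced Standard-family default for anything else.
    return table.get(workload, "ml.m5.xlarge")

print(suggest_instance("cpu-training"))  # -> ml.c5.xlarge
```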

SageMaker Pricing Components

SageMaker pricing has multiple components. It varies with the instance type and size we choose, the storage allocated to the instance, any fractional GPUs attached, and data transfer in and out of the instance. Pricing also varies slightly across AWS regions. Here is the link to the pricing. The major components in development, training, and inference are:

  • Instance and fractional GPUs – Prices are hourly and vary based on the instance type you choose; an additional variable cost applies if you attach fractional GPUs.
  • Storage – Storage cost covers the disk space attached to the instance; as of now, it is 14 cents per GB per month.
  • Data transfer – Data transfer in and out of the instance is priced at 1.6 cents per GB.
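Putting the three components together, a rough monthly estimate is simple arithmetic. The hourly rate below is a made-up example (check the pricing page for real numbers); the per-GB rates match the figures above:

```python
def monthly_cost(hourly_rate, hours, storage_gb, transfer_gb):
    """Rough monthly estimate from the three pricing components.
    hourly_rate is hypothetical; per-GB rates are from the text above."""
    instance = hourly_rate * hours   # instance hours
    storage = storage_gb * 0.14      # $0.14 per GB per month
    transfer = transfer_gb * 0.016   # $0.016 per GB transferred
    return round(instance + storage + transfer, 2)

# e.g. a $0.10/hr instance running 160 hours, 20 GB disk, 50 GB transfer
print(monthly_cost(0.10, 160, 20, 50))  # -> 19.6
```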

Now that we have understood what SageMaker is and how the pricing works, let’s take a look at the algorithms and frameworks that are available to us.

Algorithms and Frameworks

SageMaker gives us four options for training and hosting models:

  • Built-in Algorithms- They are the easiest to use and we can scale up easily as they are optimized for AWS cloud.
  • Pre-built Container Images- These container images support popular ML frameworks like TensorFlow, PyTorch, Scikit-learn, and so forth.
  • Extend Pre-built Container Images- We can extend container images and modify them according to our needs. These are for advanced users who contribute to the frameworks.
  • Custom Container Images- We can bring in our own model container image and host it on SageMaker.

Built-in Algorithms

These are algorithms provided by SageMaker, highly optimized for the AWS Cloud and easy to scale. They are:

  • Supervised Machine learning Algorithms (Regression/Classification)

K-Nearest Neighbors, Linear Learner, XGBoost, LightGBM, CatBoost

  • Time Series Forecasting (Regression)

DeepAR

  • Computer Vision

Image Classification, Semantic Segmentation

  • Natural Language Processing

Sequence-to-Sequence, BlazingText, Object2Vec

  • Unsupervised Machine Learning Algorithms

K-Means, LDA, Neural Topic Model, Principal Component Analysis, Random Cut Forest, IP Insights

  • Recommendation Systems

Factorization Machines

You can get details about these algorithms here.
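Built-in algorithms are configured through the same training request shape discussed earlier; only the container image and hyperparameters change. A sketch for the built-in XGBoost algorithm (the image URI is a placeholder, and note that the API expects all hyperparameter values as strings):

```python
# Fragment of a CreateTrainingJob request for the built-in XGBoost
# algorithm. The TrainingImage URI is region-specific; placeholder here.
xgboost_config = {
    "AlgorithmSpecification": {
        "TrainingImage": "<xgboost-image-uri-for-your-region>",
        "TrainingInputMode": "File",
    },
    # Hyperparameter values must be strings, even for numbers.
    "HyperParameters": {
        "objective": "binary:logistic",  # binary classification
        "num_round": "100",              # boosting rounds
        "max_depth": "5",
        "eta": "0.2",
    },
}

assert all(isinstance(v, str) for v in xgboost_config["HyperParameters"].values())
```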

How to train your model

We can train our model in AWS SageMaker in multiple ways.

1. AWS CLI

2. SageMaker Console

3. Model Coding

We can use any of the three ways to build our model. We can code the model from end to end, use the AWS command line tool to pass the commands, or use the SageMaker console to train our model.

Cost Saving Tips

Here are a few dos and don’ts when it comes to saving costs:

  • Set up a budget, monitor it, and set up alert notifications.
  • Choose the right compute resource for developing the model.
  • Don’t let your instances keep running; stop when you are not using them.
  • Use spot instances for training.
  • Use batch inference whenever there is no need for real-time inference, and terminate real-time endpoints when the job is done.
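The spot-instance tip from the list above amounts to a pair of flags on the training request. A sketch under the same placeholder conventions as before: `MaxWaitTimeInSeconds` must be at least `MaxRuntimeInSeconds`, and a checkpoint path lets an interrupted job resume:

```python
# Fields that enable managed spot training on a CreateTrainingJob
# request (bucket name is a placeholder).
spot_settings = {
    "EnableManagedSpotTraining": True,
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 3600,   # cap on billed training time
        "MaxWaitTimeInSeconds": 7200,  # total wait, incl. spot delays
    },
    # Checkpoints let SageMaker resume if spot capacity is reclaimed.
    "CheckpointConfig": {"S3Uri": "s3://demo-bucket/checkpoints/"},
}

sc = spot_settings["StoppingCondition"]
assert sc["MaxWaitTimeInSeconds"] >= sc["MaxRuntimeInSeconds"]
```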

There is a lot involved in SageMaker, and it can feel overwhelming, but if we understand our data, we will be able to figure out what we need from it. Understanding the pricing and how it works will also benefit us in the long run.

In the next blog, we will try to understand how we can programmatically code in SageMaker. Till then, stay tuned and visit us at Nitor Infotech to learn more about what we do in the technology realm.


Sambid Pradhan

Senior Software Engineer

Sambid is an AI enthusiast and is currently working with Nitor Infotech as a Senior Software Engineer in the AI/ML team. He has extensive experience working on Machine Learning, Computer Vision, and NLP projects. In his free time, he loves to connect and is always curious to understand the bridge between the real world and the world of data science.

   
