The process of setting up and configuring a distributed training environment can be complex, requiring expertise in server management, cluster configuration, networking and distributed computing. To simplify infrastructure setup and accelerate distributed training, AWS introduced Amazon SageMaker HyperPod in late 2023.
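As a rough illustration of how much of that setup HyperPod absorbs, the sketch below creates a cluster with a single boto3 call. This is a minimal sketch: the cluster name, instance group layout, lifecycle-script S3 path, and IAM role ARN are all placeholder assumptions, not values from this post.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Create a HyperPod cluster with a single worker instance group.
# All names, counts, and ARNs below are hypothetical placeholders.
response = sagemaker.create_cluster(
    ClusterName="demo-hyperpod-cluster",
    InstanceGroups=[
        {
            "InstanceGroupName": "worker-group-1",
            "InstanceType": "ml.g5.8xlarge",
            "InstanceCount": 4,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/lifecycle-scripts/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/HyperPodExecutionRole",
        }
    ],
)
print(response["ClusterArn"])
```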
Standards and expectations are rapidly changing, especially with regard to the types of technology used in data science projects. Most data scientists use some form of DevOps interface these days, and there are important nuances for data scientists using Kubernetes. Why Serverless in Kubernetes?
Despite the availability of advanced distributed training libraries, it’s common for training and inference jobs to need hundreds of accelerators (GPUs or purpose-built ML chips such as AWS Trainium and AWS Inferentia), and therefore tens or hundreds of instances.
It allows data scientists and machine learning engineers to interact with their data and models and to visualize and share their work with others with just a few clicks. SageMaker Canvas also integrates with Data Wrangler, which helps with creating data flows and preparing and analyzing your data.
The Hadoop environment was hosted on Amazon Elastic Compute Cloud (Amazon EC2) servers, managed in-house by Rocket’s technology team, while the data science experience infrastructure was hosted on premises. Communication between the two systems was established through Kerberized Apache Livy (HTTPS) connections over AWS PrivateLink.
We walk through the journey Octus took from managing multiple cloud providers and costly GPU instances to implementing a streamlined, cost-effective solution using AWS services including Amazon Bedrock, AWS Fargate, and Amazon OpenSearch Service. The move also simplified operations, since Octus already runs largely on AWS.
Amazon SageMaker supports geospatial machine learning (ML) capabilities, allowing data scientists and ML engineers to build, train, and deploy ML models using geospatial data. We use the purpose-built geospatial container with SageMaker Processing jobs for a simplified, managed experience to create and run a cluster.
SageMaker HyperPod recipes help data scientists and developers of all skill levels get started training and fine-tuning popular publicly available generative AI models in minutes with state-of-the-art training performance. The launcher interfaces with your cluster through Slurm or Kubernetes native constructs.
It allows data scientists to build models that can automate specific tasks, and it runs on anything from a single machine to a large cluster. Spark is a general-purpose distributed data processing engine that can handle large volumes of data for applications like data analysis, fraud detection, and machine learning.
Tens of thousands of AWS customers use AWS machine learning (ML) services to accelerate their ML development with fully managed infrastructure and tools. We demonstrate how two different personas, a data scientist and an MLOps engineer, can collaborate to lift and shift hundreds of legacy models.
Amazon SageMaker HyperPod is purpose-built to accelerate foundation model (FM) training, removing the undifferentiated heavy lifting involved in managing and optimizing a large training compute cluster. In this solution, HyperPod cluster instances use the LDAPS protocol to connect to AWS Managed Microsoft AD via a Network Load Balancer (NLB).
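To make that connection path concrete, here is a minimal sketch of an LDAPS bind such as a cluster node might perform, using the open-source ldap3 library. The NLB DNS name, bind DN, and credential handling are assumptions for illustration, not details from this post.

```python
from ldap3 import Server, Connection, ALL

# Hypothetical NLB endpoint fronting AWS Managed Microsoft AD over LDAPS (TCP 636).
LDAPS_ENDPOINT = "ldap.example.internal"

server = Server(LDAPS_ENDPOINT, port=636, use_ssl=True, get_info=ALL)

# Bind with a placeholder service account; a real deployment would pull
# credentials from a secrets store rather than hardcoding them.
conn = Connection(
    server,
    user="CN=svc-hyperpod,OU=Users,DC=example,DC=internal",
    password="REPLACE_ME",
    auto_bind=True,
)
print(conn.extend.standard.who_am_i())
```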
The compute clusters used in these scenarios are composed of thousands of AI accelerators such as GPUs or AWS Trainium and AWS Inferentia, custom machine learning (ML) chips designed by Amazon Web Services (AWS) to accelerate deep learning workloads in the cloud.
This is a guest post co-authored with Ville Tuulos (Co-founder and CEO) and Eddie Mattia (Data Scientist) of Outerbounds. For AWS and Outerbounds customers, the goal is to build a differentiated machine learning and artificial intelligence (ML/AI) system and reliably improve it over time.
Orchestrate with Tecton-managed EMR clusters – After features are deployed, Tecton automatically creates the scheduling, provisioning, and orchestration needed for pipelines that can run on Amazon EMR compute engines. You can view and create EMR clusters directly through the SageMaker notebook.
Launching a machine learning (ML) training cluster with Amazon SageMaker training jobs is a seamless process that begins with a straightforward API call, AWS Command Line Interface (AWS CLI) command, or AWS SDK interaction. Surya Kari is a Senior Generative AI Data Scientist at AWS.
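As an illustration of that straightforward API call, here is a minimal sketch using the SageMaker Python SDK. The training script name, IAM role ARN, and instance settings are placeholder assumptions.

```python
from sagemaker.pytorch import PyTorch

# Hypothetical training script and execution role; swap in your own values.
estimator = PyTorch(
    entry_point="train.py",  # your training script
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=2,        # scale out across instances
    instance_type="ml.g5.2xlarge",
    framework_version="2.1",
    py_version="py310",
)

# A single call provisions the training cluster, runs the job,
# and tears the infrastructure down when the job completes.
estimator.fit({"training": "s3://my-bucket/training-data/"})
```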
A challenge for DevOps engineers is the additional complexity that comes from using Kubernetes to manage the deployment stage while resorting to other tools (such as the AWS SDK or AWS CloudFormation) to manage the model building pipeline. Prerequisites include kubectl for working with Kubernetes clusters and eksctl for working with EKS clusters.
Llama 2 by Meta is an example of an LLM offered by AWS. To learn more about Llama 2 on AWS, refer to Llama 2 foundation models from Meta are now available in Amazon SageMaker JumpStart. Llama 2 launched in the US East (N. Virginia) and US West (Oregon) AWS Regions, with general availability most recently announced in the US East (Ohio) Region.
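For reference, deploying a JumpStart model such as Llama 2 typically takes only a few lines with the SageMaker Python SDK. The model ID below is an assumption based on JumpStart's naming conventions and may differ by version; check the JumpStart catalog for the exact ID.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Hypothetical JumpStart model ID for a Llama 2 variant; Meta's license
# requires accepting the EULA before deployment.
model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")
predictor = model.deploy(accept_eula=True)

response = predictor.predict(
    {"inputs": "Explain Amazon SageMaker JumpStart in one sentence."}
)
print(response)
```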
You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) within a single visual interface.
We recently announced the general availability of cross-account sharing of Amazon SageMaker Model Registry using AWS Resource Access Manager (AWS RAM), making it easier to securely share and discover machine learning (ML) models across your AWS accounts.
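A hedged sketch of what that sharing looks like through boto3's RAM client follows; the model package group ARN, share name, and consumer account ID are placeholders, not values from the announcement.

```python
import boto3

ram = boto3.client("ram")

# Share a SageMaker model package group with another AWS account.
# The resource ARN and account ID below are hypothetical.
share = ram.create_resource_share(
    name="model-registry-share",
    resourceArns=[
        "arn:aws:sagemaker:us-east-1:111122223333:model-package-group/my-models"
    ],
    principals=["444455556666"],  # consumer account ID
)
print(share["resourceShare"]["resourceShareArn"])
```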
Seamless integration with SageMaker – As a built-in feature of the SageMaker platform, the EMR Serverless integration provides a unified and intuitive experience for data scientists and engineers. This same interface is also used for provisioning EMR clusters. The following diagram illustrates this solution.
Multiple users such as ML researchers, software engineers, data scientists, and cluster administrators can work concurrently on the same cluster, each managing their own jobs and files without interfering with others. This blog post specifically applies to HyperPod clusters using Slurm as the orchestrator.
Our unstructured data comes from the Amazon EC2 User Guide for Linux Instances and Amazon EC2 Instance Types documentation, and the structured data is derived from the EC2 Instance On-Demand Pricing for the US East (N. Virginia) AWS Region.
However, working with data in the cloud can present challenges, such as the need to remove organizational data silos, maintain security and compliance, and reduce complexity by standardizing tooling. AWS offers tools such as RStudio on SageMaker and Amazon Redshift to help tackle these challenges.
The match-related data is collected and ingested using DFL’s DataHub. Metadata of the match is processed within the AWS Lambda function MetaDataIngestion, while positional data is ingested using the AWS Fargate container called MatchLink. Tareq Haschemi is a consultant within AWS Professional Services.
In this post, we distill AWS guidance learned and developed on real-world projects into practical guides oriented toward the AWS Well-Architected Framework, which is used to build production infrastructure and applications on AWS. To learn more, see Log Amazon Bedrock API calls using AWS CloudTrail.
In this first post, we introduce mobility data, its sources, and a typical schema of this data. We then discuss the various use cases and explore how you can use AWS services to clean the data, how machine learning (ML) can aid in this effort, and how you can make ethical use of the data in generating visuals and insights.
Amazon Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the best price-performance at any scale. To do this, we provide an AWS CloudFormation template to create a stack that contains the resources.
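To show the SQL side concretely, here is a minimal query against a cluster through the Redshift Data API, which avoids managing JDBC connections. The cluster identifier, database, user, and table are assumptions for illustration.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Run a SQL statement against a hypothetical cluster via the Data API.
result = redshift_data.execute_statement(
    ClusterIdentifier="demo-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT event_type, COUNT(*) FROM events GROUP BY event_type;",
)
# The call is asynchronous; poll describe_statement with this ID
# to check completion and fetch results.
print(result["Id"])
```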
By using these capabilities, businesses can efficiently store, manage, and analyze time-series data, enabling data-driven decisions and gaining a competitive edge. Note that we use two folders: sales-train-data stores data extracted from MongoDB Atlas, while sales-forecast-output contains predictions from Canvas.
The listing indexer AWS Lambda function continuously polls the queue and processes incoming listing updates. With Amazon OpenSearch Service, you get a fully managed solution that makes it simple to deploy, scale, and operate OpenSearch in the AWS Cloud. For data handling, the domain uses 24 data nodes (r6gd.2xlarge.search instances).
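A simplified sketch of what such an indexer function might look like, assuming an SQS-triggered Lambda and the open-source opensearch-py client. The domain endpoint, index name, message shape, and authentication setup (omitted here for brevity) are all assumptions.

```python
import json
import os

from opensearchpy import OpenSearch

# Hypothetical OpenSearch Service domain endpoint, read from the
# function's environment; auth configuration is omitted for brevity.
client = OpenSearch(
    hosts=[{"host": os.environ["OS_ENDPOINT"], "port": 443}],
    use_ssl=True,
)
INDEX = "listings"

def handler(event, context):
    """Index each listing update delivered by the SQS trigger."""
    for record in event["Records"]:
        listing = json.loads(record["body"])  # assumed message shape
        client.index(index=INDEX, id=listing["listing_id"], body=listing)
    return {"indexed": len(event["Records"])}
```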
During the iterative research and development phase, data scientists and researchers need to run multiple experiments with different versions of algorithms and scale to larger models. However, building large distributed training clusters is a complex and time-intensive process that requires in-depth expertise.
SageMaker geospatial capabilities make it straightforward for data scientists and machine learning (ML) engineers to build, train, and deploy models using geospatial data. Now, with the specialized geospatial container in SageMaker, managing and running clusters for geospatial processing has become more straightforward.
In conjunction with tools like RStudio on SageMaker, users are analyzing, transforming, and preparing large amounts of data as part of the data science and ML workflow. Data scientists and data engineers use Apache Spark, Hive, and Presto running on Amazon EMR for large-scale data processing.
This is a joint blog with AWS and Philips. Since 2014, the company has been offering customers its Philips HealthSuite Platform, which orchestrates dozens of AWS services that healthcare and life sciences companies use to improve patient care.
There are two main purposes for building this pipeline: supporting data scientists in late-stage model development, and surfacing model predictions in the product by serving models against high-volume, real-time production traffic. Amazon SNS is a fully managed pub/sub service for A2A and A2P messaging.
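For context, publishing a prediction event to SNS for downstream consumers is a one-liner with boto3. The topic ARN and message payload below are placeholders, not details from this pipeline.

```python
import json

import boto3

sns = boto3.client("sns")

# Publish a hypothetical prediction event to an SNS topic for
# downstream application-to-application (A2A) subscribers.
sns.publish(
    TopicArn="arn:aws:sns:us-east-1:111122223333:prediction-events",
    Message=json.dumps({"model": "ranker-v2", "item_id": "123", "score": 0.87}),
)
```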
Because they’re in a highly regulated domain, HCLS partners and customers seek privacy-preserving mechanisms to manage and analyze large-scale, distributed, and sensitive data. To mitigate these challenges, we propose a federated learning (FL) framework, based on open-source FedML on AWS, which enables analyzing sensitive HCLS data.
Infrastructure and development challenges Veriff’s backend architecture is based on a microservices pattern, with services running on different Kubernetes clusters hosted on AWS infrastructure. These APIs required production-grade code, which made it challenging for data scientists to productionize models.
ML use cases rarely dictate the master data management solution, so the ML stack needs to integrate with existing data warehouses. Prior to the cloud, setting up and operating a cluster that can handle workloads like this would have been a major technical challenge.
Amazon SageMaker Studio provides a fully managed solution for data scientists to interactively build, train, and deploy machine learning (ML) models. In the process of working on their ML tasks, data scientists typically start their workflow by discovering relevant data sources and connecting to them.
Aggregating and preparing large amounts of data is a critical part of the ML workflow. Data scientists and data engineers use Apache Spark, Apache Hive, and Presto running on Amazon EMR for large-scale data processing. For each option, we deploy a unique stack of AWS CloudFormation templates.
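Deploying one of those stacks programmatically might look like the following boto3 sketch; the stack name and template URL are placeholders, not the templates from this post.

```python
import boto3

cfn = boto3.client("cloudformation")

STACK_NAME = "emr-data-prep-option-a"  # hypothetical stack name

# Launch a CloudFormation stack for one deployment option.
cfn.create_stack(
    StackName=STACK_NAME,
    TemplateURL="https://my-bucket.s3.amazonaws.com/templates/option-a.yaml",
    Capabilities=["CAPABILITY_NAMED_IAM"],  # the template creates IAM roles
)

# Block until the stack finishes creating.
cfn.get_waiter("stack_create_complete").wait(StackName=STACK_NAME)
```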
We provide a comprehensive guide on how to deploy speaker segmentation and clustering solutions using SageMaker on the AWS Cloud. Amazon Transcribe is the go-to service for speaker diarization in AWS. Make sure the AWS account has a service quota for hosting a SageMaker endpoint on an ml.g5.2xlarge instance.
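As a point of comparison, enabling Amazon Transcribe's built-in diarization is a single API call. The job name, media URI, format, and speaker count below are assumptions for illustration.

```python
import boto3

transcribe = boto3.client("transcribe")

# Start a transcription job with speaker diarization enabled.
# Job name, S3 URI, and speaker count are hypothetical placeholders.
transcribe.start_transcription_job(
    TranscriptionJobName="meeting-diarization-demo",
    Media={"MediaFileUri": "s3://my-bucket/audio/meeting.wav"},
    MediaFormat="wav",
    LanguageCode="en-US",
    Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 4},
)
```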
Many AWS media and entertainment customers license IMDb data through AWS Data Exchange to improve content discovery and increase customer engagement and retention. In Part 1, we discussed the applications of GNNs and how to transform and prepare our IMDb data into a knowledge graph (KG).
Many organizations choose SageMaker as their ML platform because it provides a common set of tools for developers and data scientists. We also deep dive into the most common architectures and AWS resources to facilitate these integrations. In some cases, an ISV may deploy their software in the customer’s AWS account.
One of the several challenges faced was adapting the existing on-premises pipeline solution for use on AWS. The solution involved two key components; the first was modifying and extending our existing code to make it compatible with AWS infrastructure.
Given that, what would you say is the job of a data scientist (or ML engineer, or any other such title)? A common task for a data scientist is to build a predictive model. You know the drill: pull some data, carve it up into features, feed it into one of scikit-learn’s various algorithms.
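That drill, reduced to its skeleton, looks something like the following toy sketch on synthetic data, not any particular production model:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Pull some data (synthetic here) and carve it into features and labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Feed it into one of scikit-learn's various algorithms.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print(f"Test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```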