2019, AWS and Python - Data Science Current

Racing into the future: How AWS DeepRacer fueled my AI and ML journey

AWS Machine Learning Blog

NOVEMBER 19, 2024

In 2018, I sat in the audience at AWS re:Invent as Andy Jassy announced AWS DeepRacer, a fully autonomous 1/18th scale race car driven by reinforcement learning. But AWS DeepRacer instantly captured my interest with its promise that even inexperienced developers could get involved in AI and ML.

AWS

AWS ML ML AI

Announcing New Tools for Building with Generative AI on AWS

Flipboard

APRIL 13, 2023

At AWS, we have played a key role in democratizing ML and making it accessible to anyone who wants to use it, including more than 100,000 customers of all sizes and industries. AWS has the broadest and deepest portfolio of AI and ML services at all three layers of the stack. Today’s FMs, such as the large language models (LLMs) GPT3.5

AWS

AWS AI AI ML

Llama 4 family of models from Meta are now available in SageMaker JumpStart

AWS Machine Learning Blog

APRIL 7, 2025

Virginia) AWS Region. Prerequisites To try the Llama 4 models in SageMaker JumpStart, you need the following prerequisites: An AWS account that will contain all your AWS resources. An AWS Identity and Access Management (IAM) role to access SageMaker AI. Access to accelerated instances (GPUs) for hosting the LLMs.

AWS

AWS Machine Learning Machine Learning AI

Develop and train large models cost-efficiently with Metaflow and AWS Trainium

AWS Machine Learning Blog

APRIL 29, 2024

For AWS and Outerbounds customers, the goal is to build a differentiated machine learning and artificial intelligence (ML/AI) system and reliably improve it over time. First, the AWS Trainium accelerator provides a high-performance, cost-effective, and readily available solution for training and fine-tuning large models.

AWS

AWS ML ML Python

Fast and cost-effective LLaMA 2 fine-tuning with AWS Trainium

AWS Machine Learning Blog

OCTOBER 5, 2023

In this post, we walk through how to fine-tune Llama 2 on AWS Trainium, a purpose-built accelerator for LLM training, to reduce training times and costs. We review the fine-tuning scripts provided by the AWS Neuron SDK (using NeMo Megatron-LM), the various configurations we used, and the throughput results we saw.

AWS

AWS Machine Learning Machine Learning Deep Learning

AWS Inferentia2 builds on AWS Inferentia1 by delivering 4x higher throughput and 10x lower latency

AWS Machine Learning Blog

JUNE 13, 2023

AWS Inferentia2 was designed from the ground up to deliver higher performance while lowering the cost of LLMs and generative AI inference. In this post, we show how the second generation of AWS Inferentia builds on the capabilities introduced with AWS Inferentia1 and meets the unique demands of deploying and running LLMs and FMs.

AWS

AWS ML ML Deep Learning

Welcome to a New Era of Building in the Cloud with Generative AI on AWS

AWS Machine Learning Blog

NOVEMBER 30, 2023

The number of companies launching generative AI applications on AWS is substantial and building quickly, including adidas, Booking.com, Bridgewater Associates, Clariant, Cox Automotive, GoDaddy, and LexisNexis Legal & Professional, to name just a few. Innovative startups like Perplexity AI are going all in on AWS for generative AI.

AWS

AWS AI AI ML

Announcing new Jupyter contributions by AWS to democratize generative AI and scale ML workloads

AWS Machine Learning Blog

MAY 10, 2023

Given the importance of Jupyter to data scientists and ML developers, AWS is an active sponsor and contributor to Project Jupyter. In parallel to these open-source contributions, we have AWS product teams who are working to integrate Jupyter with products such as Amazon SageMaker.

ML

ML ML AWS AI

AWS performs fine-tuning on a Large Language Model (LLM) to classify toxic speech for a large gaming company

AWS Machine Learning Blog

AUGUST 7, 2023

In an effort to create and maintain a socially responsible gaming environment, AWS Professional Services was asked to build a mechanism that detects inappropriate language (toxic speech) within online gaming player interactions. Unfortunately, as in the real world, not all players communicate appropriately and respectfully.

AWS

AWS ML ML Data Science

Data Science News from Microsoft Ignite 2019

Data Science 101

NOVEMBER 7, 2019

It is now possible to deploy an Azure SQL Database to a virtual machine running on Amazon Web Services (AWS) and manage it from Azure. Python support has been available for a while. It’s true, I saw it happen this week. R Support for Azure Machine Learning.

Data Science

Data Science Azure SQL Machine Learning

Build a contextual chatbot application using Knowledge Bases for Amazon Bedrock

AWS Machine Learning Blog

FEBRUARY 19, 2024

Note that you can also use Knowledge Bases for Amazon Bedrock service APIs and the AWS Command Line Interface (AWS CLI) to programmatically create a knowledge base. Create a Lambda function This Lambda function is deployed using an AWS CloudFormation template available in the GitHub repo under the /cfn folder.

AWS

AWS Database Machine Learning Machine Learning

Cloud Data Science News – Beta #3

Data Science 101

NOVEMBER 22, 2019

AWS Storage Day On November 20, 2019, Amazon held AWS Storage Day. Many announcements came out regarding storage of all types at AWS. Much of this is in anticipation of AWS re:Invent, coming in early December 2019.

Cloud Data

Cloud Data Data Science Azure AWS

Build an end-to-end MLOps pipeline for visual quality inspection at the edge – Part 2

AWS Machine Learning Blog

OCTOBER 2, 2023

On top of that, the whole process can be configured and managed via the AWS SDK, which is what we use to orchestrate our labeling workflow as part of our CI/CD pipeline. For more information about best practices, refer to the AWS re:Invent 2019 talk, Build accurate training datasets with Amazon SageMaker Ground Truth.

AWS

AWS Internet of Things ML ML

Advanced RAG patterns on Amazon SageMaker

AWS Machine Learning Blog

MARCH 28, 2024

For more information on Mixtral-8x7B Instruct on AWS, refer to Mixtral-8x7B is now available in Amazon SageMaker JumpStart. LangChain is an open source Python library designed to build applications with LLMs. Before you get started with the solution, create an AWS account. This identity is called the AWS account root user.

AWS

AWS Machine Learning Machine Learning AI

Build a classification pipeline with Amazon Comprehend custom classification (Part I)

AWS Machine Learning Blog

SEPTEMBER 14, 2023

“Data locked away in text, audio, social media, and other unstructured sources can be a competitive advantage for firms that figure out how to use it.” Only 18% of organizations in a 2019 survey by Deloitte reported being able to take advantage of unstructured data. The majority of data, between 80% and 90%, is unstructured data.

AWS

AWS Machine Learning Machine Learning Data Scientist

Boost your forecast accuracy with time series clustering

AWS Machine Learning Blog

APRIL 4, 2023

AWS provides various services catered to time series data that are low code/no code, which both machine learning (ML) and non-ML practitioners can use for building ML solutions. For a more detailed explanation, refer to Time Series Classification and Clustering with Python. Chong En Lim is a Solutions Architect at AWS.

Clustering

Clustering ML ML AWS

Emily Webber of AWS on Pretraining Large Language Models

ODSC - Open Data Science

AUGUST 4, 2023

Recently, we spoke with Emily Webber, Principal Machine Learning Specialist Solutions Architect at AWS. She’s the author of “Pretrain Vision and Large Language Models in Python: End-to-end techniques for building and deploying foundation models on AWS.” And then I spent many years working with customers.

AWS

AWS Machine Learning Machine Learning Data Science

Transition your Amazon Forecast usage to Amazon SageMaker Canvas

AWS Machine Learning Blog

JULY 29, 2024

Launched in August 2019, Forecast predates Amazon SageMaker Canvas, a popular low-code no-code AWS tool for building, customizing, and deploying ML models, including time series forecasting models. Python script – Use a Python script to merge the datasets. SageMaker Canvas can be accessed from the SageMaker console.

ML

ML ML Algorithm AWS

A review of purpose-built accelerators for financial services

AWS Machine Learning Blog

SEPTEMBER 11, 2024

In terms of resulting speedups, the approximate order is programming hardware, then programming against PBA APIs, then programming in an unmanaged language such as C++, then a managed language such as Python. Examples of other PBAs now available include AWS Inferentia and AWS Trainium, Google TPU, and Graphcore IPU.

AWS

AWS ML ML Clustering

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

AWS Machine Learning Blog

APRIL 19, 2023

Right now, most deep learning frameworks are built for Python, but this neglects the large number of Java developers and developers with existing Java code bases into which they want to integrate the increasingly powerful capabilities of deep learning. The DJL was created at Amazon and open-sourced in 2019.

ML

ML ML Deep Learning Deep Learning

Top 10 Generative AI Companies Revealed

Towards AI

APRIL 19, 2024

Amazon (AWS) 👉Industry domain: Online retail and web services provider 👉Location: Over 175 Amazon fulfillment centers globally 👉Year founded: 1994 👉Key Products developed: Amazon Bedrock, Q, CodeWhisperer, SageMaker 👉Benefits: Fully managed generative AI service options, AWS free tier for experimentation 7.

AI

AI AI Artificial Intelligence Artificial Intelligence

Simplify data prep for generative AI with Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

NOVEMBER 27, 2023

According to a 2019 survey by Deloitte, only 18% of businesses reported being able to take advantage of unstructured data. You can create a custom transform using Pandas, PySpark, Python user-defined functions, and SQL PySpark. Choose Python (PySpark) for this use-case. And select Python (PySpark).

Data Preparation

Data Preparation AI AI Python

Generate synthetic data for evaluating RAG systems using Amazon Bedrock

AWS Machine Learning Blog

SEPTEMBER 23, 2024

Amazon Bedrock Knowledge Bases offers a streamlined approach to implement RAG on AWS, providing a fully managed solution for connecting FMs to custom data sources. LangChain is an open source Python library designed to build applications with LLMs. Amazon Bedrock makes this effortless by providing standardized API access to many FMs.

AWS

AWS Machine Learning Machine Learning AI

Unlock ML insights using the Amazon SageMaker Feature Store Feature Processor

AWS Machine Learning Blog

SEPTEMBER 19, 2023

Engineers must manually write custom data preprocessing and aggregation logic in Python or Spark for each use case. Prerequisites To follow this tutorial, you need the following: An AWS account. AWS Identity and Access Management (IAM) permissions. This undifferentiated heavy lifting is cumbersome, repetitive, and error-prone.

ML

ML ML AWS SQL

Top NLP Skills, Frameworks, Platforms, and Languages for 2023

ODSC - Open Data Science

FEBRUARY 17, 2023

BERT has remained very popular over the past few years, and even though the last update from Google was in late 2019, it is still widely deployed. NLP Programming Languages It shouldn’t be a surprise that Python has a strong lead as a programming language of choice for NLP. Knowing some SQL is also essential.

Deep Learning

Deep Learning Deep Learning Data Science Natural Language Processing

Predict football punt and kickoff return yards with fat-tailed distribution using GluonTS

Flipboard

FEBRUARY 2, 2023

GluonTS is a Python package for probabilistic time series modeling, but the SBP distribution is not specific to time series, and we were able to repurpose it for regression. Models were trained and cross-validated on the 2018, 2019, and 2020 seasons and tested on the 2021 season. We used the SBP distribution provided by GluonTS.

Cross Validation

Cross Validation ML ML Machine Learning

How to use Netezza Performance Server query data in Amazon Simple Storage Service (S3)

IBM Journey to AI blog

JANUARY 10, 2023

This data will be analyzed using Netezza SQL and Python code to determine if the flight delays for the first half of 2022 have increased compared to flight delays in earlier periods of time within the current data (January 2019 – December 2021). Figure 5 – Bar graph of current flight delay data (2019 – June 2022).

Data Warehouse

Data Warehouse Data Analysis Data Analysis SQL

How to Build an End-to-End Energy Price Forecasting Solution with Snowflake

phData

JANUARY 31, 2024

Python has long been the favorite programming language of data scientists. Historically, Python was only supported via a connector, so making predictions on our energy data using an algorithm created in Python would require moving data out of our Snowflake environment.

Machine Learning

Machine Learning Machine Learning Python Data Scientist

A comprehensive guide to learning LLMs (Foundational Models)

Mlearning.ai

JUNE 14, 2023

YouTube Introduction to Sequence Learning and Attention Mechanisms Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 8 — Translation, Seq2Seq, Attention — YouTube Stanford CS224N NLP with Deep Learning | Winter 2021 | Lecture 7 — Translation, Seq2Seq, Attention — YouTube 2.

Natural Language Processing

Natural Language Processing ML ML Support Vector Machines

Text to Exam Generator (NLP) Using Machine Learning

Mlearning.ai

JUNE 28, 2023

But I have to say that this data is of great quality because we already converted it from messy data into the Python dictionary format that matches our type of work. This is the highest accuracy achieved by fine-tuning the model on AWS SageMaker with the training data of 30,000 sentences between sentences 40,000 and 70,000.

Machine Learning

Machine Learning Machine Learning Natural Language Processing AI

Large language models: their history, capabilities and limitations

Snorkel AI

MAY 25, 2023

BERT, the first breakout large language model In 2019, a team of researchers at Google introduced BERT (which stands for bidirectional encoder representations from transformers). OpenAI’s GPT-2, finalized in 2019 at 1.5 The plot was boring and the acting was awful: Negative This movie was okay. For example: I love this movie.

Natural Language Processing

Natural Language Processing Python Machine Learning Machine Learning

spaCy meets Transformers: Fine-tune BERT, XLNet and GPT-2

Explosion

AUGUST 1, 2019

Based on the (fairly vague) marketing copy, AWS might be doing something similar in SageMaker. 2019) have shown that transformer models trained on only 1% of the IMDB sentiment analysis data (just a few dozen examples) can exceed the pre-2016 state-of-the-art. Modern transfer learning techniques are bearing this out.

Natural Language Processing

Natural Language Processing Machine Learning Machine Learning AWS

Learnings From Building the ML Platform at Stitch Fix

The MLOps Blog

AUGUST 3, 2023

For example, let’s take Airflow, AWS SageMaker pipelines. What we’re targeting first is helping you replace that procedural Python code with Hamilton code that you describe, which I can go into detail a little bit more. You could almost think of Hamilton as DBT for Python functions. Where does it [DAGWorks] fit?

ML

ML ML Data Scientist Machine Learning

How to Git Version Control dbt Cloud Jobs

phData

MAY 22, 2023

The code we will use here is Python, but you can use your programming language of choice (assuming compatibility) to call the API. This appears as a “None” value in the Python programming language. If you are using a different language, be sure to verify the NULL equivalent of that specific programming language.

Python

Python Data Warehouse Analytics Analytics

Recommend top trending items to your users using the new Amazon Personalize recipe

AWS Machine Learning Blog

MARCH 30, 2023

Use the Python code below to curate the interactions dataset from the MovieLens public dataset. Choose the new aws-trending-now recipe. For Solution version ID, choose the solution version that uses the aws-trending-now recipe. For the interactions data, we use ratings history from the movies review dataset, MovieLens.

AWS

AWS ML ML Machine Learning

Run secure processing jobs using PySpark in Amazon SageMaker Pipelines

AWS Machine Learning Blog

APRIL 11, 2023

It’s a fully managed on-demand service, integrated with SageMaker and other AWS services, and therefore creates and manages resources for you. Furthermore, Pipelines is supported by the SageMaker Python SDK, letting you track your data lineage and reuse steps by caching them to ease development time and cost.

AWS

AWS ML ML Data Scientist

How HSR.health is limiting risks of disease spillover from animals to humans using Amazon SageMaker geospatial capabilities

AWS Machine Learning Blog

FEBRUARY 5, 2024

According to health organizations such as the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO), a spillover event at a wet market in Wuhan, China most likely caused the coronavirus disease 2019 (COVID-19). Janosch Woschitz is a Senior Solutions Architect at AWS, specializing in geospatial AI/ML.

ML

ML ML AWS Analytics

Game-changing moments in generative AI: Rewinding 2023

Data Science Dojo

DECEMBER 31, 2023

Following earlier collaborations in 2019 and 2021, this agreement focused on boosting AI supercomputing capabilities and research. AWS launched Bedrock Amazon Web Services unveiled its groundbreaking service, Bedrock. Microsoft increased investments in supercomputing systems and expanded Azure’s AI infrastructure. OpenAI released Dall.

AI

AI AI AWS Python

AWS DeepRacer: How to master physical racing?

AWS Machine Learning Blog

DECEMBER 1, 2024

In this blog post, I will look at what makes physical AWS DeepRacer racing—a real car on a real track—different to racing in the virtual world—a model in a simulated 3D environment. The AWS DeepRacer League is wrapping up. The original AWS DeepRacer, without modifications, has a smaller speed range of about 2 meters per second.

AWS

AWS Python Artificial Intelligence Artificial Intelligence

Faster distributed graph neural network training with GraphStorm v0.4

AWS Machine Learning Blog

FEBRUARY 11, 2025

Today, AWS AI released GraphStorm v0.4. Prerequisites To run this example, you will need an AWS account, an Amazon SageMaker Studio domain, and the necessary permissions to run BYOC SageMaker jobs. Using SageMaker Pipelines to train models provides several benefits, like reduced costs, auditability, and lineage tracking. million edges.

AWS

AWS Python ML ML

How Fastweb fine-tuned the Mistral model using Amazon SageMaker HyperPod as a first step to build an Italian large language model

AWS Machine Learning Blog

DECEMBER 18, 2024

Fastweb, one of Italy’s leading telecommunications operators, recognized the immense potential of AI technologies early on and began investing in this area in 2019. Fine-tuning Mistral 7B on AWS Fastweb recognized the importance of developing language models tailored to the Italian language and culture.

Clustering

Clustering AWS AI AI

Dive deep into vector data stores using Amazon Bedrock Knowledge Bases

AWS Machine Learning Blog

OCTOBER 11, 2024

You can set up the notebook in any AWS Region where Amazon Bedrock Knowledge Bases is available. You also need an AWS Identity and Access Management (IAM) role assigned to the SageMaker Studio domain. data # Assign local directory path to a Python variable local_data_path = "./data/" On the Domains page, open your domain.

Database

Database AWS Clustering AI

Fine-tune Meta Llama 3.2 text generation models for generative AI inference using Amazon SageMaker JumpStart

AWS Machine Learning Blog

NOVEMBER 11, 2024

Prerequisites To try out this solution using SageMaker JumpStart, you’ll need the following prerequisites: An AWS account that will contain all of your AWS resources. An AWS Identity and Access Management (IAM) role to access SageMaker. We then also cover how to fine-tune the model using SageMaker Python SDK.

AI

AI AI ML ML
