AWS, Data Lakes and Data Preparation

How Northpower used computer vision with AWS to automate safety inspection risk assessments

AWS Machine Learning Blog

SEPTEMBER 27, 2024

Data preparation SageMaker Ground Truth employs a human workforce made up of Northpower volunteers to annotate a set of 10,000 images. The model was then fine-tuned with training data from the data preparation stage. About the authors Scott Patterson is a Senior Solutions Architect at AWS.

AWS

AWS Data Lakes ML ML

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

AUGUST 21, 2023

You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, purification, exploration, visualization, and processing at scale) within a single visual interface.

AWS

AWS Data Lakes Clustering Data Preparation

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Flipboard

DECEMBER 11, 2024

Many of these applications are complex to build because they require collaboration across teams and the integration of data, tools, and services. Data engineers use data warehouses, data lakes, and analytics tools to load, transform, clean, and aggregate data. Choose Create VPC.

SQL

SQL AWS Data Lakes AI

Webinars

How to Achieve High-Accuracy Results When Using LLMs

MORE WEBINARS

How Marubeni is optimizing market decisions using AWS machine learning and analytics

AWS Machine Learning Blog

MARCH 8, 2023

This solution helps market analysts design and perform data-driven bidding strategies optimized for power asset profitability. In this post, you will learn how Marubeni is optimizing market decisions by using the broad set of AWS analytics and ML services, to build a robust and cost-effective Power Bid Optimization solution.

AWS

AWS Machine Learning Machine Learning Analytics

Tackling AI’s data challenges with IBM databases on AWS

IBM Journey to AI blog

MARCH 14, 2024

The solution: IBM databases on AWS To solve for these challenges, IBM’s portfolio of SaaS database solutions on Amazon Web Services (AWS), enables enterprises to scale applications, analytics and AI across the hybrid cloud landscape. Let’s delve into the database portfolio from IBM available on AWS. 

AWS

AWS Database ETL AI

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

AWS Machine Learning Blog

AUGUST 21, 2024

Amazon DataZone is a data management service that makes it quick and convenient to catalog, discover, share, and govern data stored in AWS, on-premises, and third-party sources. An Amazon DataZone domain and an associated Amazon DataZone project configured in your AWS account.

Machine Learning

Machine Learning Machine Learning Data Governance ML

How Light & Wonder built a predictive maintenance solution for gaming machines on AWS

AWS Machine Learning Blog

JUNE 22, 2023

Working with AWS, Light & Wonder recently developed an industry-first secure solution, Light & Wonder Connect (LnW Connect), to stream telemetry and machine health data from roughly half a million electronic gaming machines distributed across its casino customer base globally when LnW Connect reaches its full potential.

AWS

AWS ML ML Machine Learning

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

Flipboard

NOVEMBER 24, 2023

In this post, we will talk about how BMW Group, in collaboration with AWS Professional Services, built its Jupyter Managed (JuMa) service to address these challenges. For example, teams using these platforms missed an easy migration of their AI/ML prototypes to the industrialization of the solution running on AWS.

ML

ML ML AWS AI

Simplify continuous learning of Amazon Comprehend custom models using Comprehend flywheel

AWS Machine Learning Blog

MARCH 1, 2023

Flywheel creates a data lake (in Amazon S3) in your account where all the training and test data for all versions of the model are managed and stored. Periodically, the new labeled data (to retrain the model) can be made available to flywheel by creating datasets. The data can be accessed from AWS Open Data Registry.

Data Lakes

Data Lakes AWS ML ML

Improving air quality with generative AI

AWS Machine Learning Blog

JUNE 18, 2024

On December 6 th -8 th 2023, the non-profit organization, Tech to the Rescue , in collaboration with AWS, organized the world’s largest Air Quality Hackathon – aimed at tackling one of the world’s most pressing health and environmental challenges, air pollution. As always, AWS welcomes your feedback.

AWS

AWS AI AI Python

The Ultimate Guide to Data Preparation for Machine Learning

DagsHub

FEBRUARY 29, 2024

Data, is therefore, essential to the quality and performance of machine learning models. This makes data preparation for machine learning all the more critical, so that the models generate reliable and accurate predictions and drive business value for the organization. Why do you need Data Preparation for Machine Learning?

Data Preparation

Data Preparation Machine Learning Machine Learning Data Governance

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Flipboard

AUGUST 17, 2023

Amazon Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the best price-performance at any scale. If you’re familiar with SageMaker and writing Spark code, option B could be your choice.

ML

ML ML AWS Data Warehouse

Use the Amazon SageMaker and Salesforce Data Cloud integration to power your Salesforce apps with AI/ML

AWS Machine Learning Blog

AUGUST 4, 2023

The following steps give an overview of how to use the new capabilities launched in SageMaker for Salesforce to enable the overall integration: Set up the Amazon SageMaker Studio domain and OAuth between Salesforce and the AWS account s. Select Other type of secret. Save the secret and note the ARN of the secret.

ML

ML ML AWS AI

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

Visual modeling: Delivers easy-to-use workflows for data scientists to build data preparation and predictive machine learning pipelines that include text analytics, visualizations and a variety of modeling methods.

AI

AI AI Machine Learning Machine Learning

Popular Data Transformation Tools: Importance and Best Practices

Pickl AI

OCTOBER 10, 2024

Informatica’s AI-powered automation helps streamline data pipelines and improve operational efficiency. Common use cases include integrating data across hybrid cloud environments, managing data lakes, and enabling real-time analytics for Business Intelligence platforms.

Data Quality

Data Quality AWS Machine Learning Machine Learning

Discover the Most Important Fundamentals of Data Engineering

Pickl AI

NOVEMBER 4, 2024

Role of Data Engineers in the Data Ecosystem Data Engineers play a crucial role in the data ecosystem by bridging the gap between raw data and actionable insights. They are responsible for building and maintaining data architectures, which include databases, data warehouses, and data lakes.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

For example, if you use AWS, you may prefer Amazon SageMaker as an MLOps platform that integrates with other AWS services. SageMaker Studio offers built-in algorithms, automated model tuning, and seamless integration with AWS services, making it a powerful platform for developing and deploying machine learning solutions at scale.

Machine Learning

Machine Learning Machine Learning ML ML

Your guide to generative AI and ML at AWS re:Invent 2023

AWS Machine Learning Blog

NOVEMBER 22, 2023

Yes, the AWS re:Invent season is upon us and as always, the place to be is Las Vegas! are the sessions dedicated to AWS DeepRacer ! Generative AI is at the heart of the AWS Village this year. You marked your calendars, you booked your hotel, and you even purchased the airfare. And last but not least (and always fun!)

AWS

AWS ML ML AI

Getting Started With Snowflake: Best Practices For Launching

phData

DECEMBER 4, 2023

If you answer “yes” to any of these questions, you will need cloud storage, such as Amazon AWS’s S3, Azure Data Lake Storage or GCP’s Google Storage. Knowing this, you want to have data prepared in a way to optimize your load. It might be tempting to have massive files and let the system sort it out.

Clustering

Clustering Database SQL Data Pipeline

FMOps/LLMOps: Operationalize generative AI and differences with MLOps

AWS Machine Learning Blog

SEPTEMBER 1, 2023

These teams are as follows: Advanced analytics team (data lake and data mesh) – Data engineers are responsible for preparing and ingesting data from multiple sources, building ETL (extract, transform, and load) pipelines to curate and catalog the data, and prepare the necessary historical data for the ML use cases.

AI

AI AI ML ML

How to Use Exploratory Notebooks [Best Practices]

The MLOps Blog

OCTOBER 20, 2023

Placing functions for plotting, data loading, data preparation, and implementations of evaluation metrics in plain Python modules keeps a Jupyter notebook focused on the exploratory analysis | Source: Author Using SQL directly in Jupyter cells There are some cases in which data is not in memory (e.g.,

SQL

SQL Database Data Scientist Python

Architect defense-in-depth security for generative AI applications using the OWASP Top 10 for LLMs

AWS Machine Learning Blog

JANUARY 26, 2024

We also discuss common security concerns that can undermine trust in AI, as identified by the Open Worldwide Application Security Project (OWASP) Top 10 for LLM Applications , and show ways you can use AWS to increase your security posture and confidence while innovating with generative AI.

AWS

AWS ML ML AI

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

AWS Machine Learning Blog

MAY 31, 2024

With the Amazon Bedrock serverless experience, you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using the Amazon Web Services (AWS) tools without having to manage infrastructure.

AWS

AWS Machine Learning Machine Learning Database

How to Build an End-To-End ML Pipeline

The MLOps Blog

MAY 9, 2023

3 Quickly build and deploy an end-to-end ML pipeline with Kubeflow Pipelines on AWS. The pipelines are interoperable to build a working system: Data (input) pipeline (data acquisition and feature management steps) This pipeline transports raw data from one location to another. What is a machine learning pipeline?

ML

ML ML Machine Learning Machine Learning

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

AWS Machine Learning Blog

AUGUST 15, 2024

Importing data from the SageMaker Data Wrangler flow allows you to interact with a sample of the data before scaling the data preparation flow to the full dataset. This improves time and performance because you don’t need to work with the entirety of the data during preparation.

ML

ML ML Data Preparation AWS

3 Major Trends at Strata New York 2017

DataRobot Blog

OCTOBER 3, 2017

This highlights the two companies’ shared vision on self-service data discovery with an emphasis on collaboration and data governance. 2) When data becomes information, many (incremental) use cases surface.

Data Lakes

Data Lakes Azure Data Pipeline Hadoop

Data Science Current

How Northpower used computer vision with AWS to automate safety inspection risk assessments

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

Webinars

Trending Sources

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Webinars

How Marubeni is optimizing market decisions using AWS machine learning and analytics

Tackling AI’s data challenges with IBM databases on AWS

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

How Light & Wonder built a predictive maintenance solution for gaming machines on AWS

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

Simplify continuous learning of Amazon Comprehend custom models using Comprehend flywheel

Improving air quality with generative AI

The Ultimate Guide to Data Preparation for Machine Learning

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Use the Amazon SageMaker and Salesforce Data Cloud integration to power your Salesforce apps with AI/ML

Exploring the AI and data capabilities of watsonx

Popular Data Transformation Tools: Importance and Best Practices

Discover the Most Important Fundamentals of Data Engineering

MLOps Landscape in 2023: Top Tools and Platforms

Your guide to generative AI and ML at AWS re:Invent 2023

Getting Started With Snowflake: Best Practices For Launching

FMOps/LLMOps: Operationalize generative AI and differences with MLOps

How to Use Exploratory Notebooks [Best Practices]

Architect defense-in-depth security for generative AI applications using the OWASP Top 10 for LLMs

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

How to Build an End-To-End ML Pipeline

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

3 Major Trends at Strata New York 2017

Stay Connected