Blog, Data Quality and ML - Data Science Current

Improve governance of models with Amazon SageMaker unified Model Cards and Model Registry

AWS Machine Learning Blog

NOVEMBER 13, 2024

You can now register machine learning (ML) models in Amazon SageMaker Model Registry with Amazon SageMaker Model Cards, making it straightforward to manage governance information for specific model versions directly in SageMaker Model Registry in just a few clicks.

ML, AWS, Data Preparation

Machine Learning Models: 4 Ways to Test them in Production

Data Science Dojo

JULY 5, 2024

Modern businesses are embracing machine learning (ML) models to gain a competitive edge, improving overall business efficiency and allowing them to make data-driven decisions. Deploying ML models in their day-to-day processes allows businesses to adopt and integrate AI-powered solutions.

Machine Learning, ML

Enhanced observability for AWS Trainium and AWS Inferentia with Datadog

AWS Machine Learning Blog

NOVEMBER 26, 2024

With the increasing use of large models, requiring a large number of accelerated compute instances, observability plays a critical role in ML operations, empowering you to improve performance, diagnose and fix failures, and optimize resource utilization. This data makes sure models are being trained smoothly and reliably.

AWS, ML, Data Pipeline

Accelerate data preparation for ML in Amazon SageMaker Canvas

AWS Machine Learning Blog

NOVEMBER 29, 2023

Data preparation is a crucial step in any machine learning (ML) workflow, yet it often involves tedious and time-consuming tasks. Amazon SageMaker Canvas now supports comprehensive data preparation capabilities powered by Amazon SageMaker Data Wrangler. Now you have a balanced target column.

Data Preparation, ML, Data Quality

Harness the power of AI and ML using Splunk and Amazon SageMaker Canvas

AWS Machine Learning Blog

AUGUST 12, 2024

Instead, organizations are increasingly looking to take advantage of transformative technologies like machine learning (ML) and artificial intelligence (AI) to deliver innovative products, improve outcomes, and gain operational efficiencies at scale. Data is presented to the personas that need access using a unified interface.

ML, AWS, AI

Customized model monitoring for near real-time batch inference with Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 28, 2024

Real-world applications vary in inference requirements for their artificial intelligence and machine learning (AI/ML) solutions to optimize performance and reduce costs. SageMaker Model Monitor monitors the quality of SageMaker ML models in production.

ML, AWS, AI

Real value, real time: Production AI with Amazon SageMaker and Tecton

AWS Machine Learning Blog

DECEMBER 4, 2024

Businesses are under pressure to show return on investment (ROI) from AI use cases, whether predictive machine learning (ML) or generative AI. Only 54% of ML prototypes make it to production, and only 5% of generative AI use cases make it to production. Using SageMaker, you can build, train and deploy ML models.

ML, AWS, AI

Centralize model governance with SageMaker Model Registry Resource Access Manager sharing

AWS Machine Learning Blog

NOVEMBER 14, 2024

We recently announced the general availability of cross-account sharing of Amazon SageMaker Model Registry using AWS Resource Access Manager (AWS RAM), making it easier to securely share and discover machine learning (ML) models across your AWS accounts.

AWS, ML, Machine Learning

Deliver your first ML use case in 8–12 weeks

AWS Machine Learning Blog

APRIL 26, 2023

Do you need help to move your organization’s Machine Learning (ML) journey from pilot to production? Most executives think ML can apply to any business decision, but on average only half of the ML projects make it to production. Challenges: Customers may face several challenges when implementing machine learning (ML) solutions.

ML, AWS, Machine Learning

LLM Agents Underscore One Truth: Data Is The Real Differentiator.

Towards AI

NOVEMBER 8, 2024

In ML engineering, data quality isn’t just critical; it’s foundational. Since 2011, Peter Norvig’s words have underscored the power of a data-centric approach in machine learning. Yet this perspective often gets sidelined, and there has never been a consensus in the ML community about it.

ML, Data Quality, Algorithm

Going beyond AI assistants: Examples from Amazon.com reinventing industries with generative AI

Flipboard

MAY 30, 2025

Non-conversational applications offer unique advantages such as higher latency tolerance, batch processing, and caching, but their autonomous nature requires stronger guardrails and exhaustive quality assurance compared to conversational applications, which benefit from real-time user feedback and supervision.

AI, AWS, ML

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

AWS Machine Learning Blog

AUGUST 15, 2024

Starting today, you can interactively prepare large datasets, create end-to-end data flows, and invoke automated machine learning (AutoML) experiments on petabytes of data—a substantial leap from the previous 5 GB limit. Organizations often struggle to extract meaningful insights and value from their ever-growing volume of data.

ML, Data Preparation, AWS

When It Comes to Data Quality, Businesses Get Out What They Put In

Dataversity

MARCH 14, 2022

The post When It Comes to Data Quality, Businesses Get Out What They Put In appeared first on DATAVERSITY. The stakes are high, so you search the web and find the most revered chicken parmesan recipe around. At the grocery store, it is immediately clear that some ingredients are much more […].

Data Quality, Data Governance, Big Data

Data-centric ML benchmarking: Announcing DataPerf’s 2023 challenges

Google Research AI blog

MARCH 30, 2023

Posted by Peter Mattson, Senior Staff Engineer, ML Performance, and Praveen Paritosh, Senior Research Scientist, Google Research, Brain Team. Machine learning (ML) offers tremendous potential, from diagnosing cancer to engineering safe self-driving cars to amplifying human productivity. Each step can introduce issues and biases.

ML, Algorithm, Data Quality

How to Ensure Data Quality and Consistency in Master Data Management

Dataversity

APRIL 1, 2024

This reliance has spurred a significant shift across industries, driven by advancements in artificial intelligence (AI) and machine learning (ML), which thrive on comprehensive, high-quality data.

Data Quality, Artificial Intelligence, Machine Learning

Transitioning off Amazon Lookout for Metrics

AWS Machine Learning Blog

OCTOBER 9, 2024

Amazon Lookout for Metrics is a fully managed service that uses machine learning (ML) to detect anomalies in virtually any time-series business or operational metrics—such as revenue performance, purchase transactions, and customer acquisition and retention rates—with no ML experience required. To learn more, see the documentation.

AWS, ML, Data Quality

How Axfood enables accelerated machine learning throughout the organization using Amazon SageMaker

AWS Machine Learning Blog

FEBRUARY 27, 2024

In this post, we share how Axfood, a large Swedish food retailer, improved operations and scalability of their existing artificial intelligence (AI) and machine learning (ML) operations by prototyping in close collaboration with AWS experts and using Amazon SageMaker. This is a guest post written by Axfood AB.

Machine Learning, ML

Feature Platforms — A New Paradigm in Machine Learning Operations (MLOps)

IBM Data Science in Practice

MARCH 8, 2023

The AI and machine learning (ML) industry has continued to grow at a rapid rate over recent years. More money, more problems: rise of too many ML tools. People often believe that money is the solution to a problem.

Machine Learning, ML

How Data Will Reshape Industries in 2025

Dataversity

JANUARY 13, 2025

Data has become a driving force behind change and innovation in 2025, fundamentally altering how businesses operate. Across sectors, organizations are using advancements in artificial intelligence (AI), machine learning (ML), and data-sharing technologies to improve decision-making, foster collaboration, and uncover new opportunities.

Artificial Intelligence, Machine Learning

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

Alignment to other tools in the organization’s tech stack: Consider how well the MLOps tool integrates with your existing tools and workflows, such as data sources, data engineering platforms, code repositories, CI/CD pipelines, monitoring systems, etc. and Pandas or Apache Spark DataFrames.

Machine Learning, ML

The Weather Company enhances MLOps with Amazon SageMaker, AWS CloudFormation, and Amazon CloudWatch

AWS Machine Learning Blog

JULY 8, 2024

This blog post is co-written with Qaish Kanchwala from The Weather Company. As industries begin adopting processes dependent on machine learning (ML) technologies, it is critical to establish machine learning operations (MLOps) that scale to support growth and utilization of this technology.

AWS, ML, Data Scientist

Understanding Machine Learning Challenges: Insights for Professionals

Pickl AI

FEBRUARY 17, 2025

However, once deployed in a real-world setting, its performance plummeted due to data quality issues and unforeseen biases. This scenario highlights a common reality in the Machine Learning landscape: despite the hype surrounding ML capabilities, many projects fail to deliver expected results due to various challenges.

Machine Learning, Supervised Learning, ML

Beginner’s Guide to ML-001: Introducing the Wonderful World of Machine Learning: An Introduction

Towards AI

FEBRUARY 20, 2024

Everyone is using mobile or web applications that are based on one machine learning algorithm or another. Machine learning (ML) is evolving at a very fast pace.

Machine Learning, ML

Unlocking the 12 Ways to Improve Data Quality

Pickl AI

OCTOBER 19, 2023

Data quality plays a significant role in helping organizations shape policies that keep them ahead of the competition. Hence, companies need to adopt the right strategies to filter relevant data from unwanted data and get accurate and precise output.

Data Quality, Data Governance, Data Warehouse, Machine Learning

Build a multi-tenant generative AI environment for your enterprise on AWS

AWS Machine Learning Blog

NOVEMBER 7, 2024

SageMaker JumpStart is a machine learning (ML) hub that provides a wide range of publicly available and proprietary FMs from providers such as AI21 Labs, Cohere, Hugging Face, Meta, and Stability AI, which you can deploy to SageMaker endpoints in your own AWS account. It’s serverless so you don’t have to manage the infrastructure.

AWS, AI, Machine Learning

HCLTech’s AWS powered AutoWise Companion: A seamless experience for informed automotive buyer decisions with data-driven design

AWS Machine Learning Blog

JANUARY 15, 2025

The solution uses the following AWS data stores and analytics services. Unstructured data: Amazon Simple Storage Service (Amazon S3) buckets are used to store the JSON-based social media feedback data, quality report PDFs (specific to OEMs), and images of the vehicle and its features.

AWS, SQL, AI

Best of 2022: Top 5 Financial Services Blog Posts

Precisely

DECEMBER 20, 2022

With that data, organizations in this sector are able to better understand customers and improve experiences, fight financial crimes, reduce compliance risks, optimize branch performance, and stay ahead of the competition. That represents a huge opportunity, especially as advanced analytics, AI, and machine learning (ML) gain momentum.

Data Governance, Data Quality, Big Data

What Is Data Quality and Why Is It Important?

Alation

AUGUST 5, 2021

What is Data Quality? Data quality is defined as: the degree to which data meets a company’s expectations of accuracy, validity, completeness, and consistency. By tracking data quality, a business can pinpoint potential issues harming quality, and ensure that shared data is fit to be used for a given purpose.

Data Quality, Data Governance, Artificial Intelligence
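
The dimensions named in the definition above (validity, completeness, consistency) can be tracked with very little code. The following is a minimal sketch rather than anything taken from the Alation post: the `records` sample, the field names, and the helper functions `completeness`, `validity`, and `duplicate_keys` are all hypothetical and chosen only for illustration. Accuracy is left out because it needs a trusted reference to compare against.

```python
# Hypothetical customer records, used only for illustration.
records = [
    {"id": 1, "email": "a@example.com", "country": "SE"},
    {"id": 2, "email": None, "country": "SE"},
    {"id": 3, "email": "not-an-email", "country": "se"},
    {"id": 3, "email": "c@example.com", "country": "US"},
]


def completeness(rows, field):
    """Share of rows where the field is present (not None)."""
    return sum(r[field] is not None for r in rows) / len(rows)


def validity(rows, field, is_valid):
    """Share of non-missing values that pass the is_valid rule."""
    present = [r[field] for r in rows if r[field] is not None]
    return sum(is_valid(v) for v in present) / len(present)


def duplicate_keys(rows, key):
    """Key values that occur more than once (a consistency problem)."""
    seen, dupes = set(), set()
    for r in rows:
        if r[key] in seen:
            dupes.add(r[key])
        else:
            seen.add(r[key])
    return dupes


report = {
    "email_completeness": completeness(records, "email"),
    "email_validity": validity(records, "email", lambda v: "@" in v),
    "country_validity": validity(
        records, "country", lambda v: len(v) == 2 and v.isupper()
    ),
    "duplicate_ids": duplicate_keys(records, "id"),
}
print(report)
```

Tracking numbers like these over time is one way to notice when shared data stops being fit for a given purpose; the validity rules here are deliberately simplistic and would need to match a company's own expectations.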

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

AWS Machine Learning Blog

AUGUST 21, 2024

Amazon DataZone makes it straightforward for engineers, data scientists, product managers, analysts, and business users to access data throughout an organization so they can discover, use, and collaborate to derive data-driven insights.

Machine Learning, Data Governance, ML

Use Amazon DocumentDB to build no-code machine learning solutions in Amazon SageMaker Canvas

AWS Machine Learning Blog

DECEMBER 15, 2023

We are excited to announce the launch of Amazon DocumentDB (with MongoDB compatibility) integration with Amazon SageMaker Canvas, allowing Amazon DocumentDB customers to build and use generative AI and machine learning (ML) solutions without writing code. On the Analyses tab, choose Data Quality and Insights Report.

Machine Learning, AWS, ML

Create a data labeling project with Amazon SageMaker Ground Truth Plus

AWS Machine Learning Blog

OCTOBER 15, 2024

In addition to traditional custom-tailored deep learning models, SageMaker Ground Truth also supports generative AI use cases, enabling the generation of high-quality training data for artificial intelligence and machine learning (AI/ML) models. To learn more, see Use Amazon SageMaker Ground Truth Plus to Label Data.

AWS, ML, Machine Learning

Use the Amazon SageMaker and Salesforce Data Cloud integration to power your Salesforce apps with AI/ML

AWS Machine Learning Blog

AUGUST 4, 2023

If you are a returning user to SageMaker Studio, in order to ensure Salesforce Data Cloud is enabled, upgrade to the latest Jupyter and SageMaker Data Wrangler kernels. This completes the setup to enable data access from Salesforce Data Cloud to SageMaker Studio to build AI and machine learning (ML) models.

ML, AWS, AI

Architect a mature generative AI foundation on AWS

Flipboard

MAY 30, 2025

Data quality is owned by the consuming applications or data producers. Governance: The two key areas of governance are model and data. Model governance: Monitor the model for performance, robustness, and fairness. Since 2013 he has helped AWS customers adopt AI/ML technology as a Solutions Architect.

AWS, AI, Database

Boosting developer productivity: How Deloitte uses Amazon SageMaker Canvas for no-code/low-code machine learning

AWS Machine Learning Blog

DECEMBER 1, 2023

The ability to quickly build and deploy machine learning (ML) models is becoming increasingly important in today’s data-driven world. However, building ML models requires significant time, effort, and specialized expertise. This is where the AWS suite of low-code and no-code ML services becomes an essential tool.

Machine Learning, Data Preparation, ML

2024 Governance Trends for Data Leaders

phData

NOVEMBER 1, 2024

In an effort to better understand where data governance is heading, we spoke with top executives from IT, healthcare, and finance to hear their thoughts on the biggest trends, key challenges, and what insights they would recommend. This blog is a collection of those insights, but for the full trendbook, we recommend downloading the PDF.

Data Governance, Data Quality, ML

Building an efficient MLOps platform with OSS tools on Amazon ECS with AWS Fargate

AWS Machine Learning Blog

SEPTEMBER 18, 2024

The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Hosted on Amazon ECS with tasks run on Fargate, this platform streamlines the end-to-end ML workflow, from data ingestion to model deployment.

AWS, Machine Learning, ML

Mikiko Bazeley: What I Learned Building the ML Platform at Mailchimp

The MLOps Blog

JANUARY 26, 2024

TL;DR: Feedback integration is crucial for ML models to meet user needs. A robust ML infrastructure gives teams a competitive advantage. I started my ML journey as an analyst back in 2016. Mailchimp’s ML Platform: genesis, challenges, and objectives. Mailchimp is a 20-year-old bootstrapped email marketing company.

ML, Data Scientist, Machine Learning

Protecting Machine Learning Systems in the GenAI Era

Dataversity

MAY 23, 2025

As GenAI and machine learning (ML) become more widespread across industries, their high levels of adoption have created a major challenge: security.

Machine Learning, ML

Data Quality Best Practices to Discover the Hidden Potential of Dirty Data in Health Care

Dataversity

DECEMBER 20, 2022

The post Data Quality Best Practices to Discover the Hidden Potential of Dirty Data in Health Care appeared first on DATAVERSITY. Health plans will […].

Data Quality, Analytics, Data Governance

Mastering ML Model Performance: Best Practices for Optimal Results

Iguazio

JUNE 25, 2023

Evaluating ML model performance is essential for ensuring the reliability, quality, accuracy and effectiveness of your ML models. In this blog post, we dive into all aspects of ML model performance: which metrics to use to measure performance, best practices that can help and where MLOps fits in.

ML, Clustering, Cross Validation

Philips accelerates development of AI-enabled healthcare solutions with an MLOps platform built on Amazon SageMaker

AWS Machine Learning Blog

NOVEMBER 16, 2023

This is a joint blog with AWS and Philips. Amazon SageMaker provides purpose-built tools for machine learning operations (MLOps) to help automate and standardize processes across the ML lifecycle. In this post, we describe how Philips partnered with AWS to develop AI ToolSuite—a scalable, secure, and compliant ML platform on SageMaker.

ML, AWS, AI

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

From data processing to quick insights, robust pipelines are a must for any ML system. Often the Data Team, comprising Data and ML Engineers, needs to build this infrastructure, and this experience can be painful. However, efficient use of ETL pipelines in ML can help make their life much easier.

ETL, Data Pipeline, ML

ML | Data Preprocessing in Python

Pickl AI

DECEMBER 3, 2024

Summary: Data preprocessing in Python is essential for transforming raw data into a clean, structured format suitable for analysis. It involves steps like handling missing values, normalizing data, and managing categorical features, ultimately enhancing model performance and ensuring data quality.

Python, ML, Exploratory Data Analysis
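
The three preprocessing steps named in the summary above (handling missing values, normalizing data, and managing categorical features) might look roughly like this in plain Python. This is a sketch under stated assumptions, not code from the Pickl AI article: the `rows` sample and the helper names `impute_mean`, `min_max_scale`, and `one_hot` are hypothetical, and mean imputation and min-max scaling are only one common choice for each step.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 20, "city": "Oslo"},
    {"age": None, "city": "Lima"},
    {"age": 40, "city": "Oslo"},
]


def impute_mean(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical column as one list of 0/1 indicators per row."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


ages = min_max_scale(impute_mean([r["age"] for r in rows]))
categories, city_flags = one_hot([r["city"] for r in rows])

print(ages)        # [0.0, 0.5, 1.0]
print(categories)  # ['Lima', 'Oslo']
print(city_flags)  # [[0, 1], [1, 0], [0, 1]]
```

In practice a library such as pandas or scikit-learn would usually handle these steps; the standard-library version is shown only to make each transformation explicit.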

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

OMRON’s data strategy, represented on ODAP, also allowed the organization to unlock generative AI use cases focused on tangible business outcomes and enhanced productivity. The company aims to integrate additional data sources, including other mission-critical systems, into ODAP.

AWS, Data Governance, Data Silos, SQL
