Introduction: Databricks Lakehouse Monitoring allows you to monitor all your data pipelines – from data to features to ML models – without additional tools.
Businesses are under pressure to show return on investment (ROI) from AI use cases, whether predictive machine learning (ML) or generative AI. Only 54% of ML prototypes make it to production, and only 5% of generative AI use cases do. Using SageMaker, you can build, train, and deploy ML models.
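As a rough sketch of what that build-train-deploy loop looks like with the SageMaker Python SDK (the training script, S3 paths, and IAM role below are placeholders, not taken from the source article):

```python
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder IAM role

# Train a scikit-learn model from a user-provided script (train.py is hypothetical)
estimator = SKLearn(
    entry_point="train.py",
    framework_version="1.2-1",
    instance_type="ml.m5.xlarge",
    role=role,
    sagemaker_session=session,
)
estimator.fit({"train": "s3://my-bucket/churn/train/"})  # placeholder S3 prefix

# Deploy the trained model behind a real-time endpoint
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```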
Machine learning (ML) helps organizations to increase revenue, drive business growth, and reduce costs by optimizing core business functions such as supply and demand forecasting, customer churn prediction, credit risk scoring, pricing, predicting late shipments, and many others. Let’s learn about the services we will use to make this happen.
Instead, organizations are increasingly looking to take advantage of transformative technologies like machine learning (ML) and artificial intelligence (AI) to deliver innovative products, improve outcomes, and gain operational efficiencies at scale. Data is presented to the personas that need access using a unified interface.
With that, the need for data scientists and machine learning (ML) engineers has grown significantly. Data scientists and ML engineers require capable tooling and sufficient compute for their work.
Data engineers build data pipelines, which are called data integration tasks or jobs, as incremental steps to perform data operations, and orchestrate these data pipelines in an overall workflow. Organizations can harness the full potential of their data while reducing risk and lowering costs.
From data processing to quick insights, robust pipelines are a must for any ML system. Often the Data Team, comprising Data and ML Engineers, needs to build this infrastructure, and this experience can be painful. However, efficient use of ETL pipelines in ML can help make their life much easier.
The AI and machine learning (ML) industry has continued to grow at a rapid rate over recent years. Hidden Technical Debt in Machine Learning Systems. More money, more problems: the rise of too many ML tools, 2012 vs. 2023 (source: Matt Turck). People often believe that money is the solution to a problem.
Machine learning (ML) is the technology that automates tasks and provides insights. It allows data scientists to build models that can automate specific tasks. It comes in many forms, with a range of tools and platforms designed to make working with ML more efficient. It also has ML algorithms built into the platform.
Automation: automating data pipelines and models. Team: building the right data science team is complex. With a range of role types available, how do you find the perfect balance of Data Scientists, Data Engineers, and Data Analysts to include in your team? Big ideas: what to look out for in 2022.
As today’s world keeps progressing towards data-driven decisions, organizations must have quality data created from efficient and effective data pipelines. For Snowflake customers, Snowpark is a powerful tool for building these effective and scalable data pipelines.
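For illustration, a minimal Snowpark (Python) pipeline sketch; the connection parameters and the RAW_ORDERS / DAILY_REVENUE tables are hypothetical:

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Hypothetical connection parameters
session = Session.builder.configs({
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}).create()

# Hypothetical source table: filter, aggregate, and write back as a curated table
orders = session.table("RAW_ORDERS")
daily_revenue = (
    orders.filter(col("STATUS") == "COMPLETE")
          .group_by(col("ORDER_DATE"))
          .agg(sum_(col("AMOUNT")).alias("REVENUE"))
)
daily_revenue.write.mode("overwrite").save_as_table("DAILY_REVENUE")
```

The point of Snowpark here is that the transformation logic is expressed in Python but pushed down to execute inside Snowflake, so data never leaves the warehouse.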
Automate and streamline our ML inference pipeline with SageMaker and Airflow: Building an inference data pipeline on large datasets is a challenge many companies face. The Batch job automatically launches an ML compute instance, deploys the model, and processes the input data in batches, producing the output predictions.
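A hedged sketch of that batch transform step with the SageMaker Python SDK; the image URI, model artifact, role, and S3 paths are placeholders, and in practice an orchestrator such as Airflow would trigger this on a schedule:

```python
from sagemaker.model import Model

# Hypothetical model artifact, inference image, and execution role
model = Model(
    image_uri="<inference-image-uri>",
    model_data="s3://my-bucket/model/model.tar.gz",
    role="<sagemaker-execution-role-arn>",
)

# Batch Transform launches instances, loads the model, and scores the input in batches
transformer = model.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/batch-output/",
)
transformer.transform(
    data="s3://my-bucket/batch-input/",
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()  # block until the batch job finishes
```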
The onset of the pandemic has triggered a rapid increase in the demand and adoption of ML technology. Building an ML team: Following the surge in ML use cases that have the potential to transform business, leaders are making significant investments in ML collaboration, building teams that can deliver the promise of machine learning.
OMRON's data strategy, represented on ODAP, also allowed the organization to unlock generative AI use cases focused on tangible business outcomes and enhanced productivity. About the authors: Emrah Kaya is Data Engineering Manager at Omron Europe and Platform Lead for the ODAP project.
Machine learning (ML) engineer. Potential pay range: US$82,000 to $160,000/yr. Machine learning engineers are the bridge between data science and engineering. Integrating the knowledge of data science with engineering skills, they can design, build, and deploy machine learning (ML) models.
Alignment to other tools in the organization’s tech stack: Consider how well the MLOps tool integrates with your existing tools and workflows, such as data sources, data engineering platforms, code repositories, CI/CD pipelines, monitoring systems, and Pandas or Apache Spark DataFrames.
Cloud Computing, APIs, and Data Engineering: NLP experts don’t go straight into conducting sentiment analysis on their personal laptops. TensorFlow is desired for its flexibility for ML and neural networks, PyTorch for its ease of use and innate design for NLP, and scikit-learn for classification and clustering.
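As a small illustration of the scikit-learn side of that toolkit, a clustering sketch on toy data (the blobs below stand in for something like document embeddings; none of this comes from the source article):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy 2-D data standing in for, e.g., sentence embeddings from an NLP pipeline
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Group the points into four clusters
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
print(labels[:10])
```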
This article was co-written by Lawrence Liu & Safwan Islam. While the title ‘Machine Learning Engineer’ may sound more prestigious than ‘Data Engineer’ to some, the reality is that these roles share a significant overlap. Generative AI has unlocked the value of unstructured text-based data.
Since 2018, our team has been developing a variety of ML models to enable betting products for NFL and NCAA football. These models are then pushed to an Amazon Simple Storage Service (Amazon S3) bucket using DVC, a version control tool for ML models. They use the DJL PyTorch engine to initialize the model predictor.
Previously, he was a Data & Machine Learning Engineer at AWS, where he worked closely with customers to develop enterprise-scale data infrastructure, including data lakes, analytics dashboards, and ETL pipelines. He specializes in designing, building, and optimizing large-scale data solutions.
Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. Their task is to construct and oversee efficient data pipelines.
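A hedged sketch of registering and ingesting features with the SageMaker Python SDK; the feature group name, columns, S3 location, and role are hypothetical:

```python
import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
role = "<sagemaker-execution-role-arn>"  # placeholder

# Hypothetical customer features; event_time is required by Feature Store
df = pd.DataFrame({
    "customer_id": ["c1", "c2"],
    "avg_order_value": [42.5, 17.0],
    "event_time": [1700000000.0, 1700000000.0],
})
df["customer_id"] = df["customer_id"].astype("string")  # object dtype is not inferred automatically

fg = FeatureGroup(name="customer-features", sagemaker_session=session)
fg.load_feature_definitions(data_frame=df)
fg.create(
    s3_uri="s3://my-bucket/feature-store/",  # offline store location (placeholder)
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn=role,
    enable_online_store=True,
)
# In practice, poll fg.describe()["FeatureGroupStatus"] until "Created" before ingesting
fg.ingest(data_frame=df, max_workers=1, wait=True)
```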
As companies continue to adopt machine learning (ML) in their workflows, the demand for scalable and efficient tools has increased. In this blog post, we will explore the performance benefits of Snowpark for ML workloads and how it can help businesses make better use of their data. Want to learn more? Can’t wait?
Instead, businesses tend to rely on advanced tools and strategies—namely artificial intelligence for IT operations (AIOps) and machine learning operations (MLOps)—to turn vast quantities of data into actionable insights that can improve IT decision-making and ultimately, the bottom line.
The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Hosted on Amazon ECS with tasks run on Fargate, this platform streamlines the end-to-end ML workflow, from data ingestion to model deployment.
A long-term ML project involves developing and sustaining applications or systems that leverage machine learning models, algorithms, and techniques. An example of a long-term ML project would be a bank fraud detection system powered by ML models and algorithms for pattern recognition. Ensuring and maintaining high-quality data.
MLOps accelerates the ML model deployment process to make it more efficient and scalable. In this blog post, we detail the steps you need to take to build and run a successful MLOps pipeline. An extension of DevOps, MLOps streamlines and monitors ML workflows. MLOps pipelines support a production-first approach.
Just as a writer needs to know core skills like sentence structure, grammar, and so on, data scientists at all levels should know core data science skills like programming, computer science, algorithms, and so on. They’re looking for people who know all related skills, and have studied computer science and software engineering.
SageMaker geospatial capabilities make it straightforward for data scientists and machine learning (ML) engineers to build, train, and deploy models using geospatial data. Janosch Woschitz is a Senior Solutions Architect at AWS, specializing in AI/ML.
We are excited to announce the launch of Amazon DocumentDB (with MongoDB compatibility) integration with Amazon SageMaker Canvas , allowing Amazon DocumentDB customers to build and use generative AI and machine learning (ML) solutions without writing code. Let’s add some transformations to get our data ready for training an ML model.
Purina used artificial intelligence (AI) and machine learning (ML) to automate animal breed detection at scale. The solution focuses on the fundamental principles of developing an AI/ML application workflow of data preparation, model training, model evaluation, and model monitoring. DynamoDB is used to store the pet attributes.
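For the attribute-storage step mentioned above, a minimal boto3 sketch (the table name and item schema are assumptions for illustration, not Purina's actual design):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
pets = dynamodb.Table("pet-attributes")  # hypothetical table name

# Store attributes produced by the breed-detection model
pets.put_item(
    Item={
        "pet_id": "pet-123",
        "predicted_breed": "labrador_retriever",
        "confidence": "0.97",  # stored as a string; DynamoDB does not accept Python floats
    }
)

# Read the record back by its key
item = pets.get_item(Key={"pet_id": "pet-123"})["Item"]
print(item)
```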
Despite the challenges, Afri-SET, with limited resources, envisions a comprehensive data management solution for stakeholders seeking sensor hosting on their platform, aiming to deliver accurate data from low-cost sensors. This happens only when a new data format is detected to avoid overburdening scarce Afri-SET resources.
[link] Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake, gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022. Welcome everybody.
As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications. His experience spans all things data across various domains and sectors.
Organizations can search for PII using methods such as keyword searches, pattern matching, data loss prevention tools, machine learning (ML), metadata analysis, data classification software, optical character recognition (OCR), document fingerprinting, and encryption.
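To make the pattern-matching approach concrete, a minimal regex sketch; these patterns are illustrative only and nowhere near production-grade PII coverage:

```python
import re

# Minimal, illustrative patterns only
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def find_pii(text: str) -> dict[str, list[str]]:
    """Return every match of each PII pattern found in the text."""
    return {label: pattern.findall(text) for label, pattern in PII_PATTERNS.items()}

print(find_pii("Contact jane.doe@example.com or 555-867-5309."))
```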
Advanced Data Engineering and MLOps with Infrastructure as Code: This story explains how to create and orchestrate machine learning pipelines with AWS Step Functions and deploy them using Infrastructure as Code.
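A hedged sketch of kicking off such a pipeline with boto3, assuming the state machine itself was already provisioned through Infrastructure as Code (the ARN, execution name, and input payload are placeholders):

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical state machine ARN provisioned via IaC (e.g. CloudFormation or CDK)
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:111122223333:stateMachine:ml-training-pipeline"

# Start one pipeline run; each execution name must be unique per state machine
response = sfn.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    name="training-run-2024-01-01",
    input=json.dumps({"train_data": "s3://my-bucket/train/"}),
)
print(response["executionArn"])
```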
However, there are some key differences that we need to consider: Size and complexity of the data. In machine learning, we are often working with much larger datasets. Basically, every machine learning project needs data. Given the range of tools and data types, a separate data versioning logic will be necessary.
Data Scientists and AI experts: Historically we have seen Data Scientists build and choose traditional ML models for their use cases. Data Scientists will typically help with training, validating, and maintaining foundation models that are optimized for data tasks.
Topics include: Agentic AI Design Patterns; LLMs & RAG for Agents; Agent Architectures & Chaining; Evaluating AI Agent Performance; Building with LangChain and LlamaIndex; Real-World Applications of Autonomous Agents. Who should attend: Data Scientists, Developers, AI Architects, and ML Engineers seeking to build cutting-edge autonomous systems.
This situation is not different in the ML world. Data Scientists and ML Engineers typically write lots and lots of code. Applying software design principles to data engineering: Dive into the integration of concrete software design principles and patterns within the realm of data engineering.
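As one small example of applying a classic design principle (the strategy pattern) to a data pipeline; the transforms below are purely illustrative:

```python
from typing import Protocol
import pandas as pd

class Transform(Protocol):
    """A single, composable step in a data pipeline (strategy pattern)."""
    def apply(self, df: pd.DataFrame) -> pd.DataFrame: ...

class DropNulls:
    def apply(self, df: pd.DataFrame) -> pd.DataFrame:
        return df.dropna()

class NormalizeColumns:
    def apply(self, df: pd.DataFrame) -> pd.DataFrame:
        return df.rename(columns=str.lower)

def run_pipeline(df: pd.DataFrame, steps: list[Transform]) -> pd.DataFrame:
    # Each step depends only on the Transform interface, so steps can be
    # added, removed, or reordered without touching the pipeline runner.
    for step in steps:
        df = step.apply(df)
    return df

clean = run_pipeline(pd.DataFrame({"A": [1, None]}), [DropNulls(), NormalizeColumns()])
print(clean)
```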
Over the years, businesses have increasingly turned to Snowflake AI Data Cloud for various use cases beyond just data analytics and business intelligence. From data engineering and machine learning to real-time data processing, Snowflake has become a central hub for organizations seeking to unify and leverage their data at scale.
And we at deployr worked alongside them to find the best possible answers for everyone involved and build their Data and ML Pipelines. Building data and ML pipelines: from the ground to the cloud. It was the beginning of 2022, and things were looking bright after the lockdown’s end.
Data scientists and data engineers want full control over every aspect of their machine learning solutions and want coding interfaces so that they can use their favorite libraries and languages. At the same time, business and data analysts want to access intuitive, point-and-click tools that use automated best practices.
Both computer scientists and business leaders have taken note of the potential of the data. Machine learning (ML), a subset of artificial intelligence (AI), is an important piece of data-driven innovation. MLOps is the next evolution of data analysis and deep learning. What is MLOps?