Blog, Data Pipeline and ML - Data Science Current

Lakehouse Monitoring: A Unified Solution for Quality of Data and AI

databricks

DECEMBER 12, 2023

Introduction Databricks Lakehouse Monitoring allows you to monitor all your data pipelines – from data to features to ML models – without additional too.

Data Pipeline

Data Pipeline ML ML AI

10 Technical Blogs for Data Scientists to Advance AI/ML Skills

DataRobot Blog

DECEMBER 6, 2022

With a goal to help data science teams learn about the application of AI and ML, DataRobot shares helpful, educational blogs based on work with the world’s most strategic companies. Explore these 10 popular blogs that help data scientists drive better data decisions. Read the blog. Read the blog.

Data Scientist

Data Scientist ML ML AI

Enhanced observability for AWS Trainium and AWS Inferentia with Datadog

AWS Machine Learning Blog

NOVEMBER 26, 2024

With the increasing use of large models, requiring a large number of accelerated compute instances, observability plays a critical role in ML operations, empowering you to improve performance, diagnose and fix failures, and optimize resource utilization. Anjali Thatte is a Product Manager at Datadog.

AWS

AWS ML ML Data Pipeline

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

From data processing to quick insights, robust pipelines are a must for any ML system. Often the Data Team, comprising Data and ML Engineers , needs to build this infrastructure, and this experience can be painful. However, efficient use of ETL pipelines in ML can help make their life much easier.

ETL

ETL Data Pipeline ML ML

The power of remote engine execution for ETL/ELT data pipelines

IBM Journey to AI blog

MAY 15, 2024

Data engineers build data pipelines, which are called data integration tasks or jobs, as incremental steps to perform data operations and orchestrate these data pipelines in an overall workflow. Organizations can harness the full potential of their data while reducing risk and lowering costs.

Data Pipeline

Data Pipeline ETL SQL Database

The ultimate guide to the Machine Learning Model Deployment

Data Science Dojo

JULY 5, 2023

Machine Learning (ML) is a powerful tool that can be used to solve a wide variety of problems. Getting your ML model ready for action: This stage involves building and training a machine learning model using efficient machine learning algorithms. Training and validation: The next step is to train the model on a subset of the data.

Machine Learning

Machine Learning Machine Learning EDA Exploratory Data Analysis

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

AWS Machine Learning Blog

OCTOBER 24, 2024

Machine learning (ML) helps organizations to increase revenue, drive business growth, and reduce costs by optimizing core business functions such as supply and demand forecasting, customer churn prediction, credit risk scoring, pricing, predicting late shipments, and many others. Let’s learn about the services we will use to make this happen.

Data Warehouse

Data Warehouse Machine Learning Machine Learning Cloud Data

Real value, real time: Production AI with Amazon SageMaker and Tecton

AWS Machine Learning Blog

DECEMBER 4, 2024

Businesses are under pressure to show return on investment (ROI) from AI use cases, whether predictive machine learning (ML) or generative AI. Only 54% of ML prototypes make it to production, and only 5% of generative AI use cases make it to production. Using SageMaker, you can build, train and deploy ML models.

ML

ML ML AWS AI

Harness the power of AI and ML using Splunk and Amazon SageMaker Canvas

AWS Machine Learning Blog

AUGUST 12, 2024

Instead, organizations are increasingly looking to take advantage of transformative technologies like machine learning (ML) and artificial intelligence (AI) to deliver innovative products, improve outcomes, and gain operational efficiencies at scale. Data is presented to the personas that need access using a unified interface.

ML

ML ML AWS AI

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Flipboard

AUGUST 17, 2023

Amazon Redshift is the most popular cloud data warehouse that is used by tens of thousands of customers to analyze exabytes of data every day. SageMaker Studio is the first fully integrated development environment (IDE) for ML. The next step is to build ML models using features selected from one or multiple feature groups.

ML

ML ML AWS Data Warehouse

Use Snowflake as a data source to train ML models with Amazon SageMaker

AWS Machine Learning Blog

MARCH 8, 2023

Amazon SageMaker is a fully managed machine learning (ML) service. With SageMaker, data scientists and developers can quickly and easily build and train ML models, and then directly deploy them into a production-ready hosted environment. We add this data to Snowflake as a new table.

ML

ML ML AWS Python

Build an ML Inference Data Pipeline using SageMaker and Apache Airflow

Mlearning.ai

APRIL 6, 2023

Automate and streamline our ML inference pipeline with SageMaker and Airflow Building an inference data pipeline on large datasets is a challenge many companies face. The Batch job automatically launches an ML compute instance, deploys the model, and processes the input data in batches, producing the output predictions.

Data Pipeline

Data Pipeline ML ML AWS

How Dataiku and Snowflake Strengthen the Modern Data Stack

phData

NOVEMBER 4, 2024

Snowflake excels in efficient data storage and governance, while Dataiku provides the tooling to operationalize advanced analytics and machine learning models. Together they create a powerful, flexible, and scalable foundation for modern data applications.

Machine Learning

Machine Learning Machine Learning Data Science ML

Boost your MLOps efficiency with these 6 must-have tools and platforms

Data Science Dojo

FEBRUARY 20, 2023

In this blog, we’ll show you how to boost your MLOps efficiency with 6 essential tools and platforms. Machine learning (ML) is the technology that automates tasks and provides insights. Machine learning (ML) is the technology that automates tasks and provides insights. It also has ML algorithms built into the platform.

Machine Learning

Machine Learning Machine Learning AWS Azure

How to Build Effective Data Pipelines in Snowpark

phData

AUGUST 6, 2024

As today’s world keeps progressing towards data-driven decisions, organizations must have quality data created from efficient and effective data pipelines. For customers in Snowflake, Snowpark is a powerful tool for building these effective and scalable data pipelines.

Data Pipeline

Data Pipeline Python Data Engineering Data Engineer

Feature Platforms?—?A New Paradigm in Machine Learning Operations (MLOps)

IBM Data Science in Practice

MARCH 8, 2023

The growth of the AI and Machine Learning (ML) industry has continued to grow at a rapid rate over recent years. Hidden Technical Debt in Machine Learning Systems More money, more problems — Rise of too many ML tools 2012 vs 2023 — Source: Matt Turck People often believe that money is the solution to a problem. Spark, Flink, etc.)

Machine Learning

Machine Learning Machine Learning ML ML

How to establish lineage transparency for your machine learning initiatives

IBM Journey to AI blog

MAY 20, 2024

Machine learning (ML) has become a critical component of many organizations’ digital transformation strategy. From predicting customer behavior to optimizing business processes, ML algorithms are increasingly being used to make decisions that impact business outcomes.

Machine Learning

Machine Learning Machine Learning Data Scientist ML

Build Data Pipelines: Comprehensive Step-by-Step Guide

Pickl AI

JULY 8, 2024

Summary: This blog explains how to build efficient data pipelines, detailing each step from data collection to final delivery. Introduction Data pipelines play a pivotal role in modern data architecture by seamlessly transporting and transforming raw data into valuable insights.

Data Pipeline

Data Pipeline Data Quality Database Apache Kafka

ML Collaboration: Best Practices From 4 ML Teams

The MLOps Blog

DECEMBER 28, 2022

The onset of the pandemic has triggered a rapid increase in the demand and adoption of ML technology. Building ML team Following the surge in ML use cases that have the potential to transform business, the leaders are making a significant investment in ML collaboration, building teams that can deliver the promise of machine learning.

ML

ML ML Data Scientist Machine Learning

10 highest-paying AI jobs and careers in 2024

Data Science Dojo

APRIL 16, 2024

In this blog, we will explore the top 10 AI jobs and careers that are also the highest-paying opportunities for individuals in 2024. Machine learning (ML) engineer Potential pay range – US$82,000 to 160,000/yr Machine learning engineers are the bridge between data science and engineering.

AI

AI AI Machine Learning Machine Learning

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

Alignment to other tools in the organization’s tech stack Consider how well the MLOps tool integrates with your existing tools and workflows, such as data sources, data engineering platforms, code repositories, CI/CD pipelines, monitoring systems, etc. and Pandas or Apache Spark DataFrames.

Machine Learning

Machine Learning Machine Learning ML ML

AWS Machine Learning: A Beginner’s Guide

How to Learn Machine Learning

DECEMBER 24, 2024

You can easily: Store and process data using S3 and RedShift Create data pipelines with AWS Glue Deploy models through API Gateway Monitor performance with CloudWatch Manage access control with IAM This integrated ecosystem makes it easier to build end-to-end machine learning solutions.

Machine Learning

Machine Learning Machine Learning AWS ML

Discovering the Role of Data Science in a Cloud World

Pickl AI

DECEMBER 26, 2024

Advancements in data processing, storage, and analysis technologies power this transformation. In Data Science in a Cloud World, we explore how cloud computing has revolutionised Data Science. Key Features Tailored for Data Science These platforms offer specialised features to enhance productivity.

Data Science

Data Science Cloud Computing Machine Learning Machine Learning

ML Days in Tashkent — Day 3: Demos and Workshops

PyImageSearch

DECEMBER 18, 2023

This format made for a fast-paced and diverse showcase of ideas and applications in AI and ML. In just 3 minutes, each participant managed to highlight the core of their work, offering insights into the innovative ways in which AI and ML are being applied across various fields. But again, stick around for a surprise demo at the end. ?

ML

ML ML Deep Learning Deep Learning

Streamlining Process Configuration in Machine Learning with Hydra

Pickl AI

NOVEMBER 29, 2024

It enhances scalability, experimentation, and reproducibility, allowing ML teams to focus on innovation. This blog highlights the importance of organised, flexible configurations in ML workflows and introduces Hydra. It also simplifies managing configuration dependencies in Deep Learning projects and large-scale data pipelines.

Machine Learning

Machine Learning Machine Learning ML ML

Build generative AI applications quickly with Amazon Bedrock IDE in Amazon SageMaker Unified Studio

AWS Machine Learning Blog

DECEMBER 4, 2024

Building generative AI applications presents significant challenges for organizations: they require specialized ML expertise, complex infrastructure management, and careful orchestration of multiple services. An expert in AI/ML and generative AI, Ameer helps customers unlock the potential of these cutting-edge technologies.

AWS

AWS AI AI SQL

How Kakao Games automates lifetime value prediction from game data using Amazon SageMaker and AWS Glue

AWS Machine Learning Blog

MARCH 1, 2023

Statistical methods and machine learning (ML) methods are actively developed and adopted to maximize the LTV. In this post, we share how Kakao Games and the Amazon Machine Learning Solutions Lab teamed up to build a scalable and reliable LTV prediction solution by using AWS data and ML services such as AWS Glue and Amazon SageMaker.

AWS

AWS ML ML ETL

Accelerate disaster response with computer vision for satellite imagery using Amazon SageMaker and Amazon Augmented AI

AWS Machine Learning Blog

FEBRUARY 24, 2023

AWS recently released Amazon SageMaker geospatial capabilities to provide you with satellite imagery and geospatial state-of-the-art machine learning (ML) models, reducing barriers for these types of use cases. For more information, refer to Preview: Use Amazon SageMaker to Build, Train, and Deploy ML Models Using Geospatial Data.

ML

ML ML AWS Data Pipeline

Performance Benefits of Snowpark for ML Workloads

phData

MARCH 22, 2023

As companies continue to adopt machine learning (ML) in their workflows, the demand for scalable and efficient tools has increased. In this blog post, we will explore the performance benefits of Snowpark for ML workloads and how it can help businesses make better use of their data. Want to learn more? Can’t wait?

ML

ML ML Machine Learning Machine Learning

AIOps vs. MLOps: Harnessing big data for “smarter” ITOPs

IBM Journey to AI blog

AUGUST 12, 2024

Instead, businesses tend to rely on advanced tools and strategies—namely artificial intelligence for IT operations (AIOps) and machine learning operations (MLOps)—to turn vast quantities of data into actionable insights that can improve IT decision-making and ultimately, the bottom line.

Big Data

Big Data Big Data ML ML

OfferUp improved local results by 54% and relevance recall by 27% with multimodal search on Amazon Bedrock and Amazon OpenSearch Service

AWS Machine Learning Blog

FEBRUARY 5, 2025

In this two-part blog post series, we explore the key opportunities OfferUp embraced on their journey to boost and transform their existing search solution from traditional lexical search to modern multimodal search powered by Amazon Bedrock and Amazon OpenSearch Service.

K-nearest Neighbors

K-nearest Neighbors Machine Learning Machine Learning Database

Supercharging Your Data Pipeline with Apache Airflow (Part 2)

Heartbeat

NOVEMBER 6, 2023

Image Source — Pixel Production Inc In the previous article, you were introduced to the intricacies of data pipelines, including the two major types of existing data pipelines. You might be curious how a simple tool like Apache Airflow can be powerful for managing complex data pipelines.

Data Pipeline

Data Pipeline Clean Data ETL Python

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

AWS Machine Learning Blog

APRIL 19, 2023

Since 2018, our team has been developing a variety of ML models to enable betting products for NFL and NCAA football. After reading a few blog posts and DJL’s official documentation, we were sure DJL would provide the best solution to our problem. Business requirements We are the US squad of the Sportradar AI department.

ML

ML ML Deep Learning Deep Learning

Managing Dataset Versions in Long-Term ML Projects

The MLOps Blog

MARCH 20, 2023

Long-term ML project involves developing and sustaining applications or systems that leverage machine learning models, algorithms, and techniques. An example of a long-term ML project will be a bank fraud detection system powered by ML models and algorithms for pattern recognition. 2 Ensuring and maintaining high-quality data.

ML

ML ML Machine Learning Machine Learning

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

OMRONs data strategyrepresented on ODAPalso allowed the organization to unlock generative AI use cases focused on tangible business outcomes and enhanced productivity. Xinyi Zhou is a Data Engineer at Omron Europe, bringing her expertise to the ODAP team led by Emrah Kaya.

AWS

AWS Data Governance Data Silos SQL

Implementing MLOps: 5 Key Steps for Successfully Managing ML Projects

Iguazio

JULY 31, 2023

MLOps accelerates the ML model deployment process to make it more efficient and scalable. In this blog post, we detail the steps you need to take to build and run a successful MLOps pipeline. An extension of DevOps, MLOps streamlines and monitors ML workflows. MLOps pipelines support a production-first approach.

ML

ML ML Machine Learning Machine Learning

Organizing ML Monorepo With Pants

The MLOps Blog

AUGUST 4, 2023

Situations described above arise way too often in ML teams, and their consequences vary from a single developer’s annoyance to the team’s inability to ship their code as needed. Let’s dive into the world of monorepos, an architecture widely adopted in major tech companies like Google, and how they can enhance your ML workflows.

ML

ML ML Machine Learning Machine Learning

Fine tune a generative AI application for Amazon Bedrock using Amazon SageMaker Pipeline decorators

AWS Machine Learning Blog

AUGUST 22, 2024

This makes managing and deploying these updates across a large-scale deployment pipeline while providing consistency and minimizing downtime a significant undertaking. Generative AI applications require continuous ingestion, preprocessing, and formatting of vast amounts of data from various sources.

ML

ML ML AI AI

Building an efficient MLOps platform with OSS tools on Amazon ECS with AWS Fargate

AWS Machine Learning Blog

SEPTEMBER 18, 2024

The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Hosted on Amazon ECS with tasks run on Fargate, this platform streamlines the end-to-end ML workflow, from data ingestion to model deployment.

AWS

AWS Machine Learning Machine Learning ML

Designing generative AI workloads for resilience

AWS Machine Learning Blog

FEBRUARY 1, 2024

Data pipelines In cases where you need to provide contextual data to the foundation model using the RAG pattern, you need a data pipeline that can ingest the source data, convert it to embedding vectors, and store the embedding vectors in a vector database.

AWS

AWS AI AI Database

What Lays Ahead in 2024? AI/ML Predictions for the New Year

Iguazio

DECEMBER 18, 2023

For data science practitioners, productization is key, just like any other AI or ML technology. However, it's important to contextualize generative AI within the broader landscape of AI and ML technologies. By thinking about the ML process in advance: preparing, managing, and versioning data, reusing components, etc.,

ML

ML ML AI AI

How HR Tech Company Sense Scaled their ML Operations using Iguazio

Iguazio

JANUARY 16, 2024

Since AI is a central pillar of their value offering, Sense has invested heavily in a robust engineering organization including a large number of data and AI professionals. This includes a data team, an analytics team, DevOps, AI/ML, and a data science team. Gennaro Frazzingaro, Head of AI/ML at Sense.

ML

ML ML DataOps Data Scientist

How Sense Uses Iguazio as a Key Component of Their ML Stack

Iguazio

JANUARY 16, 2024

This includes a data team, an analytics team, DevOps, AI/ML, and a data science team. The AI/Ml team is made up of ML engineers, data scientists and backend product engineers. With Iguazio, Sense’s data professionals can pull data, analyze it, train and run experiments.

ML

ML ML DataOps Data Scientist

Mastering ML Model Performance: Best Practices for Optimal Results

Iguazio

JUNE 25, 2023

Evaluating ML model performance is essential for ensuring the reliability, quality, accuracy and effectiveness of your ML models. In this blog post, we dive into all aspects of ML model performance: which metrics to use to measure performance, best practices that can help and where MLOps fits in.

ML

ML ML Clustering Cross Validation

Lakehouse Monitoring: A Unified Solution for Quality of Data and AI

10 Technical Blogs for Data Scientists to Advance AI/ML Skills

Webinars

Trending Sources

Enhanced observability for AWS Trainium and AWS Inferentia with Datadog

Webinars

How to Build ETL Data Pipeline in ML

The power of remote engine execution for ETL/ELT data pipelines

The ultimate guide to the Machine Learning Model Deployment

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

Real value, real time: Production AI with Amazon SageMaker and Tecton

Harness the power of AI and ML using Splunk and Amazon SageMaker Canvas

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Use Snowflake as a data source to train ML models with Amazon SageMaker

Build an ML Inference Data Pipeline using SageMaker and Apache Airflow

How Dataiku and Snowflake Strengthen the Modern Data Stack

Boost your MLOps efficiency with these 6 must-have tools and platforms

How to Build Effective Data Pipelines in Snowpark

Feature Platforms?—?A New Paradigm in Machine Learning Operations (MLOps)

How to establish lineage transparency for your machine learning initiatives

Build Data Pipelines: Comprehensive Step-by-Step Guide

ML Collaboration: Best Practices From 4 ML Teams

10 highest-paying AI jobs and careers in 2024

MLOps Landscape in 2023: Top Tools and Platforms

AWS Machine Learning: A Beginner’s Guide

Discovering the Role of Data Science in a Cloud World

ML Days in Tashkent — Day 3: Demos and Workshops

Streamlining Process Configuration in Machine Learning with Hydra

Build generative AI applications quickly with Amazon Bedrock IDE in Amazon SageMaker Unified Studio

How Kakao Games automates lifetime value prediction from game data using Amazon SageMaker and AWS Glue

Accelerate disaster response with computer vision for satellite imagery using Amazon SageMaker and Amazon Augmented AI

Performance Benefits of Snowpark for ML Workloads

AIOps vs. MLOps: Harnessing big data for “smarter” ITOPs

OfferUp improved local results by 54% and relevance recall by 27% with multimodal search on Amazon Bedrock and Amazon OpenSearch Service

Supercharging Your Data Pipeline with Apache Airflow (Part 2)

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

Managing Dataset Versions in Long-Term ML Projects

Shaping the future: OMRON’s data-driven journey with AWS

Implementing MLOps: 5 Key Steps for Successfully Managing ML Projects

Organizing ML Monorepo With Pants

Fine tune a generative AI application for Amazon Bedrock using Amazon SageMaker Pipeline decorators

Building an efficient MLOps platform with OSS tools on Amazon ECS with AWS Fargate

Designing generative AI workloads for resilience

What Lays Ahead in 2024? AI/ML Predictions for the New Year

How HR Tech Company Sense Scaled their ML Operations using Iguazio

How Sense Uses Iguazio as a Key Component of Their ML Stack

Mastering ML Model Performance: Best Practices for Optimal Results

Stay Connected