AI, Data Pipeline and Data Quality - Data Science Current

Innovations in Analytics: Elevating Data Quality with GenAI

Towards AI

OCTOBER 31, 2024

Last updated on October 31, 2024 by Editorial Team. Author(s): Jonas Dieckmann. Originally published on Towards AI. Data analytics has become a key driver of commercial success in recent years. The ability to turn large data sets into actionable insights can mean the difference between a successful campaign and a missed opportunity.

Data Quality

Data Quality Analytics Clean Data

Securing the data pipeline, from blockchain to AI

Dataconomy

OCTOBER 8, 2024

Almost every tech company today is up to its neck in generative AI, with Google focused on enhancing search, Microsoft betting the house on business productivity gains with its family of copilots, and startups like Runway AI and Stability AI going all-in on video and image creation. Why is data integrity important?

Data Pipeline

Data Pipeline AI Data Warehouse

Real value, real time: Production AI with Amazon SageMaker and Tecton

AWS Machine Learning Blog

DECEMBER 4, 2024

Businesses are under pressure to show return on investment (ROI) from AI use cases, whether predictive machine learning (ML) or generative AI. Only 54% of ML prototypes make it to production, and only 5% of generative AI use cases make it to production. This post is cowritten with Isaac Cameron and Alex Gnibus from Tecton.

ML

ML AWS AI

The power of remote engine execution for ETL/ELT data pipelines

IBM Journey to AI blog

MAY 15, 2024

Business leaders risk compromising their competitive edge if they do not proactively implement generative AI (gen AI). However, businesses scaling AI face entry barriers. This situation will exacerbate data silos, increase costs and complicate the governance of AI and data workloads.

Data Pipeline

Data Pipeline ETL SQL Database

Enhanced observability for AWS Trainium and AWS Inferentia with Datadog

AWS Machine Learning Blog

NOVEMBER 26, 2024

AWS AI chips, Trainium and Inferentia, enable you to build and deploy generative AI models at higher performance and lower cost. The Datadog dashboard offers a detailed view of your AWS AI chip (Trainium or Inferentia) performance, such as the number of instances, availability, and AWS Region.

AWS

AWS ML Data Pipeline

Data Integration for AI: Top Use Cases and Steps for Success

Precisely

FEBRUARY 20, 2025

Key Takeaways: Trusted data is critical for AI success. Data integration ensures your AI initiatives are fueled by complete, relevant, and real-time enterprise data, minimizing errors and unreliable outcomes that could harm your business. Data integration solves key business challenges.

Data Silos

Data Silos AI Data Quality

Building Robust Data Pipelines: 9 Fundamentals and Best Practices to Follow

Alation

MAY 16, 2023

But with the sheer amount of data continually increasing, how can a business make sense of it? The answer? Robust data pipelines. What is a Data Pipeline? A data pipeline is a series of processing steps that move data from its source to its destination.

Data Pipeline

Data Pipeline Data Governance Data Lakes Data Warehouse
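
The definition quoted in the teaser above (a series of processing steps that move data from its source to its destination) can be illustrated with a minimal sketch. It is not taken from the Alation article: the step names, the `amount` field, and the in-memory list standing in for a destination are all assumptions chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict
Step = Callable[[Iterator[Record]], Iterator[Record]]


def read_source(rows: Iterable[Record]) -> Iterator[Record]:
    """Source step: yield raw records one at a time."""
    yield from rows


def drop_incomplete(records: Iterator[Record]) -> Iterator[Record]:
    """Processing step: skip records whose hypothetical 'amount' field is empty."""
    for record in records:
        if record.get("amount") not in (None, ""):
            yield record


def to_float_amount(records: Iterator[Record]) -> Iterator[Record]:
    """Processing step: convert 'amount' from text to a float."""
    for record in records:
        yield {**record, "amount": float(record["amount"])}


def run_pipeline(rows: Iterable[Record], steps: List[Step], destination: List[Record]) -> List[Record]:
    """Chain the steps lazily, then load the result into the destination."""
    stream = read_source(rows)
    for step in steps:
        stream = step(stream)
    destination.extend(stream)
    return destination


# Illustrative input; a plain list stands in for a warehouse table.
raw_rows = [
    {"id": 1, "amount": "10.5"},
    {"id": 2, "amount": ""},
    {"id": 3, "amount": "4"},
]
warehouse = run_pipeline(raw_rows, [drop_incomplete, to_float_amount], [])
```

Each step is a generator, so records flow through one at a time rather than being materialized between steps; a real pipeline would replace the list source and destination with connectors to actual systems.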

Data integrity vs. data quality: Is there a difference?

IBM Journey to AI blog

JULY 13, 2023

When we talk about data integrity, we’re referring to the overarching completeness, accuracy, consistency, accessibility, and security of an organization’s data. Together, these factors determine the reliability of the organization’s data. Data quality is essentially the measure of data integrity.

Data Quality

Data Quality Data Profiling Data Governance Machine Learning

5 Data Quality Best Practices

Precisely

SEPTEMBER 30, 2024

Key Takeaways: By deploying technologies that can learn and improve over time, companies that embrace AI and machine learning can achieve significantly better results from their data quality initiatives. Here are five data quality best practices on which business leaders should focus.

Data Quality

Data Quality Data Governance Machine Learning

Build Data Pipelines: Comprehensive Step-by-Step Guide

Pickl AI

JULY 8, 2024

Summary: This blog explains how to build efficient data pipelines, detailing each step from data collection to final delivery. Introduction: Data pipelines play a pivotal role in modern data architecture by seamlessly transporting and transforming raw data into valuable insights.

Data Pipeline

Data Pipeline Data Quality Database Apache Kafka

Data Quality in Machine Learning

Pickl AI

JULY 24, 2024

Summary: Data quality is a fundamental aspect of Machine Learning. Poor-quality data leads to biased and unreliable models, while high-quality data enables accurate predictions and insights. What is Data Quality in Machine Learning? Bias in data can result in unfair and discriminatory outcomes.

Data Quality

Data Quality Machine Learning Clean Data

Data Fabric and Address Verification Interface

IBM Data Science in Practice

NOVEMBER 28, 2022

Implementing a data fabric architecture is the answer. What is a data fabric? Data fabric is defined by IBM as “an architecture that facilitates the end-to-end integration of various data pipelines and cloud environments through the use of intelligent and automated systems.”

Data Pipeline

Data Pipeline Data Quality Data Preparation Data Governance

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

At the heart of this transformation is the OMRON Data & Analytics Platform (ODAP), an innovative initiative designed to revolutionize how the company harnesses its data assets. The robust security features provided by Amazon S3, including encryption and durability, were used to provide data protection.

AWS

AWS Data Governance Data Silos SQL

Data architecture strategy for data quality

IBM Journey to AI blog

JANUARY 5, 2023

Poor data quality is one of the top barriers faced by organizations aspiring to be more data-driven. Ill-timed business decisions and misinformed business processes, missed revenue opportunities, failed business initiatives and complex data systems can all stem from data quality issues.

Data Quality

Data Quality Data Lakes Data Warehouse Big Data

Data Quality Framework: What It Is, Components, and Implementation

DagsHub

AUGUST 23, 2024

As such, the quality of their data can make or break the success of the company. This article will guide you through the concept of a data quality framework, its essential components, and how to implement it effectively within your organization. What is a data quality framework?

Data Quality

Data Quality Data Governance Machine Learning

Unfolding the difference between Data Observability and Data Quality

Pickl AI

OCTOBER 10, 2023

In this blog, we are going to unfold two key aspects of data management: Data Observability and Data Quality. Data is the lifeblood of the digital age. Today, every organization tries to explore the significant aspects of data and its applications.

Data Observability

Data Observability Data Quality Data Governance Data Pipeline

Why You Need Data Observability to Improve Data Quality

Precisely

MAY 4, 2023

Systems and data sources are more interconnected than ever before. A broken data pipeline might bring operational systems to a halt, or it could cause executive dashboards to fail, reporting inaccurate KPIs to top management. Is your data governance structure up to the task? Read What Is Data Observability?

Data Observability

Data Observability Data Quality Data Pipeline Machine Learning

Supercharge your data strategy: Integrate and innovate today leveraging data integration

IBM Journey to AI blog

OCTOBER 22, 2024

Data is the differentiator as business leaders look to utilize their competitive edge as they implement generative AI (gen AI). Leaders feel the pressure to infuse their processes with artificial intelligence (AI) and are looking for ways to harness the insights in their data platforms to fuel this movement.

Data Silos

Data Silos Data Pipeline DataOps Business Intelligence

What is Snowflake’s Data Quality Monitoring Feature and How is it Used?

phData

OCTOBER 25, 2024

“Quality over Quantity” is a phrase we hear regularly in life, but when it comes to the world of data, we often fail to adhere to this rule. Data Quality Monitoring implements quality checks in operational data processes to ensure that the data meets pre-defined standards and business rules.

Data Quality

Data Quality Data Pipeline Data Governance Database
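
The idea described in the teaser above, quality checks that test data against pre-defined standards and business rules, can be sketched in plain Python. This is not Snowflake's feature or API: the `Rule` class, the rule names, and the sample order fields are hypothetical and only illustrate the general pattern.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Rule:
    """A named business rule; check() returns True when a row passes."""
    name: str
    check: Callable[[dict], bool]


# Hypothetical rules for an illustrative orders data set.
RULES = [
    Rule("order_id is present", lambda row: row.get("order_id") is not None),
    Rule(
        "quantity is positive",
        lambda row: isinstance(row.get("quantity"), int) and row["quantity"] > 0,
    ),
    Rule("currency is supported", lambda row: row.get("currency") in {"USD", "EUR"}),
]


def evaluate(rows: Iterable[dict], rules: List[Rule]) -> Dict[str, int]:
    """Return a mapping of rule name to the number of rows that violate it."""
    failures = {rule.name: 0 for rule in rules}
    for row in rows:
        for rule in rules:
            if not rule.check(row):
                failures[rule.name] += 1
    return failures


sample_rows = [
    {"order_id": 1, "quantity": 2, "currency": "USD"},
    {"order_id": None, "quantity": 1, "currency": "EUR"},
    {"order_id": 3, "quantity": 0, "currency": "GBP"},
]
report = evaluate(sample_rows, RULES)
```

Keeping rules as data rather than inline `if` statements makes it straightforward to report violations per rule and to add or retire rules without touching the evaluation loop.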

McKinsey QuantumBlack on automating data quality remediation with AI

Snorkel AI

JUNE 22, 2023

Jacomo Corbo is a Partner and Chief Scientist, and Bryan Richardson is an Associate Partner and Senior Data Scientist, for QuantumBlack AI by McKinsey. They presented “Automating Data Quality Remediation With AI” at Snorkel AI’s The Future of Data-Centric AI Summit in 2022.

Data Quality

Data Quality ML AI

Feature Platforms - A New Paradigm in Machine Learning Operations (MLOps)

IBM Data Science in Practice

MARCH 8, 2023

The United States published a Blueprint for an AI Bill of Rights. The AI and Machine Learning (ML) industry has continued to grow at a rapid rate over recent years. (Source: A Chat with Andrew on MLOps: From Model-centric to Data-centric AI.) So how does this data-centric approach fit in with Machine Learning? Features.

Machine Learning

Machine Learning ML

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

With Azure Machine Learning, data scientists can leverage pre-built models, automate machine learning tasks, and seamlessly integrate with other Azure services, making it an efficient and scalable solution for machine learning projects in the cloud. Might be useful Unlike manual, homegrown, or open-source solutions, neptune.ai

Machine Learning

Machine Learning ML

Step-by-step guide: Generative AI for your business

IBM Journey to AI blog

JULY 30, 2024

Generative artificial intelligence (gen AI) is transforming the business world by creating new opportunities for innovation, productivity and efficiency. This guide offers a clear roadmap for businesses to begin their gen AI journey. Most teams should include at least four types of team members.

AI

AI Data Scientist Data Preparation

Building a Capability Roadmap: The Maturity Stages of Data & AI

ODSC - Open Data Science

MAY 15, 2023

Enterprises spend an average of $15M annually on data & AI initiatives. Yet, last year, 90% of AI investments by enterprises saw zero return, according to VentureBeat. This means a lot of money and effort is going into advancing data & AI capabilities, but companies are still struggling to see the business value.

AI

AI Data Quality Data Pipeline

The importance of data ingestion and integration for enterprise AI

IBM Journey to AI blog

JANUARY 9, 2024

The emergence of generative AI prompted several prominent companies to restrict its use because of the mishandling of sensitive internal data. According to CNN, some companies imposed internal bans on generative AI tools while they sought to better understand the technology, and many also blocked internal use of ChatGPT.

AI

AI Data Quality Data Pipeline

With generative AI, don’t believe the hype (or the anti-hype)

IBM Journey to AI blog

SEPTEMBER 3, 2024

No technology in human history has seen as much interest in such a short time as generative AI (gen AI). How might generative AI achieve this? Because of the wide-ranging applications and complexity of generative AI, many media reports might lead readers to believe that the technology is an almost magical cure-all.

AI

AI Algorithm Artificial Intelligence

Harness the power of AI and ML using Splunk and Amazon SageMaker Canvas

AWS Machine Learning Blog

AUGUST 12, 2024

Instead, organizations are increasingly looking to take advantage of transformative technologies like machine learning (ML) and artificial intelligence (AI) to deliver innovative products, improve outcomes, and gain operational efficiencies at scale. To facilitate this, an automated data engineering pipeline is built using AWS Step Functions.

ML

ML AWS AI

How the right data and AI foundation can empower a successful ESG strategy

IBM Journey to AI blog

APRIL 10, 2023

A well-designed data architecture should support business intelligence and analysis, automation, and AI—all of which can help organizations to quickly seize market opportunities, build customer value, drive major efficiencies, and respond to risks such as supply chain disruptions.

AI

AI Data Governance Data Pipeline

Gain an AI Advantage with Data Governance and Quality

Precisely

AUGUST 29, 2024

Key Takeaways: Data quality ensures your data is accurate, complete, reliable, and up to date – powering AI conclusions that reduce costs and increase revenue and compliance. Data observability continuously monitors data pipelines and alerts you to errors and anomalies.

Data Governance

Data Governance Data Quality Data Observability AI

AI-Powered Digital Transformation: Get Your Data and AI Ready

Precisely

AUGUST 15, 2024

Key Takeaways: Leverage AI to achieve digital transformation goals: enhanced efficiency, decision-making, customer experiences, and more. Address common challenges in managing SAP master data by using AI tools to automate SAP processes and ensure data quality. This involves various professionals.

AI

AI Data Quality Data Engineering

Reducing hallucinations in LLM agents with a verified semantic cache using Amazon Bedrock Knowledge Bases

AWS Machine Learning Blog

FEBRUARY 21, 2025

Whether you're new to AI development or an experienced practitioner, this post provides step-by-step guidance and code examples to help you build more reliable AI applications. Chaithanya Maisagoni is a Senior Software Development Engineer (AI/ML) in Amazon's Worldwide Returns and ReCommerce organization.

AWS

AWS Natural Language Processing Machine Learning

Data Observability Tools and Its Key Applications

Pickl AI

OCTOBER 11, 2023

Data Observability and Data Quality are two key aspects of data management. The focus of this blog is going to be on Data Observability tools and their key framework. The growing landscape of technology has motivated organizations to adopt newer ways to harness the power of data. What is Data Observability?

Data Observability

Data Observability Data Quality Data Pipeline Data Governance

Your Guide to Unlocking Trusted AI with Reliable Data

Precisely

MARCH 4, 2024

What if every decision, recommendation, and prediction made by artificial intelligence (AI) was as reliable as your most trusted team members? This isn’t a distant dream – it’s a tangible reality with trusted AI. But how can you make sure your AI can be trusted? Remember some of the newsworthy AI mishaps of 2023?

AI

AI Data Quality Artificial Intelligence

How data stores and governance impact your AI initiatives

IBM Journey to AI blog

OCTOBER 12, 2023

But the implementation of AI is only one piece of the puzzle. The tasks behind efficient, responsible AI lifecycle management: The continuous application of AI and the ability to benefit from its ongoing use require the persistent management of a dynamic and intricate AI lifecycle, and doing so efficiently and responsibly.

AI

AI Data Scientist Data Governance

Data observability: The missing piece in your data integration puzzle

IBM Journey to AI blog

SEPTEMBER 2, 2024

Historically, data engineers have often prioritized building data pipelines over comprehensive monitoring and alerting. Delivering projects on time and within budget often took precedence over long-term data health. Even if you can spot the issue, it becomes a challenge to identify the origin of the data quality problem.

Data Observability

Data Observability Data Pipeline Data Engineering Data Engineer

Future-Proofing Your App: Strategies for Building Long-Lasting Apps

Iguazio

MAY 29, 2024

The generative AI industry is changing fast. To ensure AI applications remain relevant, effective, secure and capable of delivering value, teams need to keep up with the latest research, technological developments and potential use cases. The 4 Gen AI Architecture Pipelines The four pipelines are: 1.

Data Pipeline

Data Pipeline AI ML

Use Amazon DocumentDB to build no-code machine learning solutions in Amazon SageMaker Canvas

AWS Machine Learning Blog

DECEMBER 15, 2023

We are excited to announce the launch of Amazon DocumentDB (with MongoDB compatibility) integration with Amazon SageMaker Canvas, allowing Amazon DocumentDB customers to build and use generative AI and machine learning (ML) solutions without writing code. Analyze data using generative AI. Prepare data for machine learning.

Machine Learning

Machine Learning AWS ML

Comparing Tools For Data Processing Pipelines

The MLOps Blog

MARCH 15, 2023

In this post, you will learn about the 10 best data pipeline tools, their pros, cons, and pricing. A typical data pipeline involves the following steps or processes through which the data passes before being consumed by a downstream process, such as an ML model training process.

Data Pipeline

Data Pipeline ETL SQL Data Quality

Scale knowledge management use cases with generative AI

IBM Journey to AI blog

JULY 27, 2023

According to IBM’s Institute of Business Value (IBV), AI can contain contact center cases, enhancing customer experience by 70%. Additionally, AI can increase productivity in HR by 40% and in application modernization by 30%. Overall placing emphasis on establishing a trusted and integrated data platform for AI.

AI

AI Data Scientist Data Quality

Discover the Most Important Fundamentals of Data Engineering

Pickl AI

NOVEMBER 4, 2024

Effective data governance enhances quality and security throughout the data lifecycle. What is Data Engineering? Data Engineering is designing, constructing, and managing systems that enable data collection, storage, and analysis. ETL is vital for ensuring data quality and integrity.

Data Engineering

Data Engineering Data Engineer

What is the Pile Dataset

Pickl AI

DECEMBER 25, 2024

It integrates diverse, high-quality content from 22 sources, enabling robust AI research and development. Its accessibility and scalability make it essential for applications like text generation, summarisation, and domain-specific AI solutions. Its diverse content includes academic papers, web data, books, and code.

Natural Language Processing

Natural Language Processing Machine Learning AI
