Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines: It is a Game-Changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. They transform data into a consistent format for users to consume.
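To make the CI/CD idea concrete: pipeline transformations can be covered by automated tests that run on every commit. Below is a minimal sketch in plain pandas and pytest; the transform step and its column names are hypothetical illustrations, not AnalyticsCreator's API.

```python
# A minimal sketch of a CI-friendly pipeline test (hypothetical transform;
# not AnalyticsCreator's API). Run with `pytest` on every commit.
import pandas as pd

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Normalize column names and drop rows missing a customer id."""
    out = raw.rename(columns=str.lower)
    return out.dropna(subset=["customer_id"])

def test_transform_keeps_only_complete_rows():
    raw = pd.DataFrame({"CUSTOMER_ID": [1, None, 3], "AMOUNT": [10.0, 5.0, 7.5]})
    result = transform(raw)
    assert list(result.columns) == ["customer_id", "amount"]
    assert result["customer_id"].notna().all()
```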
The key to being truly data-driven is having access to accurate, complete, and reliable data. In fact, Gartner recently found that organizations believe […] The post How to Assess Data Quality Readiness for Modern Data Pipelines appeared first on DATAVERSITY.
Because of this, when we look to manage and govern the deployment of AI models, we must first focus on governing the data that the AI models are trained on. This data governance requires us to understand the origin, sensitivity, and lifecycle of all the data that we use.
Those who want to design universal data pipelines and ETL testing tools face a tough challenge because of the vastness and variety of technologies: each data pipeline platform embodies a unique philosophy, architectural design, and set of operations.
Suppose you’re in charge of maintaining a large set of data pipelines that move data from cloud storage or streaming sources into a data warehouse. How can you ensure that your data meets expectations after every transformation? That’s where data quality testing comes in.
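As a minimal illustration of such a check (plain pandas rather than any particular testing framework, with illustrative column names like order_id and amount), a post-transformation quality gate might look like this:

```python
import pandas as pd

def check_quality(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable failures; empty means the data passed."""
    failures = []
    if df.empty:
        failures.append("table is empty after transformation")
    if df["order_id"].duplicated().any():
        failures.append("order_id contains duplicates")
    null_rate = df["amount"].isna().mean()
    if null_rate > 0.01:
        failures.append(f"amount null rate {null_rate:.1%} exceeds 1% tolerance")
    if (df["amount"].dropna() < 0).any():
        failures.append("amount contains negative values")
    return failures
```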
In part one of this article, we discussed how data testing can specifically test a data object (e.g., table, column, metadata) at one particular point in the data pipeline.
That’s why many organizations invest in technology to improve data processes, such as a machine learning data pipeline. However, data needs to be easily accessible, usable, and secure to be useful — yet the opposite is too often the case. These data requirements could be satisfied with a strong data governance strategy.
The financial services industry has been modernizing its data governance for more than a decade. But as we inch closer to a global economic downturn, the need for top-notch governance has become increasingly urgent. That’s why data pipeline observability is so important.
Today, businesses and individuals expect instant access to information and swift delivery of services. The same expectation applies to data, […] The post Leveraging Data Pipelines to Meet the Needs of the Business: Why the Speed of Data Matters appeared first on DATAVERSITY.
Companies are spending a lot of money on data and analytics capabilities, creating more and more data products for people inside and outside the company. These products rely on a tangle of data pipelines, each a choreography of software executions transporting data from one place to another.
To build on the above, organizations should have the right foundation, consisting of a modern data governance approach and data architecture. It is becoming critical that organizations adopt a data architecture that supports AI governance. The time for data professionals to meet this challenge is now.
This trust depends on an understanding of the data that informs risk models: where does it come from, where is it being used, and what are the ripple effects of a change? Moreover, banks must stay in compliance with industry regulations like BCBS 239, which focuses on improving banks’ risk data aggregation and risk reporting capabilities.
Today’s data pipelines use transformations to convert raw data into meaningful insights. Yet ensuring the accuracy and reliability of these transformations is no small feat – the tools and methods for testing such a variety of data and transformations can be daunting.
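One lightweight approach is to unit-test each transformation against the invariants it must preserve. The sketch below assumes a hypothetical dedupe_latest transformation; it illustrates the idea rather than any specific tool:

```python
import pandas as pd

def dedupe_latest(events: pd.DataFrame) -> pd.DataFrame:
    """Keep only the most recent event per user (hypothetical transformation)."""
    return (events.sort_values("updated_at")
                  .drop_duplicates("user_id", keep="last"))

def test_dedupe_preserves_invariants():
    events = pd.DataFrame({
        "user_id": [1, 1, 2],
        "updated_at": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-01"]),
    })
    out = dedupe_latest(events)
    assert out["user_id"].is_unique                        # one row per user
    assert set(out["user_id"]) == set(events["user_id"])   # no users lost
    assert out.loc[out["user_id"] == 1, "updated_at"].iloc[0] == pd.Timestamp("2024-01-02")
```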
Data Observability and Data Quality are two key aspects of data management. The focus of this blog is going to be on Data Observability tools and their key framework. The growing landscape of technology has motivated organizations to adopt newer ways to harness the power of data. What is Data Observability?
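Before diving in, here is a rough sketch of the signals an observability framework typically tracks: freshness, volume, and schema. The loaded_at column and the plain-pandas implementation are illustrative assumptions, not a vendor tool's API:

```python
# Basic health signals for one table snapshot: freshness, volume, schema drift.
from datetime import datetime, timezone
import pandas as pd

def observe(df: pd.DataFrame, expected_columns: set[str]) -> dict:
    """Collect basic health signals; assumes loaded_at holds tz-aware timestamps."""
    return {
        "freshness_hours": (datetime.now(timezone.utc) - df["loaded_at"].max())
                           .total_seconds() / 3600,
        "row_count": len(df),
        "schema_drift": sorted(set(df.columns) ^ expected_columns),
    }

snapshot = pd.DataFrame({
    "id": [1, 2],
    "loaded_at": pd.to_datetime(["2024-05-01 00:00", "2024-05-01 06:00"], utc=True),
})
print(observe(snapshot, expected_columns={"id", "loaded_at", "amount"}))
```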
Securing AI models and their access to data: While AI models need flexibility to access data across a hybrid infrastructure, they also need safeguarding from tampering (unintentional or otherwise) and, especially, protected access to data. And that makes sense.
In this blog, we are going to unfold two key aspects of data management: Data Observability and Data Quality. Data is the lifeblood of the digital age. Today, every organization tries to explore the significant aspects of data and its applications.
This new partnership will unify governed, quality data into a single view, granting all stakeholders total visibility into pipelines and providing them with a superior ability to make data-driven decisions. For people to understand and trust data, they need to see it in context. Data Pipeline Strategy.
IBM Cloud Pak for Data Express solutions provide new clients with affordable, high-impact capabilities to expeditiously explore and validate the path to becoming a data-driven enterprise, offering a simple on-ramp to start realizing the business value of a modern architecture.
The groundwork of training data in an AI model is comparable to piloting an airplane. The entire generative AI pipeline hinges on the data pipelines that empower it, making it imperative to take the correct precautions. This may also entail working with new data through methods like web scraping or uploading.
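For instance, gathering fresh text through scraping might look like the following sketch using requests and BeautifulSoup; the URL is a placeholder, and real pipelines should respect robots.txt and site terms:

```python
# A minimal, hypothetical scraping step for collecting training text.
# example.com is a placeholder; always check a site's terms and robots.txt.
import requests
from bs4 import BeautifulSoup

def fetch_paragraphs(url: str) -> list[str]:
    """Download a page and return its visible paragraph text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    return [p.get_text(strip=True) for p in soup.find_all("p")]

docs = fetch_paragraphs("https://example.com/articles")
```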
Data teams are now tasked with designing and maintaining scalable, flexible data architecture to support a wide variety of business-critical data-driven reports and analytics. Engineering teams must maintain a complex web of ingestion pipelines capable of supporting many different sources, each with its own intricacies.
In our previous blog, Top 5 Fivetran Connectors for Financial Services, we explored Fivetran’s capabilities that address the data integration needs of the finance industry. Now, let’s cover the healthcare industry, which also has a surging demand for data and analytics, along with the underlying processes to make it happen.
How can my analysts discover where data is located? All of these questions describe a concept known as data governance. The Snowflake AI Data Cloud has built an entire suite of features called Horizon, which tackles all of these questions and more. We will begin with compliance.
Do we have end-to-end data pipeline control? What can we learn about our data quality issues? How can we improve and deliver trusted data to the organization? One major obstacle to data quality is data silos, as they obstruct transparency and make collaboration tough.
It helps companies streamline and automate the end-to-end ML lifecycle, which includes data collection, model creation (built on data sources from the software development lifecycle), model deployment, model orchestration, health monitoring and data governance processes.
It sits between the data lake and cloud object storage, allowing you to version and control changes to data lakes at scale. LakeFS facilitates data reproducibility, collaboration, and data governance within the data lake environment. Flyte: Flyte is a platform for orchestrating ML pipelines at scale.
For any data user in an enterprise today, data profiling is a key tool for resolving data quality issues and building new data solutions. In this blog, we’ll cover the definition of data profiling, top use cases, and share important techniques and best practices for data profiling today.
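As a taste of the technique, here is a minimal profiling pass in plain pandas; the customers table and its columns are made up for illustration:

```python
# A minimal profiling pass with plain pandas (column names are illustrative).
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize each column: type, null rate, distinct count, and an example value."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_rate": df.isna().mean().round(3),
        "distinct": df.nunique(),
        "example": df.apply(lambda col: col.dropna().iloc[0] if col.notna().any() else None),
    })

customers = pd.DataFrame({"id": [1, 2, 2], "email": ["a@x.com", None, "c@x.com"]})
print(profile(customers))
```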
DataOps then works to continuously improve and adjust data models, visualizations, reports, and dashboards to achieve business goals. DataOps fosters cross-functional collaboration and automation to build fast, trustworthy data pipelines so your business can wring the most value from your data. The Agile Connection.
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Cloud Platforms: AWS, Azure, Google Cloud, etc.
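As a small, self-contained sketch of what such work looks like (the file path, table, and column names are hypothetical), an extract-transform-load step in Python might be:

```python
# A minimal, hypothetical ETL step: extract from CSV, transform, load to SQLite.
import sqlite3
import pandas as pd

def run_etl(csv_path: str, db_path: str) -> int:
    raw = pd.read_csv(csv_path)                        # extract
    clean = raw.dropna(subset=["order_id"]).copy()     # transform: drop incomplete rows
    clean["amount"] = clean["amount"].astype(float)
    conn = sqlite3.connect(db_path)                    # load
    try:
        clean.to_sql("orders", conn, if_exists="replace", index=False)
    finally:
        conn.close()
    return len(clean)
```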
Data teams use Bigeye’s data observability platform to detect data quality issues and ensure reliable data pipelines. If there is an issue with the data or the data pipeline, the data team is immediately alerted, enabling them to proactively address it.
Semantics, context, and how data is tracked and used mean even more as you stretch to reach post-migration goals. This is why, when data moves, it’s imperative for organizations to prioritize data discovery. Data discovery is also critical for data governance, which, when ineffective, can actually hinder organizational growth.
Snowflake AI Data Cloud is one of the most powerful platforms, including storage services supporting complex data. Integrating Snowflake with dbt adds another layer of automation and control to the data pipeline. In this blog, we’ll explore: Overview of Snowflake Stored Procedures & dbt Hooks.
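For a rough idea of the stored-procedure side, here is how a procedure might be invoked from Python with the snowflake-connector-python package; the credentials and procedure name are placeholders, and a dbt post-hook could issue the same CALL statement after a model builds:

```python
# A hedged sketch of calling a Snowflake stored procedure from Python.
# Connection parameters and the procedure name are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="ANALYTICS_WH", database="ANALYTICS", schema="PUBLIC",
)
try:
    cur = conn.cursor()
    cur.execute("CALL refresh_order_aggregates()")  # hypothetical procedure
    print(cur.fetchone())
finally:
    conn.close()
```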
This is the practice of creating, updating and consistently enforcing the processes, rules and standards that prevent errors, data loss, data corruption, mishandling of sensitive or regulated data, and data breaches. Learn more about designing the right data architecture to elevate your data quality.
They created each capability as a module, which can either be used independently or together to build automated data pipelines. In essence, Alation is acting as a foundational data fabric that Gartner describes as being required for DataOps. How the IDF Supports a Smarter Data Pipeline.
Let’s demystify this using the following personas and a real-world analogy:
- Data and ML engineers (owners and producers) – They lay the groundwork by feeding data into the feature store.
- Data scientists (consumers) – They extract and utilize this data to craft their models.
Data engineers serve as architects sketching the initial blueprint.
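A toy in-memory version makes the producer/consumer split concrete; this is a hypothetical sketch, not the API of any real feature store such as Feast or SageMaker Feature Store:

```python
# A toy, in-memory feature store illustrating producer vs. consumer paths.
from __future__ import annotations

class FeatureStore:
    def __init__(self):
        self._features: dict[tuple[str, str], float] = {}

    def put(self, entity_id: str, name: str, value: float) -> None:
        """Producer path: engineers write computed features."""
        self._features[(entity_id, name)] = value

    def get(self, entity_id: str, names: list[str]) -> dict[str, float | None]:
        """Consumer path: scientists read features for model input."""
        return {n: self._features.get((entity_id, n)) for n in names}

store = FeatureStore()
store.put("user_42", "avg_order_value", 37.5)     # data engineer (producer)
row = store.get("user_42", ["avg_order_value"])   # data scientist (consumer)
```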
Risk, compliance, data privacy and escalating costs are just a few of the acute concerns that financial services companies are grappling with today. The required architecture includes a data pipeline, ML pipeline, application pipeline and a multi-stage pipeline. Let’s dive into the data management pipeline.
Practitioners and hands-on data users were thrilled to be there, and many connected as they shared their progress on their own data stack journeys. People were familiar with the value of a data catalog (and the growing need for data governance), though many admitted to being somewhat behind on their journeys.
Insurance companies often face challenges with data silos and inconsistencies among their legacy systems. To address these issues, they need a centralized and integrated data platform that serves as a single source of truth, preferably with strong data governance capabilities.
This erodes credibility and data consistency over time, leading businesses to mistrust their data pipelines and processes. Hence, a new feature allows for natively implementing data quality monitoring in the Snowflake AI Data Cloud without using any additional tools.
And because data assets within the catalog have quality scores and social recommendations, Alex has greater trust and confidence in the data she’s using for her decision-making recommendations. This is especially helpful when handling massive data volumes. Protected and compliant data.
If you are a data scientist, you may be wondering if you can transition into data engineering. The good news is that there are many skills that data scientists already have that are transferable to data engineering. In this blog post, we will discuss how you can become a data engineer if you are a data scientist.
Snowpark, an innovative technology from the Snowflake Data Cloud, promises to meet this demand by allowing data scientists to develop complex data transformation logic using familiar programming languages such as Java, Scala, and Python. Check out these blogs and reach out to our Data Science and ML team today!
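As a hedged sketch of what that looks like in Snowpark for Python (the connection settings, table, and column names are placeholders):

```python
# A minimal Snowpark for Python transformation sketch; all names are placeholders.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

session = Session.builder.configs({
    "account": "my_account", "user": "my_user", "password": "...",
    "warehouse": "ANALYTICS_WH", "database": "ANALYTICS", "schema": "PUBLIC",
}).create()

orders = session.table("ORDERS")
daily_revenue = (orders
                 .filter(col("STATUS") == "COMPLETE")        # pushed down to Snowflake
                 .group_by(col("ORDER_DATE"))
                 .agg(sum_(col("AMOUNT")).alias("REVENUE")))
daily_revenue.show()
```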
In data vault implementations, critical components encompass the storage layer, ELT technology, integration platforms, data observability tools, Business Intelligence and Analytics tools, Data Governance, and Metadata Management solutions. The most important reason for using dbt in Data Vault 2.0
Continuous testing of data definitions, values, and context of data flowing within pipelines against acceptable tolerances, policies, and thresholds can stop bad data from being used to make decisions and protect against data governance and compliance exceptions.
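In practice this can be as simple as a set of declarative policies enforced on every batch in flight; the sketch below uses plain pandas, and the policy values and field names are illustrative assumptions:

```python
# Policy-driven tolerance checks on an in-flight batch (illustrative values).
import pandas as pd

POLICIES = {
    "max_null_rate": {"field": "email", "threshold": 0.05},
    "value_range":   {"field": "age", "min": 0, "max": 120},
}

def enforce(batch: pd.DataFrame) -> None:
    """Raise to halt the pipeline before bad data reaches consumers."""
    p = POLICIES["max_null_rate"]
    if batch[p["field"]].isna().mean() > p["threshold"]:
        raise ValueError(f"{p['field']} null rate exceeds {p['threshold']:.0%}")
    r = POLICIES["value_range"]
    out_of_range = ~batch[r["field"]].between(r["min"], r["max"])
    if out_of_range.any():
        raise ValueError(f"{out_of_range.sum()} rows have {r['field']} out of range")
```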
What does a modern data architecture do for your business? Modern data architectures like Data Mesh and Data Fabric aim to easily connect new data sources and accelerate development of use-case-specific data pipelines across on-premises, hybrid and multicloud environments.