Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines: It is a Game-Changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. They transform data into a consistent format for users to consume.
Data governance challenges: Maintaining consistent data governance across different systems is crucial but complex. This tool democratizes data access across the organization, enabling even non-technical users to gain valuable insights. To power these advanced AI features, OMRON chose Amazon Bedrock.
Data engineering tools are software applications or frameworks specifically designed to facilitate the process of managing, processing, and transforming large volumes of data. Spark offers a rich set of libraries for data processing, machine learning, graph processing, and stream processing.
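To make that concrete, here is a minimal PySpark sketch combining two of those libraries: the DataFrame API for batch processing and MLlib for a simple model. The input file and column names (orders.csv, order_date, amount) are hypothetical placeholders, not taken from the article.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.appName("pipeline-example").getOrCreate()

# Batch processing with the DataFrame API (hypothetical input file)
orders = spark.read.csv("orders.csv", header=True, inferSchema=True)
daily = (
    orders.groupBy("order_date")
          .agg(F.sum("amount").alias("revenue"),
               F.count("*").alias("order_count"))
)

# A small MLlib model: predict daily revenue from order count
assembler = VectorAssembler(inputCols=["order_count"], outputCol="features")
train = assembler.transform(daily).select(
    "features", F.col("revenue").alias("label"))
model = LinearRegression().fit(train)

spark.stop()
```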
The rise of data lakes, IoT analytics, and big data pipelines has introduced a new world of fast, big data. How Data Catalogs Can Help: Data catalogs evolved as a key component of the data governance revolution by creating a bridge between the new world and the old world of data governance.
Key components include data modelling, warehousing, pipelines, and integration. Effective data governance enhances quality and security throughout the data lifecycle. What is Data Engineering? They are crucial in ensuring data is readily available for analysis and reporting. from 2025 to 2030.
Connecting directly to this semantic layer will help give customers access to critical business data in a safe, governed manner. This partnership makes data more accessible and trusted. Our continued investments in connectivity with Google technologies help ensure your data is secure, governed, and scalable.
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Data Visualization: Matplotlib, Seaborn, Tableau, etc.
Snowflake AI Data Cloud is one of the most powerful platforms, including storage services supporting complex data. Integrating Snowflake with dbt adds another layer of automation and control to the data pipeline. Snowflake stored procedures and dbt hooks are essential to modern data engineering and analytics workflows.
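As a rough illustration of where stored procedures fit, the sketch below calls a Snowflake stored procedure from Python using snowflake-connector-python. The connection parameters and the procedure name refresh_order_aggregates are placeholders; in a dbt project the same CALL would typically be attached to a model as a post-hook.

```python
import snowflake.connector

# Placeholder connection parameters, not real credentials
conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password="***",
    warehouse="TRANSFORM_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)

try:
    cur = conn.cursor()
    # Stored procedures encapsulate multi-step logic inside Snowflake itself;
    # the procedure name here is hypothetical.
    cur.execute("CALL refresh_order_aggregates()")
    print(cur.fetchone())
finally:
    conn.close()
```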
Understanding Fivetran: Fivetran is a popular Software-as-a-Service platform that enables users to automate the movement of data and ETL processes across diverse sources to a target destination. The phData team achieved a major milestone by successfully setting up a secure end-to-end data pipeline for a substantial healthcare enterprise.
In data vault implementations, critical components encompass the storage layer, ELT technology, integration platforms, data observability tools, Business Intelligence and Analytics tools, Data Governance, and Metadata Management solutions. The most important reason for using dbt in Data Vault 2.0
Who should have access to sensitive data? How can my analysts discover where data is located? All of these questions describe a concept known as data governance. The Snowflake AI Data Cloud has built an entire blanket of features called Horizon, which tackles all of these questions and more.
It sits between the data lake and cloud object storage, allowing you to version and control changes to data lakes at scale. LakeFS facilitates data reproducibility, collaboration, and data governance within the data lake environment. Flyte: Flyte is a platform for orchestrating ML pipelines at scale.
While many of our customers leverage our UI for tools like our SQL Translation or Privilege Audit tooling, there are limitations when it comes to using a UI. You wouldn’t want to pay someone (or do it yourself) to manually copy each file into a browser window and then paste the translated SQL back.
The phData Toolkit continues to have additions made to it as we work with customers to accelerate their migrations, build a data governance practice, and ensure quality data products are built. Some of the major improvements that have been made are within the data profiling and validation components of the Toolkit CLI.
Data democratization instead refers to the simplification of all processes related to data, from storage architecture to data management to data security. It also requires an organization-wide data governance approach, from adopting new types of employee training to creating new policies for data storage.
Semantics, context, and how data is tracked and used mean even more as you stretch to reach post-migration goals. This is why, when data moves, it’s imperative for organizations to prioritize data discovery. Data discovery is also critical for data governance, which, when ineffective, can actually hinder organizational growth.
It’s common to have terabytes of data in most data warehouses, so data quality monitoring is often challenging and cost-intensive due to dependencies on multiple tools, and it is eventually ignored. Over time this results in poor credibility and data consistency, leading businesses to mistrust their data pipelines and processes.
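One lightweight way to keep such monitoring from being ignored is to run a few cheap checks inside the pipeline itself. The sketch below is a minimal, hypothetical example using pandas; the file and column names are made up.

```python
import pandas as pd

# Hypothetical extract of a warehouse table
df = pd.read_parquet("orders.parquet")

checks = {
    # Every order must have an identifier, and identifiers must be unique
    "no_null_ids": df["order_id"].notna().all(),
    "unique_ids": df["order_id"].is_unique,
    # Business rule: order amounts are strictly positive
    "positive_amounts": (df["amount"] > 0).all(),
    # Freshness: the newest record is at most one day old
    "fresh_data": df["order_date"].max()
                  >= pd.Timestamp.today().normalize() - pd.Timedelta(days=1),
}

failed = [name for name, ok in checks.items() if not ok]
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")
```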
Automation: Automation plays a pivotal role in streamlining ETL processes, reducing the need for manual intervention, and ensuring consistent data availability. By automating key tasks, organisations can enhance efficiency and accuracy, ultimately improving the quality of their data pipelines.
This oftentimes leads to shadow IT processes and duplicated data pipelines. Data is siloed, and there is no singular source of truth but fragmented data spread across the organization. Establishing a data culture changes this paradigm. Promoting Data Literacy: Snowflake is an accessible platform.
Here are steps you can follow to pursue a career as a BI Developer: Acquire a solid foundation in data and analytics: Start by building a strong understanding of data concepts, relational databases, SQL (Structured Query Language), and data modeling.
With Cortex, business analysts, data engineers, and developers can easily incorporate Predictive and Generative AI into their workflows using simple SQL commands and intuitive interfaces. This not only streamlines operations but also ensures consistency and reliability across the entire data processing lifecycle.
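As a hedged sketch of what those simple SQL commands can look like, the snippet below builds two Cortex queries as Python strings. SNOWFLAKE.CORTEX.SENTIMENT and SNOWFLAKE.CORTEX.COMPLETE reflect the documented Cortex SQL interface; the table, column, and model names are assumptions for illustration.

```python
# Sentiment scoring over a hypothetical review table
sentiment_sql = """
    SELECT review_id,
           SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS sentiment_score
    FROM   product_reviews
"""

# Generative summarization of the same rows
summary_sql = """
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
               'mistral-large',
               'Summarize the main complaint in: ' || review_text
           ) AS summary
    FROM   product_reviews
    LIMIT  10
"""

# Either query can be run through any Snowflake client, for example a
# cursor from snowflake-connector-python: cur.execute(sentiment_sql)
```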
What does a modern data architecture do for your business? Modern data architectures like Data Mesh and Data Fabric aim to easily connect new data sources and accelerate development of use-case-specific data pipelines across on-premises, hybrid, and multicloud environments.
We already know that a data quality framework is basically a set of processes for validating, cleaning, transforming, and monitoring data. Data Governance: Data governance is the foundation of any data quality framework. It primarily caters to large organizations with complex data environments.
Though scripting languages such as R and Python top the list of required skills for a data analyst, Excel is still one of the most important tools in use. Because analysts are the ones most likely to communicate data insights, they’ll also need to know SQL and visualization tools such as Power BI and Tableau.
This individual is responsible for building and maintaining the infrastructure that stores and processes data; the kinds of data can be diverse, but most commonly it will be structured and unstructured data. They’ll also work with software engineers to ensure that the data infrastructure is scalable and reliable.
Databases These are structured data collections managed by Database Management Systems (DBMS). Organisations often extract data from relational databases like MySQL, Oracle, or SQL Server to facilitate analysis. Choosing the right ETL tool enhances data management efficiency and supports organisational growth.
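A minimal extraction sketch along those lines, assuming a MySQL source reachable through SQLAlchemy; the connection string, table, and column names are placeholders rather than anything from the article.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical MySQL connection (requires the pymysql driver)
engine = create_engine("mysql+pymysql://etl_user:***@db-host:3306/sales")

# Pull only the columns the downstream model needs, filtered to one day
query = """
    SELECT order_id, customer_id, amount, created_at
    FROM   orders
    WHERE  created_at >= CURDATE() - INTERVAL 1 DAY
"""
orders = pd.read_sql(query, engine)

# Land the extract as a columnar file for the next pipeline stage
orders.to_parquet("orders_extract.parquet", index=False)
```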
An increasing number of GenAI tools use large language models that automate key data engineering, governance, and master data management tasks. These tools can generate automated outputs including SQL and Python code, synthetic datasets, data visualizations, and predictions – significantly streamlining your data pipeline.
When the data or pipeline configuration needs to be changed, tools like Fivetran and dbt reduce the time required to make the change, and increase the confidence your team can have around the change. These allow you to scale your pipelines quickly. Governance doesn’t have to be scary or preventative to your cloud data warehouse.
Thus, the solution allows for scaling data workloads independently from one another and seamlessly handling data warehousing, data lakes, data sharing, and engineering. Data warehousing is a vital constituent of any business intelligence operation. Simplify and Win: Experienced data engineers value simplicity.
The shared-nothing architecture ensures that users don’t have to worry about distributing data across multiple cluster nodes. Snowflake hides user data objects and makes them accessible only through SQL queries executed by the compute layer. This includes tasks such as data cleansing, enrichment, and aggregation.
In a sea of questionable data, how do you know what to trust? Data quality tells you the answer. It signals what data is trustworthy, reliable, and safe to use. It empowers engineers to oversee data pipelines that deliver trusted data to the wider organization. Read the blog, Alation 2022.2:
Support for Numerous Data Sources: Fivetran supports over 200 data sources, including popular databases, applications, and cloud platforms like Salesforce, Google Analytics, SQL Server, Snowflake, and many more. Additionally, unsupported data sources can be integrated using Fivetran’s cloud function connectors.
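To give a feel for the cloud function route, here is a hedged sketch of an AWS Lambda-style function connector. The state/insert/schema/hasMore response shape follows Fivetran's function-connector contract as commonly described; the polled API and the "tickets" table name are hypothetical.

```python
import json
import urllib.request

def handler(event, context):
    # Fivetran passes the previously returned state back on each sync
    state = event.get("state", {}) or {}
    cursor = state.get("cursor", "1970-01-01T00:00:00Z")

    # Hypothetical source API returning records changed since the cursor
    url = f"https://api.example.com/tickets?since={cursor}"
    with urllib.request.urlopen(url) as resp:
        rows = json.loads(resp.read())

    new_cursor = max((r["updated_at"] for r in rows), default=cursor)
    return {
        "state": {"cursor": new_cursor},              # checkpoint for next run
        "insert": {"tickets": rows},                  # rows to upsert per table
        "schema": {"tickets": {"primary_key": ["id"]}},
        "hasMore": False,                             # no further pages this sync
    }
```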
In today’s fast-paced world of data science, building impactful machine learning models relies on much more than selecting the best algorithm for the job. Data scientists and machine learning engineers need to collaborate to make sure that together with the model, they develop robust data pipelines.
A legacy data stack usually refers to the traditional relational database management system (RDBMS), which uses a structured query language (SQL) to store and process data. While an RDBMS can still be used in a modern data stack, it is not as common because it is not as well-suited for managing big data.
Here’s the structured equivalent of this same data in tabular form: With structured data, you can use query languages like SQL to extract and interpret information. In contrast, such traditional query languages struggle to interpret unstructured data. It also aids in identifying the source of any data quality issues.
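A small self-contained example of that contrast, using Python's standard-library sqlite3 module with made-up rows: once the data is tabular, an aggregate question becomes a one-line SQL query, whereas free-text documents or images offer no equivalent query surface.

```python
import sqlite3

# Build a tiny in-memory table of structured data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EMEA", 120.0), (2, "EMEA", 80.0), (3, "APAC", 200.0)],
)

# Declarative SQL over structured data: total revenue per region
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
):
    print(region, total)
```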
To establish trust between the data producers and data consumers, SageMaker Catalog also integrates data quality metrics and data lineage events to track and drive transparency in data pipelines. Data analysts discover the data and subscribe to it.
Data engineering is a fascinating and fulfilling career – you are at the helm of every business operation that requires data, and as long as users generate data, businesses will always need data engineers. In other words, job security is guaranteed. The journey to becoming a successful data engineer […].
What are Orchestration Tools? Data pipeline orchestration tools are designed to automate and manage the execution of data pipelines. These tools help streamline and schedule data movement and processing tasks, ensuring efficient and reliable data flow.
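As one concrete and widely used example of such a tool, here is a minimal Apache Airflow DAG sketch; the extract/transform/load callables are placeholders, and the schedule argument assumes Airflow 2.4 or newer.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task logic; in a real pipeline these would do actual work
def extract():   print("pull data from the source system")
def transform(): print("clean and reshape the data")
def load():      print("write the data to the warehouse")

with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # run once per day
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Dependencies: extract -> transform -> load
    t_extract >> t_transform >> t_load
```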
Some modern CDPs are starting to incorporate these concepts, allowing for more flexible and evolving customer data models. It also requires a shift in how we query our customer data. Instead of simple SQL queries, we often need to use more complex temporal query languages or rely on derived views for simpler querying.
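The derived-view idea can be sketched in a few lines of pandas: collapse the raw event history into a current-state view per customer, and take an as-of snapshot for temporal questions. The column names here are hypothetical.

```python
import pandas as pd

# Hypothetical event history: one row per change to a customer's segment
events = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "updated_at": pd.to_datetime(
        ["2024-01-01", "2024-03-01", "2024-02-01", "2024-04-01"]),
    "segment": ["trial", "paid", "trial", "churned"],
})

# Derived view: the latest known segment for each customer
current = (events.sort_values("updated_at")
                 .groupby("customer_id", as_index=False)
                 .last())
print(current)

# Point-in-time ("as of") view for temporal questions
as_of = pd.Timestamp("2024-02-15")
snapshot = (events[events["updated_at"] <= as_of]
            .sort_values("updated_at")
            .groupby("customer_id", as_index=False)
            .last())
print(snapshot)
```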