Cloud Data, Data Engineering and Data Warehouse

Cloud Data

Data Engineering

Data Warehouse

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

AWS Machine Learning Blog

OCTOBER 24, 2024

Conventional ML development cycles take weeks to many months and requires sparse data science understanding and ML development skills. Business analysts’ ideas to use ML models often sit in prolonged backlogs because of data engineering and data science team’s bandwidth and data preparation activities.

Data Warehouse

Data Warehouse Machine Learning Machine Learning Cloud Data

How a Delta Lake is Process with Azure Synapse Analytics

Analytics Vidhya

JULY 29, 2022

This article was published as a part of the Data Science Blogathon. Introduction We are all pretty much familiar with the common modern cloud data warehouse model, which essentially provides a platform comprising a data lake (based on a cloud storage account such as Azure Data Lake Storage Gen2) AND a data warehouse compute engine […].

Azure

Azure Data Warehouse Data Lakes Analytics

Join 17,000+

professionals

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

Data Science Blog

SEPTEMBER 19, 2023

In the contemporary age of Big Data, Data Warehouse Systems and Data Science Analytics Infrastructures have become an essential component for organizations to store, analyze, and make data-driven decisions. So why using IaC for Cloud Data Infrastructures?

Data Warehouse

Data Warehouse Azure SQL Database

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Advantages of Using Cloud Data Platform Snowflake

Analytics Vidhya

NOVEMBER 19, 2022

Today, data controls a significant portion of our lives as consumers due to advancements in wireless connectivity, processing power, and […]. The post Advantages of Using Cloud Data Platform Snowflake appeared first on Analytics Vidhya.

Cloud Data

Cloud Data Data Science Analytics Analytics

Exploring the Power of Microsoft Fabric: A Hands-On Guide with a Sales Use Case

Data Science Dojo

SEPTEMBER 11, 2024

These experiences facilitate professionals from ingesting data from different sources into a unified environment and pipelining the ingestion, transformation, and processing of data to developing predictive models and analyzing the data by visualization in interactive BI reports. In the menu bar on the left, select Workspaces.

Power BI

Power BI Data Pipeline Data Warehouse Data Engineer

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Flipboard

DECEMBER 18, 2023

Amazon Redshift powers data-driven decisions for tens of thousands of customers every day with a fully managed, AI-powered cloud data warehouse, delivering the best price-performance for your analytics workloads.

AWS

AWS Data Warehouse ETL SQL

On-Prem vs. The Cloud: Key Considerations

phData

FEBRUARY 21, 2025

In this post, we will be particularly interested in the impact that cloud computing left on the modern data warehouse. We will explore the different options for data warehousing and how you can leverage this information to make the right decisions for your organization. Understanding the Basics What is a Data Warehouse?

Data Warehouse

Data Warehouse Cloud Data ETL Cloud Computing

Data Science Career Paths: Analyst, Scientist, Engineer – What’s Right for You?

How to Learn Machine Learning

APRIL 26, 2025

The field of data science is now one of the most preferred and lucrative career options available in the area of data because of the increasing dependence on data for decision-making in businesses, which makes the demand for data science hires peak. Their insights must be in line with real-world goals.

Data Science

Data Science Data Analyst Data Scientist Machine Learning

What is the Snowflake Data Cloud and How Much Does it Cost?

phData

NOVEMBER 9, 2023

Snowflake’s Data Cloud has emerged as a leader in cloud data warehousing. As a fundamental piece of the modern data stack , Snowflake is helping thousands of businesses store, transform, and derive insights from their data easier, faster, and more efficiently than ever before.

Data Warehouse

Data Warehouse Data Lakes Clustering Cloud Data

What Is Fivetran and How Much Does It Cost?

phData

MARCH 8, 2023

Fivetran is an automated data integration platform that offers a convenient solution for businesses to consolidate and sync data from disparate data sources. With over 160 data connectors available, Fivetran makes it easy to move data out of, into, and across any cloud data platform in the market.

Data Warehouse

Data Warehouse Data Engineering Data Engineering Data Engineering

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

When needed, the system can access an ODAP data warehouse to retrieve additional information. About the Authors Emrah Kaya is Data Engineering Manager at Omron Europe and Platform Lead for ODAP Project. Xinyi Zhou is a Data Engineer at Omron Europe, bringing her expertise to the ODAP team led by Emrah Kaya.

AWS

AWS Data Governance Data Silos SQL

11 Open-Source Data Engineering Tools Every Pro Should Use

ODSC - Open Data Science

FEBRUARY 6, 2024

Data engineering has become an integral part of the modern tech landscape, driving advancements and efficiencies across industries. So let’s explore the world of open-source tools for data engineers, shedding light on how these resources are shaping the future of data handling, processing, and visualization.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Why Open Table Format Architecture is Essential for Modern Data Systems

phData

NOVEMBER 8, 2024

Data Versioning and Time Travel Open Table Formats empower users with time travel capabilities, allowing them to access previous dataset versions. Versioning also ensures a safer experimentation environment, where data scientists can test new models or hypotheses on historical data snapshots without impacting live data.

Data Lakes

Data Lakes Data Warehouse Database Azure

Where Does Fivetran Fit into The Modern Data Stack?

phData

JULY 17, 2023

Over the past few decades, the corporate data landscape has changed significantly. The shift from on-premise databases and spreadsheets to the modern era of cloud data warehouses and AI/ LLMs has transformed what businesses can do with data. What is the Modern Data Stack? Data modeling, data cleanup, etc.

Data Warehouse

Data Warehouse Data Pipeline Cloud Data ETL

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

Smart Data Collective

OCTOBER 17, 2022

Engineering teams, in particular, can quickly get overwhelmed by the abundance of information pertaining to competition data, new product and service releases, market developments, and industry trends, resulting in information anxiety. Explosive data growth can be too much to handle. Can’t get to the data.

Big Data

Big Data Big Data Data Engineer Data Engineering

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is known to have benefits in handling data due to its robustness, speed, and scalability. Data ingestion/integration services. Data orchestration tools.

Data Warehouse

Data Warehouse ETL Tableau Cloud Data

How to Connect Snowflake to Python

phData

JANUARY 5, 2023

Python is the top programming language used by data engineers in almost every industry. Python has proven proficient in setting up pipelines, maintaining data flows, and transforming data with its simple syntax and proficiency in automation. Truly a must-have tool in your data engineering arsenal!

Python

Python Data Engineer Data Engineering Data Engineering

List of ETL Tools: Explore the Top ETL Tools for 2025

Pickl AI

APRIL 9, 2025

By 2025, global data volumes are expected to reach 181 zettabytes, according to IDC. To harness this data effectively, businesses rely on ETL (Extract, Transform, Load) tools to extract, transform, and load data into centralized systems like data warehouses.

ETL

ETL Data Warehouse AWS Business Intelligence

Discover the Snowflake Architecture With All its Pros and Cons- NIX United

Mlearning.ai

FEBRUARY 16, 2023

The demand for information repositories enabling business intelligence and analytics is growing exponentially, giving birth to cloud solutions. The ultimate need for vast storage spaces manifests in data warehouses: specialized systems that aggregate data coming from numerous sources for centralized management and consistency.

Data Warehouse

Data Warehouse Business Intelligence Business Intelligence Database

Best Practices When Developing Matillion Jobs

phData

SEPTEMBER 2, 2024

Best practices are a pivotal part of any software development, and data engineering is no exception. This ensures the data pipelines we create are robust, durable, and secure, providing the desired data to the organization effectively and consistently. Below are the best practices. What are Matillion's limitations?

ETL

ETL Data Warehouse SQL Database

How to Maximize Fan Engagement with Fan 360

phData

FEBRUARY 15, 2024

With the advent of cloud data warehouses and the ability to (seemingly) infinitely scale analytics on an organization’s data, centralizing and using that data to discover what drives customer engagement has become a top priority for executives across all industries and verticals.

Data Warehouse

Data Warehouse Cloud Data Data Silos Analytics

What Are The Best Third-Party Data Ingestion Tools For Snowflake?

phData

FEBRUARY 14, 2023

Data integration is essentially the Extract and Load portion of the Extract, Load, and Transform (ELT) process. Data ingestion involves connecting your data sources, including databases, flat files, streaming data, etc, to your data warehouse. Snowflake provides native ways for data ingestion.

Data Warehouse

Data Warehouse Azure AWS Database

The Ultimate Modern Data Stack Migration Guide

phData

JULY 18, 2023

With the birth of cloud data warehouses, data applications, and generative AI , processing large volumes of data faster and cheaper is more approachable and desired than ever. First up, let’s dive into the foundation of every Modern Data Stack, a cloud-based data warehouse.

Data Warehouse

Data Warehouse Analytics Analytics Cloud Data

Fabrics, Meshes & Stacks, oh my! Q&A with Sanjeev Mohan

Alation

AUGUST 11, 2022

The data warehouse and analytical data stores moved to the cloud and disaggregated into the data mesh. Today, the brightest minds in our industry are targeting the massive proliferation of data volumes and the accompanying but hard-to-find value locked within all that data.

Data Warehouse

Data Warehouse Data Engineer Data Engineering Data Engineering

Fivetran Supports the Automation of the Modern Data Lake on Amazon S3

phData

APRIL 4, 2023

Customers who don’t necessarily want to put their data directly into a data warehouse like the Snowflake Data Cloud can now use Fivetran to build a performant, governed, managed dataset on top of S3 which can still be efficiently queried and manipulated from within their query engine of choice.

Data Lakes

Data Lakes Data Warehouse Cloud Data Data Science

Alation 2022.1: Customize Your Data Catalog

Alation

MARCH 1, 2022

Through Impact Analysis, users can determine if a problem occurred with data upstream, and locate the impacted data downstream. With robust data lineage, data engineers can find and fix issues fast and prevent them from recurring. Similarly, analysts gain a clear view of how data is created.

Data Warehouse

Data Warehouse Data Lakes Cloud Data Database

Getting Started With Matillion Data Productivity Cloud

phData

NOVEMBER 28, 2023

Matillion is also built for scalability and future data demands, with support for cloud data platforms such as Snowflake Data Cloud , Databricks, Amazon Redshift, Microsoft Azure Synapse, and Google BigQuery, making it future-ready, everyone-ready, and AI-ready. That process will not take longer than 3 minutes!

Data Warehouse

Data Warehouse Data Pipeline ETL Azure

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

Within watsonx.ai, users can take advantage of open-source frameworks like PyTorch, TensorFlow and scikit-learn alongside IBM’s entire machine learning and data science toolkit and its ecosystem tools for code-based and visual data science capabilities. Savings may vary depending on configurations, workloads and vendor.

AI AI Machine Learning Machine Learning

Alation and dbt Unlock Metadata and Increase Modern Data Stack Visibility

Alation

OCTOBER 18, 2022

Data analysts and engineers use dbt to transform, test, and document data in the cloud data warehouse. Making this data visible in the data catalog will let data teams share their work, support re-use, and empower everyone to better understand and trust data.

Data Analyst

Data Analyst Data Engineer Data Engineering Data Engineering

What is a Customer Data Platform (CDP)?

phData

MARCH 11, 2024

For years, marketing teams across industries have turned to implementing traditional Customer Data Platforms (CDPs) as separate systems purpose-built to unlock growth with first-party data. For behavioral data , Hightouch offers an event tracking SDK to deploy an SDK across your web, server, and mobile apps.

Data Warehouse

Data Warehouse Cloud Data Data Models Data Modeling

Retail & CPG Questions phData Can Answer with Data

phData

JUNE 26, 2024

Cleaning and preparing the data Raw data typically shouldn’t be used in machine learning models as it’ll throw off the prediction. Data engineers can prepare the data by removing duplicates, dealing with outliers, standardizing data types and precision between data sets, and joining data sets together.

Machine Learning

Machine Learning Machine Learning Data Engineer Data Engineering

Why Migrate From Netezza to Snowflake?

phData

JANUARY 4, 2023

Why You Should Consider Migrating from Netezza to Snowflake With Netezza, IBM was one of the first companies to provide a data warehousing solution to allow organizations to analyze and manage large amounts of data. However, as technology has evolved, the need for more advanced, agile data warehousing solutions has become apparent.

Data Warehouse

Data Warehouse SQL Database ETL

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

This article explores the importance of ETL pipelines in machine learning, a hands-on example of building ETL pipelines with a popular tool, and suggests the best ways for data engineers to enhance and sustain their pipelines. It comprises three main areas: Landing area, Staging area, and Data Warehouse area.

ETL

ETL Data Pipeline ML ML

How Fifth Third Bank Democratizes Data Access via a Data Mesh with Alation and Snowflake

Alation

JUNE 7, 2022

. “We’re never going to be able to hire enough data engineers, data scientists, and cloud architects to support the growth that we want to achieve. Transparency into data lineage and the data supply chain empowers people to know when data was first created, and who to ask for answers.

Data Governance

Data Governance Data Scientist Data Engineer Data Engineering

Top 5 Fivetran Connectors For Financial Services

phData

JANUARY 24, 2024

Understanding Fivetran Fivetran is a user-friendly, code-free platform enabling customers to easily synchronize their data by automating extraction, transformation, and loading from many sources. Fivetran automates the time-consuming steps of the ELT process so your data engineers can focus on more impactful projects.

Data Warehouse

Data Warehouse Data Pipeline Data Governance Cloud Data

Top 5 Use Cases of phData’s Advisor Tool

phData

MARCH 29, 2024

Founded in 2014 by three leading cloud engineers, phData focuses on solving real-world data engineering, operations, and advanced analytics problems with the best cloud platforms and products. Over the years, one of our primary focuses became Snowflake and migrating customers to this leading cloud data platform.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Alation 2022.3: Alation Anywhere Connecting the Modern Data Stack

Alation

AUGUST 30, 2022

These range from data sources , including SaaS applications like Salesforce; ELT like Fivetran; cloud data warehouses like Snowflake; and data science and BI tools like Tableau. This expansive map of tools constitutes today’s modern data stack. But different users have different needs.

Data Governance

Data Governance Tableau Data Quality Data Analyst

Top 10 Reasons for Alation with Snowflake: Reduce Risk with Active Data Governance

Alation

SEPTEMBER 7, 2021

With Snowflake, data stewards have a choice to leverage Snowflake’s governance policies. First, stewards are dependent on data warehouse admins to provide information and to create and edit enforcement policies in Snowflake. Alation’s data lineage helps organizations to secure their data in the Snowflake Data Cloud.

Data Governance

Data Governance Data Scientist Data Quality Data Profiling

How to Set up a CICD Pipeline for Snowflake to Automate Data Pipelines

phData

JUNE 14, 2023

In recent years, data engineering teams working with the Snowflake Data Cloud platform have embraced the continuous integration/continuous delivery (CI/CD) software development process to develop data products and manage ETL/ELT workloads more efficiently. What Are the Benefits of CI/CD Pipeline For Snowflake?

Data Pipeline

Data Pipeline Database SQL Data Engineer

How to Maximize Time to Value with Fivetran and dbt

phData

OCTOBER 17, 2023

In our previous blog , we discussed how Fivetran and dbt scale for any data volume and workload, both small and large. Now, you might be wondering what these tools can do for your data team and the efficiency of your organization as a whole. Can these tools help reduce the time our data engineers spend fixing things?

ETL

ETL Data Pipeline Data Engineering Data Engineering

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

Alation

OCTOBER 27, 2022

Few actors in the modern data stack have inspired the enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test and document data in the cloud data warehouse. But what does this mean from a practitioner perspective?

Data Analyst

Data Analyst Data Scientist Analytics Analytics

How You Can Create and Share Your Own Matillion Shared Job

phData

SEPTEMBER 27, 2024

Modern business operations rely heavily on data engineering and transformation processes to turn raw data into valuable insights. Matillion, a robust ELT (Extract, Load, Transform) platform, simplifies data integration and transformation complexities with a no-code or high-code experience. What is a Matillion Job?

Data Engineering

Data Engineering Data Engineer Data Engineering Data Engineering

How to Build a Data Mesh in Snowflake

phData

SEPTEMBER 20, 2023

A data mesh is a conceptual architectural approach for managing data in large organizations. Traditional data management approaches often involve centralizing data in a data warehouse or data lake, leading to challenges like data silos, data ownership issues, and data access and processing bottlenecks.

Data Silos

Data Silos Database Data Quality Data Engineering

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

How a Delta Lake is Process with Azure Synapse Analytics

Webinars

Trending Sources

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

Webinars

Advantages of Using Cloud Data Platform Snowflake

Exploring the Power of Microsoft Fabric: A Hands-On Guide with a Sales Use Case

AWS re:Invent 2023 Amazon Redshift Sessions Recap

On-Prem vs. The Cloud: Key Considerations

Data Science Career Paths: Analyst, Scientist, Engineer – What’s Right for You?

What is the Snowflake Data Cloud and How Much Does it Cost?

What Is Fivetran and How Much Does It Cost?

Shaping the future: OMRON’s data-driven journey with AWS

11 Open-Source Data Engineering Tools Every Pro Should Use

Why Open Table Format Architecture is Essential for Modern Data Systems

Where Does Fivetran Fit into The Modern Data Stack?

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

Top 6 Snowflake Interview Questions

The Modern Data Stack Explained: What The Future Holds

How to Connect Snowflake to Python

List of ETL Tools: Explore the Top ETL Tools for 2025

Discover the Snowflake Architecture With All its Pros and Cons- NIX United

Best Practices When Developing Matillion Jobs

How to Maximize Fan Engagement with Fan 360

What Are The Best Third-Party Data Ingestion Tools For Snowflake?

The Ultimate Modern Data Stack Migration Guide

Fabrics, Meshes & Stacks, oh my! Q&A with Sanjeev Mohan

Fivetran Supports the Automation of the Modern Data Lake on Amazon S3

Alation 2022.1: Customize Your Data Catalog

Getting Started With Matillion Data Productivity Cloud

Exploring the AI and data capabilities of watsonx

Alation and dbt Unlock Metadata and Increase Modern Data Stack Visibility

What is a Customer Data Platform (CDP)?

Retail & CPG Questions phData Can Answer with Data

Why Migrate From Netezza to Snowflake?

How to Build ETL Data Pipeline in ML

How Fifth Third Bank Democratizes Data Access via a Data Mesh with Alation and Snowflake

Top 5 Fivetran Connectors For Financial Services

Top 5 Use Cases of phData’s Advisor Tool

Alation 2022.3: Alation Anywhere Connecting the Modern Data Stack

Top 10 Reasons for Alation with Snowflake: Reduce Risk with Active Data Governance

How to Set up a CICD Pipeline for Snowflake to Automate Data Pipelines

How to Maximize Time to Value with Fivetran and dbt

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

How You Can Create and Share Your Own Matillion Shared Job

How to Build a Data Mesh in Snowflake

Stay Connected