Cloud Data, Data Pipeline and Data Warehouse

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

AWS Machine Learning Blog

OCTOBER 24, 2024

Built into Data Wrangler, is the Chat for data prep option, which allows you to use natural language to explore, visualize, and transform your data in a conversational interface. Amazon QuickSight powers data-driven organizations with unified (BI) at hyperscale. A provisioned or serverless Amazon Redshift data warehouse.

Data Warehouse

Data Warehouse Machine Learning Machine Learning Cloud Data

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Flipboard

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.

ETL

ETL Data Warehouse Analytics Analytics

Exploring the Power of Microsoft Fabric: A Hands-On Guide with a Sales Use Case

Data Science Dojo

SEPTEMBER 11, 2024

These experiences facilitate professionals from ingesting data from different sources into a unified environment and pipelining the ingestion, transformation, and processing of data to developing predictive models and analyzing the data by visualization in interactive BI reports.

Power BI

Power BI Data Pipeline Data Warehouse Data Engineering

Webinars

How to Achieve High-Accuracy Results When Using LLMs

MORE WEBINARS

On-Prem vs. The Cloud: Key Considerations

phData

FEBRUARY 21, 2025

In this post, we will be particularly interested in the impact that cloud computing left on the modern data warehouse. We will explore the different options for data warehousing and how you can leverage this information to make the right decisions for your organization. Understanding the Basics What is a Data Warehouse?

Data Warehouse

Data Warehouse Cloud Data ETL Cloud Computing

Top 5 Tools for Building an Interactive Analytics App

Smart Data Collective

OCTOBER 27, 2021

Snowflake provides the right balance between the cloud and data warehousing, especially when data warehouses like Teradata and Oracle are becoming too expensive for their users. It is also easy to get started with Snowflake as the typical complexity of data warehouses like Teradata and Oracle are hidden from the users. .

Analytics

Analytics Analytics Data Warehouse Business Intelligence

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

We also discuss different types of ETL pipelines for ML use cases and provide real-world examples of their use to help data engineers choose the right one. What is an ETL data pipeline in ML? Xoriant It is common to use ETL data pipeline and data pipeline interchangeably.

ETL

ETL Data Pipeline ML ML

How Databricks and Tableau customers are fueling innovation with data lakehouse architecture

Tableau

JUNE 8, 2021

Domain experts, for example, feel they are still overly reliant on core IT to access the data assets they need to make effective business decisions. In all of these conversations there is a sense of inertia: Data warehouses and data lakes feel cumbersome and data pipelines just aren't agile enough.

Tableau

Tableau Data Lakes Data Warehouse SQL

Where Does Fivetran Fit into The Modern Data Stack?

phData

JULY 17, 2023

Over the past few decades, the corporate data landscape has changed significantly. The shift from on-premise databases and spreadsheets to the modern era of cloud data warehouses and AI/ LLMs has transformed what businesses can do with data. This is where Fivetran and the Modern Data Stack come in.

Data Warehouse

Data Warehouse Data Pipeline Cloud Data ETL

What Is Fivetran and How Much Does It Cost?

phData

MARCH 8, 2023

Fivetran is an automated data integration platform that offers a convenient solution for businesses to consolidate and sync data from disparate data sources. With over 160 data connectors available, Fivetran makes it easy to move data out of, into, and across any cloud data platform in the market.

Data Warehouse

Data Warehouse Data Engineering Data Engineering Data Engineering

How to Set up a CICD Pipeline for Snowflake to Automate Data Pipelines

phData

JUNE 14, 2023

which play a crucial role in building end-to-end data pipelines, to be included in your CI/CD pipelines. End-To-End Data Pipeline Use Case & Flyway Configuration Let’s consider a scenario where you have the requirement to ingest and process inventory data on an hourly basis.

Data Pipeline

Data Pipeline Database SQL Data Engineering

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is known to have benefits in handling data due to its robustness, speed, and scalability. Data ingestion/integration services. Data orchestration tools.

Data Warehouse

Data Warehouse ETL Tableau Cloud Data

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

Smart Data Collective

OCTOBER 17, 2022

If you haven’t already, moving to the cloud can be a realistic alternative. Cloud data warehouses provide various advantages, including the ability to be more scalable and elastic than conventional warehouses. Can’t get to the data. Data pipeline maintenance.

Big Data

Big Data Big Data Data Engineering Data Engineering

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Flipboard

AUGUST 17, 2023

Amazon Redshift is the most popular cloud data warehouse that is used by tens of thousands of customers to analyze exabytes of data every day. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, ML, and application development.

ML

ML ML AWS Data Warehouse

Top 5 Fivetran Connectors for Healthcare

phData

APRIL 29, 2024

Fivetran enables healthcare organizations to ingest data securely and effectively from a variety of sources into their target destinations, such as Snowflake or other cloud data platforms, for further analytics or curation for sharing data with external providers or customers.

SQL

SQL Data Warehouse Azure Cloud Data

Getting Started With Matillion Data Productivity Cloud

phData

NOVEMBER 28, 2023

In July 2023, Matillion launched their fully SaaS platform called Data Productivity Cloud, aiming to create a future-ready, everyone-ready, and AI-ready environment that companies can easily adopt and start automating their data pipelines coding, low-coding, or even no-coding at all. Or would you even go to that directly?

Data Warehouse

Data Warehouse Data Pipeline ETL Azure

Best Practices When Developing Matillion Jobs

phData

SEPTEMBER 2, 2024

Best practices are a pivotal part of any software development, and data engineering is no exception. This ensures the data pipelines we create are robust, durable, and secure, providing the desired data to the organization effectively and consistently. Below are the best practices.

ETL

ETL Data Warehouse SQL Database

How Databricks and Tableau customers are fueling innovation with data lakehouse architecture

Tableau

JUNE 8, 2021

Domain experts, for example, feel they are still overly reliant on core IT to access the data assets they need to make effective business decisions. In all of these conversations there is a sense of inertia: Data warehouses and data lakes feel cumbersome and data pipelines just aren't agile enough.

Tableau

Tableau Data Lakes Data Warehouse SQL

Fivetran Modern Data Stack Conference 2023: Key Takeaways

Alation

APRIL 14, 2023

Last week, the Alation team had the privilege of joining IT professionals, business leaders, and data analysts and scientists for the Modern Data Stack Conference in San Francisco. In “The modern data stack is dead, long live the modern data stack!” Cloud costs are growing prohibitive.

Data Pipeline

Data Pipeline Data Warehouse Cloud Data ETL

Discover the Snowflake Architecture With All its Pros and Cons- NIX United

Mlearning.ai

FEBRUARY 16, 2023

The demand for information repositories enabling business intelligence and analytics is growing exponentially, giving birth to cloud solutions. The ultimate need for vast storage spaces manifests in data warehouses: specialized systems that aggregate data coming from numerous sources for centralized management and consistency.

Data Warehouse

Data Warehouse Business Intelligence Business Intelligence Database

What Are The Best Third-Party Data Ingestion Tools For Snowflake?

phData

FEBRUARY 14, 2023

Data integration is essentially the Extract and Load portion of the Extract, Load, and Transform (ELT) process. Data ingestion involves connecting your data sources, including databases, flat files, streaming data, etc, to your data warehouse. Snowflake provides native ways for data ingestion.

Data Warehouse

Data Warehouse Azure AWS Database

How Fivetran + dbt provides Enterprise Scale to ELT Pipelines

phData

OCTOBER 12, 2023

When the data or pipeline configuration needs to be changed, tools like Fivetran and dbt reduce the time required to make the change, and increase the confidence your team can have around the change. These allow you to scale your pipelines quickly. Governance doesn’t have to be scary or preventative to your cloud data warehouse.

Data Warehouse

Data Warehouse Database Cloud Data Data Pipeline

11 Open-Source Data Engineering Tools Every Pro Should Use

ODSC - Open Data Science

FEBRUARY 6, 2024

This open-source streaming platform enables the handling of high-throughput data feeds, ensuring that data pipelines are efficient, reliable, and capable of handling massive volumes of data in real-time. Its open-source nature means it’s continually evolving, thanks to contributions from its user community.

Data Engineer

Data Engineer Data Engineering Data Engineering Data Engineering

How to Connect Snowflake to Python

phData

JANUARY 5, 2023

Python has proven proficient in setting up pipelines, maintaining data flows, and transforming data with its simple syntax and proficiency in automation. Having been built completely for and in the cloud, the Snowflake Data Cloud has become an industry leader in cloud data platforms.

Python

Python Data Engineering Data Engineering Data Engineer

What is Salesforce Data Cloud for Tableau?

Tableau

DECEMBER 7, 2022

But good data—and actionable insights—are hard to get. Traditionally, organizations built complex data pipelines to replicate data. Those data architectures were brittle, complex, and time intensive to build and maintain, requiring data duplication and bloated data warehouse investments.

Tableau

Tableau Data Warehouse Data Pipeline Data Visualization

The Data Integration Solution Checklist: Top 10 Considerations

Precisely

MAY 13, 2024

As enterprise technology landscapes grow more complex, the role of data integration is more critical than ever before. Wide support for enterprise-grade sources and targets Large organizations with complex IT landscapes must have the capability to easily connect to a wide variety of data sources.

Data Governance

Data Governance Data Pipeline Cloud Data Data Quality

Using Matillion Data Productivity Cloud to call APIs

phData

JANUARY 19, 2024

Matillion’s Data Productivity Cloud is a versatile platform designed to increase the productivity of data teams. It provides a unified platform for creating and managing data pipelines that are effective for both coders and non-coders. Additional setup is typically optional.

Data Pipeline

Data Pipeline Data Warehouse ETL Azure

Orchestrating Coalesce Pipelines Using Fivetran

phData

AUGUST 14, 2024

It provides businesses with an efficient way to move and centralize data from all their sources. Boasting over 500 pre-built data connectors, Fivetran simplifies transferring data to, from, and within any cloud data platform available today. What Are the Benefits of Fivetran’s Coalesce Orchestration Integration?

Data Warehouse

Data Warehouse Data Pipeline SQL Cloud Data

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

It is supported by querying, governance, and open data formats to access and share data across the hybrid cloud. Through workload optimization across multiple query engines and storage tiers, organizations can reduce data warehouse costs by up to 50 percent.

AI

AI AI Machine Learning Machine Learning

How Does Fivetran Drive Business Value?

phData

APRIL 23, 2024

Designing New Data Pipelines Takes a Considerable Amount of Time and Knowledge Designing new ingestion pipelines is a complex undertaking that demands significant time and expertise. Engineering teams must maintain a complex web of ingestion pipelines capable of supporting many different sources, each with its own intricacies.

Data Governance

Data Governance Data Pipeline Data Warehouse Cloud Data

Why a Streaming-First Approach to Digital Modernization Matters

Precisely

APRIL 3, 2023

It simply wasn’t practical to adopt an approach in which all of an organization’s data would be made available in one central location, for all-purpose business analytics. To speed analytics, data scientists implemented pre-processing functions to aggregate, sort, and manage the most important elements of the data.

ETL

ETL Analytics Analytics Database

Top 5 Fivetran Connectors For Financial Services

phData

JANUARY 24, 2024

Fivetran includes features like data movement, transformations, robust security, and compatibility with third-party tools like DBT, Airflow, Atlan, and more. Its seamless integration with popular cloud data warehouses like Snowflake can provide the scalability needed as your business grows.

Data Warehouse

Data Warehouse Data Pipeline Data Governance Cloud Data

Using Fivetran’s New Hybrid Architecture to Replicate Data In Your Cloud Environment

phData

SEPTEMBER 18, 2024

As data and AI continue to dominate today’s marketplace, the ability to securely and accurately process and centralize that data is crucial to an organization’s long-term success. Fivetran’s Hybrid Architecture allows an organization to maintain ownership and control of its data through the entire data pipeline.

Data Warehouse

Data Warehouse System Architecture Data Pipeline Cloud Data

Retail & CPG Questions phData Can Answer with Data

phData

JUNE 26, 2024

Cleaning and preparing the data Raw data typically shouldn’t be used in machine learning models as it’ll throw off the prediction. This can be achieved by, you guessed it, analyzing the data. phData Retail Case Study phData helps many retail businesses answer these questions and more by utilizing their data to the fullest.

Machine Learning

Machine Learning Machine Learning Data Engineering Data Engineering

Choosing the Right ETL Platform: Benefits for Data Integration

Pickl AI

OCTOBER 15, 2024

ETL (Extract, Transform, Load) is a core process in data integration that involves extracting data from various sources, transforming it into a usable format, and loading it into a target system, such as a data warehouse. It supports both batch and real-time data processing , making it highly versatile.

ETL

ETL Azure AWS Data Governance

The Cloud Connection: How Governance Supports Security

Alation

APRIL 14, 2022

This two-part series will explore how data discovery, fragmented data governance , ongoing data drift, and the need for ML explainability can all be overcome with a data catalog for accurate data and metadata record keeping. The Cloud Data Migration Challenge. Data pipeline orchestration.

Data Governance

Data Governance ML ML Cloud Data

Top 5 Use Cases of phData’s Advisor Tool

phData

MARCH 29, 2024

Founded in 2014 by three leading cloud engineers, phData focuses on solving real-world data engineering, operations, and advanced analytics problems with the best cloud platforms and products. Over the years, one of our primary focuses became Snowflake and migrating customers to this leading cloud data platform.

Data Engineering

Data Engineering Data Engineer Data Engineering Data Engineering

What are the Biggest Challenges with Migrating to Snowflake?

phData

FEBRUARY 5, 2024

There are many different third-party tools that work with Snowflake: Fivetran Fivetran is a tool dedicated to replicating applications, databases, events, and files into a high-performance data warehouse, such as Snowflake. How can organizations minimize downtime and ensure business continuity during the migration process to Snowflake?

SQL

SQL Database Data Quality Data Warehouse

How to Optimize Power BI and Snowflake for Advanced Analytics

phData

MAY 25, 2023

One big issue that contributes to this resistance is that although Snowflake is a great cloud data warehousing platform, Microsoft has a data warehousing tool of its own called Synapse. Gateways are being used as another layer of security between Snowflake or cloud data source and Power BI users.

Power BI

Power BI Analytics Analytics Azure

A Look Inside the Modern Analytics Stack

Dataversity

APRIL 1, 2021

In the data-driven world we live in today, the field of analytics has become increasingly important to remain competitive in business. In fact, a study by McKinsey Global Institute shows that data-driven organizations are 23 times more likely to outperform competitors in customer acquisition and nine times […].

Analytics

Analytics Analytics Data Silos Data Lakes

The Evolution of Customer Data Modeling: From Static Profiles to Dynamic Customer 360

phData

SEPTEMBER 27, 2024

Both persistent staging and data lakes involve storing large amounts of raw data. But persistent staging is typically more structured and integrated into your overall customer data pipeline. It’s not just a dumping ground for data, but a crucial step in your customer data processing workflow.

Data Modeling

Data Modeling Data Models Apache Kafka Data Lakes

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snorkel AI

MAY 26, 2023

What’s really important in the before part is having production-grade machine learning data pipelines that can feed your model training and inference processes. And that’s really key for taking data science experiments into production. And so that’s where we got started as a cloud data warehouse.

SQL

SQL ML ML Python

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snorkel AI

MAY 26, 2023

What’s really important in the before part is having production-grade machine learning data pipelines that can feed your model training and inference processes. And that’s really key for taking data science experiments into production. And so that’s where we got started as a cloud data warehouse.

SQL

SQL ML ML Python

Top 10 Python Scripts for use in Matillion for Snowflake

phData

OCTOBER 28, 2024

However, if the tool supposes an option where we can write our custom programming code to implement features that cannot be achieved using the drag-and-drop components, it broadens the horizon of what we can do with our data pipelines. The default value is 360 seconds.

Python

Python ETL AWS Database

What are the Advantages of Using Fivetran?

phData

JULY 19, 2023

In this blog post, we’ll dive into the amazing advantages of using Fivetran , a powerful data integration platform that will revolutionize the way you handle your data pipelines. They established an Information Architecture for Snowflake Data Cloud , enabling automated database and role creation.

Data Warehouse

Data Warehouse Database Data Pipeline SQL

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Webinars

Trending Sources

Exploring the Power of Microsoft Fabric: A Hands-On Guide with a Sales Use Case

Webinars

On-Prem vs. The Cloud: Key Considerations

Top 5 Tools for Building an Interactive Analytics App

How to Build ETL Data Pipeline in ML

How Databricks and Tableau customers are fueling innovation with data lakehouse architecture

Where Does Fivetran Fit into The Modern Data Stack?

What Is Fivetran and How Much Does It Cost?

How to Set up a CICD Pipeline for Snowflake to Automate Data Pipelines

The Modern Data Stack Explained: What The Future Holds

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Top 5 Fivetran Connectors for Healthcare

Getting Started With Matillion Data Productivity Cloud

Best Practices When Developing Matillion Jobs

How Databricks and Tableau customers are fueling innovation with data lakehouse architecture

Fivetran Modern Data Stack Conference 2023: Key Takeaways

Discover the Snowflake Architecture With All its Pros and Cons- NIX United

What Are The Best Third-Party Data Ingestion Tools For Snowflake?

How Fivetran + dbt provides Enterprise Scale to ELT Pipelines

11 Open-Source Data Engineering Tools Every Pro Should Use

How to Connect Snowflake to Python

What is Salesforce Data Cloud for Tableau?

The Data Integration Solution Checklist: Top 10 Considerations

Using Matillion Data Productivity Cloud to call APIs

Orchestrating Coalesce Pipelines Using Fivetran

Exploring the AI and data capabilities of watsonx

How Does Fivetran Drive Business Value?

Why a Streaming-First Approach to Digital Modernization Matters

Top 5 Fivetran Connectors For Financial Services

Using Fivetran’s New Hybrid Architecture to Replicate Data In Your Cloud Environment

Retail & CPG Questions phData Can Answer with Data

Choosing the Right ETL Platform: Benefits for Data Integration

The Cloud Connection: How Governance Supports Security

Top 5 Use Cases of phData’s Advisor Tool

What are the Biggest Challenges with Migrating to Snowflake?

How to Optimize Power BI and Snowflake for Advanced Analytics

A Look Inside the Modern Analytics Stack

The Evolution of Customer Data Modeling: From Static Profiles to Dynamic Customer 360

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snowflake Snowpark: cloud SQL and Python ML pipelines

Top 10 Python Scripts for use in Matillion for Snowflake

What are the Advantages of Using Fivetran?

Stay Connected