Blog, Data Warehouse and ETL - Data Science Current

Snowflake Architecture & Key Concepts for Data Warehouse

Analytics Vidhya

JUNE 11, 2022

Introduction to Snowflake Architecture: This article offers an in-depth look at Snowflake architecture, how it stores and manages data, and its conceptual fragmentation concepts. By the end of this blog, you will also be able to understand how Snowflake […].

Data Warehouse

Data Warehouse Data Science Analytics

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Flipboard

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis. Create dbt models in dbt Cloud.

ETL

ETL Data Warehouse Analytics

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

Data Science Blog

SEPTEMBER 19, 2023

In the contemporary age of Big Data, Data Warehouse Systems and Data Science Analytics Infrastructures have become essential for organizations to store and analyze data and to make data-driven decisions. So why use IaC for Cloud Data Infrastructures?

Data Warehouse

Data Warehouse Azure SQL Database

CI/CD for Data Pipelines: A Game-Changer with AnalyticsCreator

Data Science Blog

MAY 20, 2024

Enter AnalyticsCreator: AnalyticsCreator, a powerful tool for data management, brings a new level of efficiency and reliability to the CI/CD process. It offers full BI-Stack Automation, from source to data warehouse through to frontend. It supports a holistic data model, allowing for rapid prototyping of various models.

Data Pipeline

Data Pipeline Data Warehouse Azure Data Lakes

Serverless High Volume ETL data processing on Code Engine

IBM Data Science in Practice

JANUARY 13, 2025

By Santhosh Kumar Neerumalla, Niels Korschinsky & Christian Hoeboer. Introduction: This blog post describes how to manage and orchestrate high-volume Extract-Transform-Load (ETL) loads using a serverless process based on Code Engine. The source data is unstructured JSON, while the target is a structured, relational database.

ETL

ETL Data Pipeline Database Data Warehouse

Understanding ETL Tools as a Data-Centric Organization

Smart Data Collective

SEPTEMBER 8, 2021

The ETL process is defined as the movement of data from its source to destination storage (typically a Data Warehouse) for future use in reports and analyses. The data is first extracted from a vast array of sources, then transformed and converted to a specific format based on business requirements.

ETL

ETL Hadoop Data Warehouse Data Pipeline

Data warehouse architecture

Dataconomy

OCTOBER 17, 2023

Want to create a robust data warehouse architecture for your business? The sheer volume of data that companies are now gathering is incredible, and understanding how best to store and use this information to extract top performance can be incredibly overwhelming.

Data Warehouse

Data Warehouse Big Data ETL

DataOps Highlights the Need for Automated ETL Testing (Part 2)

Dataversity

SEPTEMBER 27, 2021

DataOps, which focuses on automated tools throughout the ETL development cycle, responds to a huge challenge for data integration and ETL projects in general. ETL projects are increasingly based on agile processes and automated testing. ETL (extract, transform, load) projects are often devoid of automated testing.

DataOps

DataOps ETL Data Pipeline Data Warehouse

5 strategies for data security and governance in data warehousing: ensuring data protection and compliance

Data Science Dojo

SEPTEMBER 6, 2023

Maintaining the security and governance of data within a data warehouse is of utmost importance. Data Security: A Multi-layered Approach. In data warehousing, data security is not a single barrier but a well-constructed series of layers, each contributing to protecting valuable information.

Data Warehouse

Data Warehouse Data Governance Data Quality ETL

Avoid These Mistakes on Your Data Warehouse and BI Projects

Dataversity

DECEMBER 7, 2020

Data warehousing (DW) and business intelligence (BI) projects are a high priority for many organizations who seek to empower more and better data-driven decisions and actions throughout their enterprises. These groups want to expand their user base for data discovery, BI, and analytics so that their business […].

Data Warehouse

Data Warehouse Business Intelligence Analytics

Learn the Differences Between ETL and ELT

Pickl AI

OCTOBER 6, 2024

Summary: This blog explores the key differences between ETL and ELT, detailing their processes, advantages, and disadvantages. Understanding these methods helps organizations optimize their data workflows for better decision-making. What is ETL? ETL stands for Extract, Transform, and Load.

ETL

ETL Data Warehouse Data Quality Data Lakes

CI/CD für Datenpipelines – Ein Game-Changer mit AnalyticsCreator

Data Science Blog

JULY 20, 2024

It offers full automation of the BI stack and supports a broad range of data warehouses, analytical databases, and frontends. Automation: Generates SQL code, DACPAC files, SSIS packages, Data Factory ARM templates, and XMLA files. Data Lakes: Supports MS Azure Blob Storage.

Azure

Azure SQL Power BI Data Lakes

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

However, efficient use of ETL pipelines in ML can help make their life much easier. This article explores the importance of ETL pipelines in machine learning, a hands-on example of building ETL pipelines with a popular tool, and suggests the best ways for data engineers to enhance and sustain their pipelines.

ETL

ETL Data Pipeline ML

Data Lakes Vs. Data Warehouse: Its significance and relevance in the data world

Pickl AI

NOVEMBER 15, 2023

Discover the nuanced dissimilarities between Data Lakes and Data Warehouses. Data management in the digital age has become a crucial aspect of businesses, and two prominent concepts in this realm are Data Lakes and Data Warehouses. It acts as a repository for storing all the data.

Data Lakes

Data Lakes Data Warehouse Database ETL

Snowflake ETL Face-Off: Alteryx Designer vs. Matillion ETL

phData

MARCH 14, 2024

In data analytics processes, choosing the right tools is crucial for ensuring efficiency and scalability. Two popular players in this area are Alteryx Designer and Matillion ETL, both offering strong solutions for handling data workflows with Snowflake Data Cloud integration.

ETL

ETL SQL Data Warehouse Data Pipeline

Becoming a Prized Data Warehouse and Data Integration Tester

Dataversity

MARCH 1, 2021

Data warehouse (DW) testers with data integration QA skills are in demand. Data warehouse disciplines and architectures are well established and often discussed in the press, books, and conferences. Each business often uses one or more data […].

Data Warehouse

Data Warehouse ETL Data Governance Data Quality

Avoid These Mistakes on Your Data Warehouse and BI Projects: Part 3

Dataversity

FEBRUARY 1, 2021

Project sponsors seek to empower more and better data-driven decisions and actions throughout their enterprise; they intend to expand their […]. The post Avoid These Mistakes on Your Data Warehouse and BI Projects: Part 3 appeared first on DATAVERSITY.

Data Warehouse

Data Warehouse Business Intelligence Data Profiling

Avoid These Mistakes on Your Data Warehouse and BI Projects: Part 2

Dataversity

JANUARY 11, 2021

Project sponsors seek to empower more and better data-driven decisions and actions throughout their enterprise; they intend to expand their user base for […].

Data Warehouse

Data Warehouse Business Intelligence Data Profiling

DataOps Highlights the Need for Automated ETL Testing (Part 1)

Dataversity

AUGUST 30, 2021

DataOps, which focuses on automated tools throughout the ETL development cycle, responds to a huge challenge for data integration and ETL projects in general. ETL projects are increasingly based on agile processes and automated testing. ETL (extract, transform, load) projects are often devoid of automated testing.

DataOps

DataOps ETL Data Pipeline Data Warehouse

Top ETL Tools: Unveiling the Best Solutions for Data Integration

Pickl AI

JUNE 7, 2024

Summary: Choosing the right ETL tool is crucial for seamless data integration. Top contenders like Apache Airflow and AWS Glue offer unique features, empowering businesses with efficient workflows, high data quality, and informed decision-making capabilities. Choosing the right ETL tool is crucial for smooth data management.

ETL

ETL Data Quality Data Pipeline Data Warehouse

Understanding the ETL vs. ELT Alphabet Soup and When to Use Each

Dataversity

MAY 17, 2021

There are advantages and disadvantages to both ETL and ELT. To understand which method is a better fit, it’s important to understand what it means when one letter comes before the other.

ETL

ETL Data Lakes Data Warehouse Database

How Reverse ETL Powers Modern Customer Marketing: Concrete Examples

Dataversity

JANUARY 27, 2023

Up until recently, feedback forms and […] If you’re part of a customer marketing team, you know that most people would say “not very often.” This is precisely the plight of the average customer marketer.

ETL

ETL Data Warehouse Analytics

What Is Fivetran and How Much Does It Cost?

phData

MARCH 8, 2023

Fivetran, a cloud-based automated data integration platform, has emerged as a leading choice among businesses looking for an easy and cost-effective way to unify their data from various sources. Fivetran is used by businesses to centralize data from various sources into a single, comprehensive data warehouse.

Data Warehouse

Data Warehouse Data Engineering

How Fivetran and dbt Help With ELT

phData

AUGUST 9, 2023

If you’ve been watching how Snowflake Data Cloud has been growing and changing over the years, you’ll see that two tools have made very large impacts on the Modern Data Stack: Fivetran and dbt. This is unlike the more traditional ETL method, where data is transformed before loading into the data warehouse.

ETL

ETL Data Warehouse Cloud Data Big Data

Optimizing Snowflake’s Performance for Data Vault Modeling

phData

OCTOBER 9, 2023

However, to harness the full potential of Snowflake’s performance capabilities, it is essential to adopt strategies tailored explicitly for data vault modeling. Of all key types, hash keys provide the best data load performance, consistency, and auditability.

ETL

ETL Clustering Data Warehouse SQL

Best Practices When Developing Matillion Jobs

phData

SEPTEMBER 2, 2024

In this blog, we will cover the best practices for developing jobs in Matillion, an ETL/ELT tool built specifically for cloud database platforms. The blog will be divided into three broad sections: Design, SDLC, and Security, each with its best practices. What Are Matillion Jobs and Why Do They Matter?

ETL

ETL Data Warehouse SQL Database

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Flipboard

DECEMBER 11, 2024

Organizations are building data-driven applications to guide business decisions, improve agility, and drive innovation. Many of these applications are complex to build because they require collaboration across teams and the integration of data, tools, and services.

SQL

SQL AWS Data Lakes AI

How to Best Leverage Outsourced Call Center Data with Snowflake

phData

FEBRUARY 3, 2023

More and more businesses are looking to better leverage their outsourced call center data to make more data-driven decisions. To do this on your own, you would need to create a data warehouse, optimize the reporting performance, and very clearly visualize the data. Another way to think of it is as Data Activation.

ETL

ETL Data Warehouse Analytics

Supercharge your data strategy: Integrate and innovate today leveraging data integration

IBM Journey to AI blog

OCTOBER 22, 2024

A flexible approach that lets tools coexist and lets pipelines execute where the data lives, either on targeted data planes or by pushing transformation logic down to data warehouses or lakehouses, decreases unnecessary data movement and reduces or eliminates data egress charges.

Data Silos

Data Silos Data Pipeline DataOps Business Intelligence

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

It is known to have benefits in handling data due to its robustness, speed, and scalability. A typical modern data stack consists of the following: A data warehouse. Data ingestion/integration services. Reverse ETL tools. Data orchestration tools. A Note on the Shift from ETL to ELT.

Data Warehouse

Data Warehouse ETL Tableau Cloud Data

Where Does Fivetran Fit into The Modern Data Stack?

phData

JULY 17, 2023

Over the past few decades, the corporate data landscape has changed significantly. The shift from on-premise databases and spreadsheets to the modern era of cloud data warehouses and AI/LLMs has transformed what businesses can do with data. This is where Fivetran and the Modern Data Stack come in.

Data Warehouse

Data Warehouse Data Pipeline Cloud Data ETL

Tackling AI’s data challenges with IBM databases on AWS

IBM Journey to AI blog

MARCH 14, 2024

Db2 Warehouse fully supports open formats such as Parquet, Avro, ORC and Iceberg table format to share data and extract new insights across teams without duplication or additional extract, transform, load (ETL). This allows you to scale all analytics and AI workloads across the enterprise with trusted data. 

AWS

AWS Database ETL AI

How to Use Fivetran to Ingest Data for a Composable CDP (Customer Data Platform)

phData

JUNE 6, 2024

Marketing and business professionals must effectively manage and leverage their customer data to stay competitive. In this blog, we will explore how marketing professionals have approached the challenge of effectively using their vast amount of customer data using Composable CDPs. Why use Fivetran for Composable CDP?

Data Warehouse

Data Warehouse Cloud Data ETL Data Modeling

How Thomson Reuters delivers personalized content subscription plans at scale using Amazon Personalize

AWS Machine Learning Blog

JANUARY 6, 2023

TR has a wealth of data that could be used for personalization that has been collected from customer interactions and stored within a centralized data warehouse. The user interactions data from various sources is persisted in their data warehouse. The following diagram illustrates the ML training pipeline.

AWS

AWS Data Warehouse ML

Top 5 Fivetran Connectors for Healthcare

phData

APRIL 29, 2024

In our previous blog, Top 5 Fivetran Connectors for Financial Services, we explored Fivetran’s capabilities that address the data integration needs of the finance industry. Now, let’s cover the healthcare industry, which also has a surging demand for data and analytics, along with the underlying processes to make it happen.

SQL

SQL Data Warehouse Azure Cloud Data

What is Data Integration in Data Mining with Example?

Pickl AI

JUNE 28, 2023

But this data is often stored in disparate systems and formats. Here comes the role of Data Mining. Read this blog to learn more about Data Integration in Data Mining. The process encompasses various techniques that help filter useful data from the resource, thereby improving data quality and consistency.

Data Mining

Data Mining ETL

What exactly is Data Profiling: It’s Examples & Types

Pickl AI

AUGUST 31, 2023

Accordingly, Data Profiling in ETL becomes important for ensuring higher data quality as per business requirements. The following blog provides complete information on, and an in-depth understanding of, what data profiling is, its benefits, and the various tools used in the method.

Data Profiling

Data Profiling ETL Data Quality Data Wrangling

Transitioning off Amazon Lookout for Metrics

AWS Machine Learning Blog

OCTOBER 9, 2024

Using Amazon Redshift ML for anomaly detection: Amazon Redshift ML makes it easy to create, train, and apply machine learning models using familiar SQL commands in Amazon Redshift data warehouses. To capture unanticipated, less obvious data patterns, you can enable anomaly detection.

AWS

AWS ML Data Quality

Data platform trinity: Competitive or complementary?

IBM Journey to AI blog

JANUARY 18, 2023

They defined it as: “A data lakehouse is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data.”

Data Lakes

Data Lakes Data Warehouse Azure Apache Hadoop

Considerations and Approaches to Loading Reference Data into Snowflake

phData

AUGUST 9, 2024

Typically, this data is scattered across Excel files on business users’ desktops. Snowflake cannot natively read files on these services, so an ETL service is needed to upload the data. ETL applications are often expensive and require some level of expertise to run.

ETL

ETL Data Warehouse Data Governance Tableau

Getting Started With Matillion Data Productivity Cloud

phData

NOVEMBER 28, 2023

In this blog, we will show you how easy it is to get your Data Productivity Cloud environment up and running and how you can start your studies on the platform. What is Matillion Data Productivity Cloud? Now, you can start to develop your own Matillion job using the Data Productivity Cloud connected to your data warehouse.

Data Warehouse

Data Warehouse Data Pipeline ETL Azure

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

By employing robust data modeling techniques, businesses can unlock the true value of their data lake and transform it into a strategic asset. With many data modeling methodologies and processes available, choosing the right approach can be daunting. Want to learn more about data governance?

Data Lakes

Data Lakes Data Models Data Modeling Data Warehouse

Build an automated insight extraction framework for customer feedback analysis with Amazon Bedrock and Amazon QuickSight

AWS Machine Learning Blog

JUNE 25, 2024

The raw data is processed by an LLM using a preconfigured user prompt. The processed output is stored in a database or data warehouse, such as Amazon Relational Database Service (Amazon RDS). The stored data is visualized in a BI dashboard using QuickSight. The LLM generates output based on the user prompt.

AWS

AWS Natural Language Processing Machine Learning
