Cloud Data, Data Pipeline and ETL - Data Science Current

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Flipboard

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis. Create dbt models in dbt Cloud.

ETL

ETL Data Warehouse Analytics Analytics

The power of remote engine execution for ETL/ELT data pipelines

IBM Journey to AI blog

MAY 15, 2024

Two of the more popular methods, extract, transform, load (ETL) and extract, load, transform (ELT), are both highly performant and scalable. Data engineers build data pipelines, which are called data integration tasks or jobs, as incremental steps to perform data operations and orchestrate these data pipelines in an overall workflow.

Data Pipeline

Data Pipeline ETL SQL Database

How Cloud Data Platforms improve Shopfloor Management

Data Science Blog

FEBRUARY 4, 2023

The fusion of data in a central platform enables smooth analysis to optimize processes and increase business efficiency in the world of Industry 4.0 using methods from business intelligence, process mining and data science. Cloud Data Platform for shopfloor management and data sources such as MES, ERP, PLM and machine data.

Cloud Data

Cloud Data Data Science Business Intelligence Business Intelligence

How to Build ETL Data Pipeline in ML

The MLOps Blog

MAY 17, 2023

However, efficient use of ETL pipelines in ML can help make their life much easier. This article explores the importance of ETL pipelines in machine learning, a hands-on example of building ETL pipelines with a popular tool, and suggests the best ways for data engineers to enhance and sustain their pipelines.

ETL

ETL Data Pipeline ML ML

List of ETL Tools: Explore the Top ETL Tools for 2025

Pickl AI

APRIL 9, 2025

Summary: This guide explores the top list of ETL tools, highlighting their features and use cases. It provides insights into considerations for choosing the right tool, ensuring businesses can optimize their data integration processes for better analytics and decision-making. What is ETL? What are ETL Tools?

ETL

ETL Data Warehouse AWS Business Intelligence

Choosing the Right ETL Platform: Benefits for Data Integration

Pickl AI

OCTOBER 15, 2024

Summary: Selecting the right ETL platform is vital for efficient data integration. Consider your business needs, compare features, and evaluate costs to enhance data accuracy and operational efficiency. Introduction: In today’s data-driven world, businesses rely heavily on ETL platforms to streamline data integration processes.

ETL

ETL Azure AWS Data Governance

How to Set up a CICD Pipeline for Snowflake to Automate Data Pipelines

phData

JUNE 14, 2023

In recent years, data engineering teams working with the Snowflake Data Cloud platform have embraced the continuous integration/continuous delivery (CI/CD) software development process to develop data products and manage ETL/ELT workloads more efficiently.

Data Pipeline

Data Pipeline Database SQL Data Engineering

What Is Fivetran and How Much Does It Cost?

phData

MARCH 8, 2023

Fivetran is an automated data integration platform that offers a convenient solution for businesses to consolidate and sync data from disparate data sources. With over 160 data connectors available, Fivetran makes it easy to move data out of, into, and across any cloud data platform in the market.

Data Warehouse

Data Warehouse Data Engineering Data Engineer Data Engineering

On-Prem vs. The Cloud: Key Considerations

phData

FEBRUARY 21, 2025

With a traditional on-prem data warehouse, an organization will face more substantial Capital Expenditures (CapEx), or one-time costs, such as infrastructure setup, network configuration, and investments in servers and storage devices. When investing in a cloud data warehouse, the Operational Expenditures (OpEx) will be larger.

Data Warehouse

Data Warehouse Cloud Data ETL Cloud Computing

Where Does Fivetran Fit into The Modern Data Stack?

phData

JULY 17, 2023

Over the past few decades, the corporate data landscape has changed significantly. The shift from on-premise databases and spreadsheets to the modern era of cloud data warehouses and AI/LLMs has transformed what businesses can do with data. This is where Fivetran and the Modern Data Stack come in.

Data Warehouse

Data Warehouse Data Pipeline Cloud Data ETL

Best Practices When Developing Matillion Jobs

phData

SEPTEMBER 2, 2024

Best practices are a pivotal part of any software development, and data engineering is no exception. This ensures the data pipelines we create are robust, durable, and secure, providing the desired data to the organization effectively and consistently. What Are Matillion Jobs and Why Do They Matter?

ETL

ETL Data Warehouse SQL Database

How The Explosive Growth Of Data Access Affects Your Engineer’s Team Efficiency

Smart Data Collective

OCTOBER 17, 2022

If you haven’t already, moving to the cloud can be a realistic alternative. Cloud data warehouses provide various advantages, including the ability to be more scalable and elastic than conventional warehouses. Can’t get to the data. Data pipeline maintenance.

Big Data

Big Data Big Data Data Engineering Data Engineer

What is the Snowflake Data Cloud and How Much Does it Cost?

phData

NOVEMBER 9, 2023

Snowflake’s Data Cloud has emerged as a leader in cloud data warehousing. As a fundamental piece of the modern data stack, Snowflake is helping thousands of businesses store, transform, and derive insights from their data more easily, quickly, and efficiently than ever before.

Data Warehouse

Data Warehouse Data Lakes Clustering Cloud Data

Optimizing Matillion Workflows: A Guide to Visual Design and Best Practices

phData

APRIL 28, 2025

For those unfamiliar with Git or Git practices, please refer to Git for Business Users with Matillion DPC. What is a Matillion Pipeline? A Matillion pipeline is a collection of jobs that extract, load, and transform (ETL/ELT) data from various sources into a target system, such as a cloud data warehouse like Snowflake.

AI

AI AI SQL ETL

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

Data ingestion/integration services. Reverse ETL tools. Data orchestration tools. These tools are used to manage big data, which is defined as data that is too large or complex to be processed by traditional means. How Did the Modern Data Stack Get Started? A Note on the Shift from ETL to ELT.

Data Warehouse

Data Warehouse ETL Tableau Cloud Data

A Guide to Choose the Best Data Science Bootcamp

Data Science Dojo

JULY 3, 2024

Big Data Technologies: Handling and processing large datasets using tools like Hadoop, Spark, and cloud platforms such as AWS and Google Cloud. Data Processing and Analysis: Techniques for data cleaning, manipulation, and analysis using libraries such as Pandas and NumPy in Python.

Data Science

Data Science Machine Learning Machine Learning Data Visualization

Top 5 Fivetran Connectors for Healthcare

phData

APRIL 29, 2024

Understanding Fivetran: Fivetran is a popular Software-as-a-Service platform that enables users to automate the movement of data and ETL processes across diverse sources to a target destination. For a longer overview, along with insights and best practices, please feel free to jump back to the previous blog.

SQL

SQL Data Warehouse Azure Cloud Data

Why a Streaming-First Approach to Digital Modernization Matters

Precisely

APRIL 3, 2023

How can an organization enable flexible digital modernization that brings together information from multiple data sources, while still maintaining trust in the integrity of that data? To speed analytics, data scientists implemented pre-processing functions to aggregate, sort, and manage the most important elements of the data.

ETL

ETL Analytics Analytics Database

How to Maximize Time to Value with Fivetran and dbt

phData

OCTOBER 17, 2023

The story is all too common – a business user requests some data, the data team creates/prioritizes a ticket, and said ticket is completed after some number of months (or weeks if you’re lucky) – just to have the data be wrong, and the whole process starts again. Those are scary for data teams to change.

ETL

ETL Data Pipeline Data Engineering Data Engineer

Getting Started With Matillion Data Productivity Cloud

phData

NOVEMBER 28, 2023

In July 2023, Matillion launched its fully SaaS platform, Data Productivity Cloud, aiming to create a future-ready, everyone-ready, and AI-ready environment that companies can easily adopt to start automating their data pipelines with code, low code, or no code at all. Why Does it Matter?

Data Warehouse

Data Warehouse Data Pipeline ETL Azure

Ensure Success with Trusted Data When Moving To The Cloud

Precisely

JUNE 2, 2023

As companies strive to leverage AI/ML, location intelligence, and cloud analytics into their portfolio of tools, siloed mainframe data often stands in the way of forward momentum. The right data integration technology can vastly simplify things. Streaming data pipelines help to make data available and accessible in real time.

Data Silos

Data Silos ETL Data Quality Data Pipeline

How to Connect Snowflake to Python

phData

JANUARY 5, 2023

Python has proven proficient in setting up pipelines, maintaining data flows, and transforming data with its simple syntax and proficiency in automation. Having been built completely for and in the cloud, the Snowflake Data Cloud has become an industry leader in cloud data platforms.

Python

Python Data Engineering Data Engineering Data Engineering

Using Matillion Data Productivity Cloud to call APIs

phData

JANUARY 19, 2024

Matillion’s Data Productivity Cloud is a versatile platform designed to increase the productivity of data teams. It provides a unified platform for creating and managing data pipelines that are effective for both coders and non-coders. With that, you can cover most of the necessary connections.

Data Pipeline

Data Pipeline Data Warehouse ETL Azure

Turnkey Cloud DataOps: Solution from Alation and Accenture

Alation

MARCH 22, 2022

As the latest iteration in this pursuit of high-quality data sharing, DataOps combines a range of disciplines. It synthesizes all we’ve learned about agile, data quality, and ETL/ELT. They created each capability as modules, which can either be used independently or together to build automated data pipelines.

DataOps

DataOps Data Pipeline Data Engineering Data Engineer

How to Choose a Futureproof Data Integration Solution

Precisely

MAY 23, 2024

The sudden popularity of cloud data platforms like Databricks, Snowflake, Amazon Redshift, Amazon RDS, Confluent Cloud, and Azure Synapse has accelerated the need for powerful data integration tools that can deliver large volumes of information from transactional applications to the cloud reliably, at scale, and in real time.

Data Governance

Data Governance ETL Data Pipeline Azure

Alation 2023.1: Easing Self-Service for the Modern Data Stack with Databricks and dbt Labs

Alation

APRIL 4, 2023

However, the race to the cloud has also created challenges for data users everywhere, including: Cloud migration is expensive, migrating sensitive data is risky, and navigating between on-prem sources is often confusing for users. To build effective data pipelines, they need context (or metadata) on every source.

DataOps

DataOps Data Engineering Data Engineer Data Engineering

What Are The Best Third-Party Data Ingestion Tools For Snowflake?

phData

FEBRUARY 14, 2023

Source data formats can only be Parquet, JSON, or Delimited Text (CSV, TSV, etc.). StreamSets Data Collector: StreamSets Data Collector Engine is an easy-to-use data pipeline engine for streaming, CDC, and batch ingestion from any source to any destination.

Data Warehouse

Data Warehouse Azure AWS Database

How Does Snowpark Work?

phData

FEBRUARY 7, 2024

The Snowflake Data Cloud is a leading cloud data platform that provides various features and services for data storage, processing, and analysis. A new feature that Snowflake offers is called Snowpark, which provides an intuitive library for querying and processing data at scale in Snowflake.

Python

Python ML ML SQL

What are the Biggest Challenges with Migrating to Snowflake?

phData

FEBRUARY 5, 2024

Replicate can interact with a wide variety of databases, data warehouses, and data lakes (on-premise or based in the cloud). Migrating Your Pipelines and Code: It’s more than likely that your business has years of code being used in its data pipelines.

SQL

SQL Database Data Quality Data Warehouse

The Ultimate Modern Data Stack Migration Guide

phData

JULY 18, 2023

With the birth of cloud data warehouses, data applications, and generative AI, processing large volumes of data faster and cheaper is more approachable and desired than ever. This typically results in long-running ETL pipelines that cause decisions to be made on stale or old data.

Data Warehouse

Data Warehouse Analytics Analytics SQL

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

The next generation of Db2 Warehouse SaaS and Netezza SaaS on AWS fully support open formats such as Parquet and Iceberg table format, enabling the seamless combination and sharing of data in watsonx.data without the need for duplication or additional ETL. Savings may vary depending on configurations, workloads and vendor.

AI

AI AI Machine Learning Machine Learning

A Look Inside the Modern Analytics Stack

Dataversity

APRIL 1, 2021

In the data-driven world we live in today, the field of analytics has become increasingly important to remain competitive in business. In fact, a study by McKinsey Global Institute shows that data-driven organizations are 23 times more likely to outperform competitors in customer acquisition and nine times […].

Analytics

Analytics Analytics Data Silos Data Lakes

10 Advantages of Real-Time Data Streaming in Commerce

Dataversity

MARCH 12, 2024

While we haven’t built technology that enables real-time matter transfer yet, modern science is pursuing concepts like superposition and quantum teleportation to facilitate information transfer across any distance […]

Data Pipeline

Data Pipeline ETL Cloud Data Analytics

Top 10 Python Scripts for use in Matillion for Snowflake

phData

OCTOBER 28, 2024

Modern low-code/no-code ETL tools allow data engineers and analysts to build pipelines seamlessly using a drag-and-drop and configure approach with minimal coding. One such option is the availability of Python Components in Matillion ETL, which allows us to run Python code inside the Matillion instance.

Python

Python ETL AWS Database

Fivetran Modern Data Stack Conference 2023: Key Takeaways

Alation

APRIL 14, 2023

Last week, the Alation team had the privilege of joining IT professionals, business leaders, and data analysts and scientists for the Modern Data Stack Conference in San Francisco. In “The modern data stack is dead, long live the modern data stack!” Cloud costs are growing prohibitive.

Data Pipeline

Data Pipeline Data Warehouse Cloud Data ETL

The Evolution of Customer Data Modeling: From Static Profiles to Dynamic Customer 360

phData

SEPTEMBER 27, 2024

If the event log is your customer’s diary, think of persistent staging as their scrapbook – a place where raw customer data is collected, organized, and kept for future reference. In traditional ETL (Extract, Transform, Load) processes in CDPs, staging areas were often temporary holding pens for data.

Data Modeling

Data Modeling Data Models Apache Kafka Data Lakes
