In the contemporary age of Big Data, data warehouse systems and data science analytics infrastructures have become essential components for organizations to store, analyze, and make data-driven decisions. So why use IaC for cloud data infrastructures?
While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.
When needed, the system can access an ODAP data warehouse to retrieve additional information. Document management: Documents are securely stored in Amazon S3, and when new documents are added, a Lambda function processes them into chunks.
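As a rough sketch of that chunking step (the bucket handling, chunk size, and overlap are illustrative assumptions, not the article's actual implementation), an S3-triggered Lambda handler might look like this:

import boto3

s3 = boto3.client("s3")

def chunk_text(text, chunk_size=1000, overlap=100):
    # Split text into overlapping chunks so nearby context is preserved.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def handler(event, context):
    # An S3 put-event notification carries the bucket name and object key.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    chunks = chunk_text(body)
    # Downstream steps (embedding, indexing) would consume these chunks.
    return {"document": key, "chunk_count": len(chunks)}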
Text analytics: Text analytics, also known as text mining, deals with unstructured text data, such as customer reviews, social media comments, or documents. It uses natural language processing (NLP) techniques to extract valuable insights from textual data. Poor data integration can lead to inaccurate insights.
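A minimal, dependency-free sketch of text mining at its simplest: counting the most frequent terms across a few customer reviews (the reviews and stop-word list are invented for illustration; real text analytics would lean on a proper NLP library).

import re
from collections import Counter

reviews = [
    "Great product, shipping was fast and support was helpful.",
    "Support was slow to respond, but the product itself is great.",
    "Terrible shipping experience, the box arrived damaged.",
]

stop_words = {"the", "was", "and", "is", "to", "but", "a", "itself"}

tokens = [
    word
    for review in reviews
    for word in re.findall(r"[a-z]+", review.lower())
    if word not in stop_words
]

# The most common terms hint at recurring themes (e.g. shipping, support).
print(Counter(tokens).most_common(5))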
Snowflake’s cloud-agnosticism, separation of storage and compute resources, and ability to handle semi-structured data have established it as a best-in-class cloud data warehousing solution. Snowflake supports data sharing and collaboration across organizations without the need for complex data pipelines.
Usually the term refers to the practices, techniques and tools that allow access and delivery across different fields and data structures in an organisation. Data management approaches are varied and may be categorised as follows: cloud data management, master data management, and data transformation.
A Matillion pipeline is a collection of jobs that extract, load, and transform (ETL/ELT) data from various sources into a target system, such as a cloud data warehouse like Snowflake. Document business rules and assumptions directly within the workflow, along with the data tables used and their role in the workflow.
Watsonx.ai is not just for data scientists and developers — business users can also access it via an easy-to-use interface that responds to natural language prompts for different tasks. With watsonx.data, businesses can quickly connect to data, get trusted insights and reduce data warehouse costs.
Fivetran is an automated data integration platform that offers a convenient solution for businesses to consolidate and sync data from disparate data sources. With over 160 data connectors available, Fivetran makes it easy to move data out of, into, and across any cloud data platform in the market.
The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is valued for its robustness, speed, and scalability in handling data. Typical components include data ingestion/integration services and data orchestration tools.
The demand for information repositories enabling business intelligence and analytics is growing exponentially, giving birth to cloud solutions. This need for vast, centralized storage manifests in data warehouses: specialized systems that aggregate data coming from numerous sources for centralized management and consistency.
“Vector databases are completely different from your cloud data warehouse.” You might have heard that statement if you are involved in creating vector embeddings for your RAG-based Gen AI applications. When documents are split into smaller chunks, search systems can find relevant sections more precisely and quickly.
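A small sketch of the retrieval idea behind that statement, using toy vectors in place of real embeddings (in practice the chunk vectors would come from an embedding model and live in a vector database rather than in memory):

import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" for three document chunks.
chunk_vectors = {
    "chunk-1": np.array([0.9, 0.1, 0.0, 0.2]),
    "chunk-2": np.array([0.1, 0.8, 0.3, 0.0]),
    "chunk-3": np.array([0.2, 0.1, 0.9, 0.4]),
}

query_vector = np.array([0.85, 0.15, 0.05, 0.1])

# Rank chunks by similarity to the query; the top hits feed the RAG prompt.
ranked = sorted(
    chunk_vectors.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)
print(ranked[0][0])  # most relevant chunk id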
By 2025, global data volumes are expected to reach 181 zettabytes, according to IDC. To harness this data effectively, businesses rely on ETL (Extract, Transform, Load) tools to extract, transform, and load data into centralized systems like data warehouses.
Fivetran enables healthcare organizations to ingest data securely and effectively from a variety of sources into their target destinations, such as Snowflake or other cloud data platforms, for further analytics or curation for sharing data with external providers or customers.
Lineage helps them identify the source of bad data to fix the problem fast. Manual lineage will give ARC a fuller picture of how data was created across the AWS S3 data lake, the Snowflake cloud data warehouse, and Tableau (and how it can be fixed). “Time is money,” said Leonard Kwok, Senior Data Analyst, ARC.
With the birth of cloud data warehouses, data applications, and generative AI, processing large volumes of data faster and cheaper is more approachable and desired than ever. First up, let’s dive into the foundation of every Modern Data Stack: a cloud-based data warehouse.
Data integration is essentially the Extract and Load portion of the Extract, Load, and Transform (ELT) process. Data ingestion involves connecting your data sources, including databases, flat files, streaming data, etc., to your data warehouse. Snowflake provides native ways for data ingestion.
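One of those native ingestion paths is staging a file and running COPY INTO. Below is a rough sketch using the Snowflake Python connector; the account, credentials, table, and file names are placeholders, not values from the article.

import snowflake.connector

# Connection details are placeholders; supply your own account and credentials.
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="RAW",
)

cur = conn.cursor()
try:
    # Stage the local file in the table's internal stage, then load it.
    cur.execute("PUT file:///tmp/orders.csv @%ORDERS")
    cur.execute(
        "COPY INTO ORDERS FROM @%ORDERS "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
finally:
    cur.close()
    conn.close()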
Hosted doc site for documentation: One of the most powerful features of dbt is the documentation you generate. This documentation can give different users insight into where data came from, what the profile of the data is, what the SQL looked like, and the DAG to know where the data is being used.
Matillion is also built for scalability and future data demands, with support for cloud data platforms such as the Snowflake Data Cloud, Databricks, Amazon Redshift, Microsoft Azure Synapse, and Google BigQuery, making it future-ready, everyone-ready, and AI-ready. That process will not take longer than 3 minutes!
Amazon Redshift is a fully managed, fast, secure, and scalable cloud data warehouse. Organizations often want to use SageMaker Studio to get predictions from data stored in a data warehouse such as Amazon Redshift. She is passionate about data-driven AI and deep learning.
These encoder-only architecture models are fast and effective for many enterprise NLP tasks, such as classifying customer feedback and extracting information from large documents. While they require task-specific labeled data for fine-tuning, they also offer clients the best cost/performance trade-off for non-generative use cases.
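For instance, classifying customer feedback with an encoder-only model can be as short as the sketch below. The checkpoint shown is just one public sentiment model used for illustration; as the passage notes, a production use case would fine-tune on task-specific labeled data.

from transformers import pipeline

# Load an encoder-only (DistilBERT) checkpoint fine-tuned for sentiment.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

feedback = [
    "The new dashboard is fantastic and easy to use.",
    "Support took a week to answer my ticket.",
]

for text, result in zip(feedback, classifier(feedback)):
    print(text, "->", result["label"], round(result["score"], 3))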
States’ existing investments in modernizing and enhancing ancillary supportive technologies (such as document management, web portals, mobile applications, data warehouses and location services) could negate the need for certain system requirements as part of the child support system modernization initiative.
Few actors in the modern data stack have inspired as much enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test and document data in the cloud data warehouse. But what does this mean from a practitioner’s perspective?
Fivetran includes features like data movement, transformations, robust security, and compatibility with third-party tools like dbt, Airflow, Atlan, and more. Its seamless integration with popular cloud data warehouses like Snowflake can provide the scalability needed as your business grows.
It simply wasn’t practical to adopt an approach in which all of an organization’s data would be made available in one central location, for all-purpose business analytics. To speed analytics, data scientists implemented pre-processing functions to aggregate, sort, and manage the most important elements of the data.
(improved document management capabilities, web portals, mobile applications, data warehouses, enhanced location services, etc.). For example, the core systems technology landscape for each state could be a mainframe legacy system with varying degrees of maturity, portability, reliability and scalability.
This two-part series will explore how data discovery, fragmented data governance, ongoing data drift, and the need for ML explainability can all be overcome with a data catalog for accurate data and metadata record keeping. The Cloud Data Migration Challenge. Data pipeline orchestration.
To date, the company’s data warehousing solutions are largely built from the same template used in 1979. In short, they still follow the model of multiple processors and massive disk storage, with data warehouse software on the top layer managing it all.
They provide loose coupling between the business logic that processes your data and the platform and data that it is executed upon. With Fivetran, you can quickly and easily switch between different data warehouse technologies in which to land your data, as well as popular open-source lake formats such as Apache Iceberg.
With Snowflake, data stewards can choose to leverage Snowflake’s governance policies. First, stewards are dependent on data warehouse admins to provide information and to create and edit enforcement policies in Snowflake. Alation’s data lineage helps organizations secure their data in the Snowflake Data Cloud.
We have an explosion, not only in the raw amount of data, but in the types of database systems for storing it (db-engines.com ranks over 340) and architectures for managing it (from operational data stores to data lakes to cloud data warehouses). Organizations are drowning in a deluge of data.
dbt Labs is a robust platform that allows individuals comfortable with SQL to incorporate software engineering best practices into their data transformation pipelines. These practices encompass aspects such as code versioning, testing, documentation, and modular programming, and they help teams know when models are updated.
LDP (HVR) Locations: In Fivetran’s LDP (HVR), a location refers to a specific storage space (a database or file store) from which LDP (HVR) can replicate data (a source location) or to which it can replicate data (a target location). For more information on HVA, visit the official documentation.
Alation is pleased to be named a dbt Metrics Partner and to announce the start of a partnership with dbt, which will bring dbt data into the Alation data catalog. In the modern data stack, dbt is a key tool to make data ready for analysis.
One big issue that contributes to this resistance is that although Snowflake is a great cloud data warehousing platform, Microsoft has a data warehousing tool of its own called Synapse. For more information on composite models, check out Microsoft’s official documentation.
Founded in 2014 by three leading cloud engineers, phData focuses on solving real-world data engineering, operations, and advanced analytics problems with the best cloud platforms and products. Over the years, one of our primary focuses became Snowflake and migrating customers to this leading cloud data platform.
Matillion is also built for scalability and future data demands, with support for cloud data platforms such as the Snowflake Data Cloud, Databricks, Amazon Redshift, Microsoft Azure Synapse, and Google BigQuery, making it future-ready, everyone-ready, and AI-ready. Check out the API documentation for our sample.
Data analysts spent many hours converting assets into reports or refactoring them in more graphics-native tools, such as Tableau. By implementing open source notebooks like Jupyter in a browser, data scientists can combine programming, documentation (using Markdown), tables, and graphics all in the same environment.
Another benefit of deterministic matching is that the process to build these identities is relatively simple, and tools your teams might already use, like SQL and dbt, can efficiently manage this process within your cloud data warehouse. However, targeted web advertising may only require linkage to a browser or device ID.
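As a toy sketch of deterministic matching (the column names and records are invented; in practice, as the passage notes, the same join logic typically runs as SQL or dbt models inside the cloud data warehouse):

import pandas as pd

crm = pd.DataFrame({
    "crm_id": [1, 2, 3],
    "email": ["Ana@Example.com", "bo@example.com", "cy@example.com"],
})
web = pd.DataFrame({
    "device_id": ["d-17", "d-42"],
    "email": ["ana@example.com ", "bo@example.com"],
})

# Deterministic matching: normalize the key, then join on exact equality.
for df in (crm, web):
    df["email_norm"] = df["email"].str.strip().str.lower()

identity = crm.merge(web, on="email_norm", how="inner", suffixes=("_crm", "_web"))
print(identity[["crm_id", "device_id", "email_norm"]])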
In my 7-year data science journey, I’ve been exposed to a number of different databases, including but not limited to Oracle Database, MS SQL, MySQL, EDW, and Apache Hadoop. The single most common way to create a view in a dataset is with the CREATE VIEW DDL statement, and you can refer to the official documentation to explore more options.
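The DDL itself is short. Here is a self-contained illustration using SQLite from the standard library (the table and view names are invented, and the same CREATE VIEW shape carries over to the databases mentioned above):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 120.0), (2, "west", 80.0), (3, "east", 45.5)],
)

# A view is just a named, stored SELECT; querying it re-runs the SELECT each time.
conn.execute(
    "CREATE VIEW east_orders AS SELECT id, amount FROM orders WHERE region = 'east'"
)
print(conn.execute("SELECT * FROM east_orders").fetchall())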
A data mesh is a conceptual architectural approach for managing data in large organizations. Traditional data management approaches often involve centralizing data in a data warehouse or data lake, leading to challenges like data silos, data ownership issues, and data access and processing bottlenecks.
Many of these greenhouse gas emissions can be attributed to travel (such as air travel, hotels, and meetings), the distribution of drugs and documents, and electricity used in coordination centers. Instead, a core component of decentralized clinical trials is a secure, scalable data infrastructure with strong data analytics capabilities.