So why use IaC for cloud data infrastructure? It ensures that the data models and queries developed by data professionals stay consistent with the underlying infrastructure. Enhanced Security and Compliance: Data warehouses often store sensitive information, making security a paramount concern.
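To make the idea concrete, here is a minimal IaC sketch using Pulumi's Python SDK with the AWS provider; the resource names and settings are illustrative assumptions, not taken from the article.

```python
# Minimal Pulumi sketch (assumed example): the storage backing a data
# platform is declared in code, so it is versioned, reviewable, and
# reproducible alongside the data models that depend on it.
import pulumi
import pulumi_aws as aws

# Hypothetical bucket for raw data; versioning is enforced declaratively,
# which supports the consistency and compliance points above.
raw_bucket = aws.s3.Bucket(
    "raw-data",
    versioning=aws.s3.BucketVersioningArgs(enabled=True),
)

pulumi.export("raw_bucket_name", raw_bucket.id)
```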
Summary: Choosing the right ETL tool is crucial for seamless data integration. Top contenders like Apache Airflow and AWS Glue offer unique features, empowering businesses with efficient workflows, high data quality, and informed decision-making.
Key features of cloud analytics solutions include data models, processing applications, and analytics models. Data models help visualize and organize data, processing applications handle large datasets efficiently, and analytics models aid in understanding complex data sets, laying the foundation for business intelligence.
However, to fully harness the potential of a data lake, effective data modeling methodologies and processes are crucial. Data modeling plays a pivotal role in defining the structure, relationships, and semantics of data within a data lake, and it helps ensure consistency of data throughout the lake.
An example directed acyclic graph (DAG) might automate data ingestion, processing, model training, and deployment tasks, ensuring that each step runs in the correct order and at the right time. It’s worth mentioning, though, that Airflow orchestrates these steps rather than executing them itself at runtime, as is usual for extract, transform, and load (ETL) tasks.
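As a rough illustration of such a DAG, here is a minimal Airflow sketch; the task bodies are placeholder assumptions, not code from the article, and the schedule is arbitrary.

```python
# A minimal, illustrative Airflow DAG for the pipeline described above:
# ingest -> process -> train -> deploy. Task bodies are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    ...  # placeholder: pull raw data from a source system


def process():
    ...  # placeholder: clean and transform the raw data


def train():
    ...  # placeholder: fit a model on the processed data


def deploy():
    ...  # placeholder: publish the trained model


with DAG(
    dag_id="ml_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
):
    ingest_t = PythonOperator(task_id="ingest", python_callable=ingest)
    process_t = PythonOperator(task_id="process", python_callable=process)
    train_t = PythonOperator(task_id="train", python_callable=train)
    deploy_t = PythonOperator(task_id="deploy", python_callable=deploy)

    # Dependencies enforce that each step runs in the correct order.
    ingest_t >> process_t >> train_t >> deploy_t
```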
A common problem phData solves is migrating an existing data platform to the Snowflake Data Cloud in the best possible manner. Data flows from the current data platform to the destination, and either way it’s important to understand what data is transformed, and how.
Hierarchies align data modelling with business processes, making it easier to analyse data in a context that reflects real-world operations. Designing Hierarchies: Designing effective hierarchies requires careful consideration of the business requirements and the data model.
It is the process of converting raw data into relevant and practical knowledge to help evaluate the performance of businesses, discover trends, and make well-informed choices. Data gathering, data integration, data modelling, information analysis, and data visualization are all part of business intelligence.
Data can be structured or unstructured (e.g., documents and images). The diversity of data sources allows organizations to create a comprehensive view of their operations and market conditions. Data Integration: Once data is collected from various sources, it needs to be integrated into a cohesive format.
Few actors in the modern data stack have inspired as much enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test, and document data in the cloud data warehouse. This graph is an example of one analysis, documented in our internal catalog.
Leverage dbt’s `test` macros within your models and add constraints to ensure data integrity between Data Vault entities. Maintain lineage and documentation: Data Vault emphasizes documenting the data lineage and providing clear documentation for each model.
With the “Data Productivity Cloud” launch, Matillion has achieved a balance of simplifying source control, collaboration, and DataOps by elevating Git integration to a “first-class citizen” within the framework. In Matillion ETL, the Git integration enables an organization to connect to any Git offering (e.g.,
Apache Airflow: Airflow is open-source ETL software that is very useful when paired with Snowflake. dbt offers a SQL-first transformation workflow that lets teams build data transformation pipelines while following software engineering best practices like CI/CD, modularity, and documentation.
Data Preprocessing: Here, you can process the unstructured data into a format that can be used for other downstream tasks. For instance, if the collected data is a text document in the form of a PDF, the data preprocessing (or preparation) stage can extract tables from this document. Unstructured.io
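As a sketch of that step, the open-source `unstructured` library from Unstructured.io (named above) can partition a PDF and pull out its tables; the file name and options here are illustrative assumptions.

```python
# Sketch of PDF table extraction with the `unstructured` library.
from unstructured.partition.pdf import partition_pdf

elements = partition_pdf(
    filename="report.pdf",       # hypothetical input document
    strategy="hi_res",           # layout-aware parsing, better for tables
    infer_table_structure=True,  # preserve table structure in metadata
)

# Keep only the elements the parser classified as tables.
tables = [el for el in elements if el.category == "Table"]
for table in tables:
    print(table.metadata.text_as_html)  # structured table content as HTML
```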
Using SQL-centric transformations to model data to be deployed. dbt is also great for data lineage and documentation to empower business analysts to make informed decisions on their data. dbt’s addition of data freshness, quality, and cataloging is just another example of Sigma’s vision.
In this article, we’ll explore how AI can transform unstructured data into actionable intelligence, empowering you to make informed decisions, enhance customer experiences, and stay ahead of the competition. What is Unstructured Data? Word2Vec, GloVe, and BERT are well-established methods for generating embeddings from textual data.
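For instance, here is a minimal sketch of training Word2Vec embeddings with the gensim library; the toy corpus is an assumption for illustration.

```python
# Minimal Word2Vec sketch using gensim; the tiny corpus is illustrative.
from gensim.models import Word2Vec

# Tokenized toy corpus: each inner list is one sentence.
corpus = [
    ["customer", "opened", "a", "support", "ticket"],
    ["agent", "resolved", "the", "support", "ticket"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=50,  # dimensionality of each word vector
    window=3,        # context window size
    min_count=1,     # keep rare words in this tiny corpus
)

vector = model.wv["ticket"]                       # 50-dim embedding
neighbors = model.wv.most_similar("support", topn=2)
print(vector.shape, neighbors)
```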
Consider factors such as data volume, query patterns, and hardware constraints, and use indexing and partitioning strategies to improve query performance. Document and Communicate: Maintain thorough documentation of fact table designs, including definitions, calculations, and relationships.
Power BI Dataflows provide vital functionality that effectively empowers users to cleanse and reshape data from various sources. These Dataflows are crucial in fostering consistency and reducing the duplication of repetitive ETL (Extract, Transform, Load) steps by reusing transformations.
Slow Response to New Information: Legacy data systems often lack the computational power necessary to run efficiently and can be cost-inefficient to scale. This typically results in long-running ETL pipelines that cause decisions to be made on stale data.
MongoDB is a NoSQL database that handles large-scale data and modern application requirements. Unlike traditional relational databases, MongoDB stores data in flexible, JSON-like documents, allowing for dynamic schemas. In contrast, MongoDB’s document-based model allows for a more flexible and scalable approach.
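A brief pymongo sketch of that dynamic-schema point follows; the connection string, database, and fields are illustrative assumptions.

```python
# pymongo sketch: two documents with different shapes in one collection,
# which a fixed relational schema would not allow without a migration.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical instance
products = client["shop"]["products"]

products.insert_one({"name": "keyboard", "price": 49.99, "layout": "ISO"})
products.insert_one({"name": "ebook", "price": 9.99, "pages": 312})

# Query across both shapes with a single filter.
for doc in products.find({"price": {"$lt": 50}}):
    print(doc["name"])
```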
It is widely used for storing and managing structured data, making it an essential tool for data engineers. MongoDB: MongoDB is a NoSQL database that stores data in flexible, JSON-like documents. Apache Spark: Apache Spark is a powerful data processing framework that efficiently handles Big Data.
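To show what that looks like in practice, here is a minimal PySpark sketch; the input path and column names are assumptions for illustration.

```python
# Minimal PySpark sketch: read semi-structured JSON and aggregate it
# with Spark's distributed DataFrame API.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-rollup").getOrCreate()

df = spark.read.json("s3://my-bucket/events/*.json")  # hypothetical path

# Count events per day; the work is distributed across the cluster.
daily = (
    df.groupBy(F.to_date("timestamp").alias("day"))
      .agg(F.count("*").alias("events"))
      .orderBy("day")
)
daily.show()
```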