It offers full BI-Stack Automation, from source to data warehouse through to frontend. It supports a holistic data model, allowing for rapid prototyping of various models. It also supports a wide range of data warehouses, analytical databases, data lakes, frontends, and pipelines/ETL.
Their role is crucial in understanding the underlying data structures and how to leverage them for insights. Key Skills: Proficiency in SQL is essential, along with experience in data visualization tools such as Tableau or Power BI. Programming Questions: Data science roles typically require knowledge of Python, SQL, R, or Hadoop.
So why use IaC for cloud data infrastructures? It ensures that the data models and queries developed by data professionals are consistent with the underlying infrastructure. Enhanced Security and Compliance: Data warehouses often store sensitive information, making security a paramount concern.
Top 10 Professions in Data Science: Below, we provide a list of the top data science careers along with their corresponding salary ranges: 1. Data Scientist: Data scientists are responsible for designing and implementing data models, analyzing and interpreting data, and communicating insights to stakeholders.
Here are a few of the things that you might do as an AI Engineer at TigerEye: - Design, develop, and validate statistical models to explain past behavior and to predict future behavior of our customers’ sales teams - Own training, integration, deployment, versioning, and monitoring of ML components - Improve TigerEye’s existing metrics collection and (..)
Though both are great to learn, what gets left out of the conversation is a simple yet powerful programming language that everyone in the data science world can agree on: SQL. But why is SQL, or Structured Query Language, so important to learn? Let's start with the first clause often learned by new SQL users: the WHERE clause.
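As a minimal illustration of the WHERE clause, the sketch below runs a filtered query against an in-memory SQLite table from Python; the table name and values are invented for the example.

```python
import sqlite3

# In-memory database with a small, made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120.0), ("West", 75.5), ("East", 43.25)],
)

# The WHERE clause filters rows before they are returned:
# only rows whose region column equals 'East' survive.
rows = conn.execute(
    "SELECT region, amount FROM sales WHERE region = ?", ("East",)
).fetchall()
print(rows)  # [('East', 120.0), ('East', 43.25)]
```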
Data engineering tools are software applications or frameworks specifically designed to facilitate the process of managing, processing, and transforming large volumes of data. Tools such as Apache Airflow allow data engineers to define and manage complex workflows as directed acyclic graphs (DAGs).
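To make the DAG idea concrete without committing to any one orchestrator, here is a sketch using Python's standard-library graphlib to topologically order some hypothetical pipeline steps; the step names are invented.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps and their upstream dependencies:
# each key must wait for every step in its set to finish first.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "load_warehouse": {"clean"},
    "build_report": {"load_warehouse"},
}

# static_order() yields one valid execution order and raises
# CycleError if the graph is not actually acyclic.
print(list(TopologicalSorter(dag).static_order()))
# ['extract', 'clean', 'load_warehouse', 'build_report']
```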
Summary: Choosing the right ETL tool is crucial for seamless data integration and smooth data management. Top contenders like Apache Airflow and AWS Glue offer unique features, empowering businesses with efficient workflows, high data quality, and informed decision-making capabilities.
Sigma Computing , a cloud-based analytics platform, helps data analysts and business professionals maximize their data with collaborative and scalable analytics. One of Sigma’s key features is its support for custom SQL queries and CSV file uploads. These tools allow users to handle more advanced data tasks and analyses.
Data exploration and model development were conducted using well-known machine learning (ML) tools such as Jupyter or Apache Zeppelin notebooks. Apache Hive was used to provide a tabular interface to data stored in HDFS and to integrate with Apache Spark SQL. HBase was employed to offer real-time key-based access to data.
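A rough sketch of that tabular access pattern from PySpark, assuming a local Spark installation with Hive support; the 'events' table and its columns are placeholders for tables already registered in the Hive metastore.

```python
from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark SQL read tables registered
# in the Hive metastore rather than only in-session views.
spark = (
    SparkSession.builder
    .appName("hive-tabular-access")
    .enableHiveSupport()
    .getOrCreate()
)

# Query a Hive-backed table exactly as you would any SQL table.
df = spark.sql(
    "SELECT event_type, COUNT(*) AS n FROM events GROUP BY event_type"
)
df.show()
```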
In this blog, we explore best practices and techniques to optimize Snowflake's performance for data vault modeling, enabling your organization to achieve efficient data processing, accelerated query performance, and streamlined ETL workflows. This can make it nearly impossible to "handwrite" these SQL queries.
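One reason these queries get generated rather than handwritten is that data vault hash keys are derived mechanically from business keys. Below is a minimal Python sketch of that pattern, assuming the common Data Vault 2.0 convention of hashing normalized, delimited business keys; the column values are hypothetical.

```python
import hashlib

def hash_key(*business_keys: str, delimiter: str = "||") -> str:
    """Derive a deterministic hub hash key from one or more business keys."""
    normalized = delimiter.join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# Hypothetical customer hub: the same inputs always map to the same key,
# which is what lets generators emit these expressions instead of humans.
print(hash_key("C-1001", "us-east"))
```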
However, to fully harness the potential of a data lake, effective data modeling methodologies and processes are crucial. Data modeling plays a pivotal role in defining the structure, relationships, and semantics of data within a data lake. It also ensures consistency of data throughout the data lake.
Introduction: The Customer Data Modeling Dilemma. You know, that thing we’ve been doing for years, trying to capture the essence of our customers in neat little profile boxes? For years, we’ve been obsessed with creating these grand, top-down customer data models. Yeah, that one.
This article is an excerpt from the book Expert Data Modeling with Power BI, Third Edition by Soheil Bakhshi, a completely updated and revised edition of the bestselling guide to Power BI and data modeling. Then we have some other ETL processes to constantly land the past 5 years of data into the Datamarts.
By maintaining historical data from disparate locations, a data warehouse creates a foundation for trend analysis and strategic decision-making. Evaluate integration capabilities with existing data sources and Extract, Transform, and Load (ETL) tools. Its PostgreSQL foundation ensures compatibility with most SQL clients.
It is the process of converting raw data into relevant and practical knowledge to help evaluate the performance of businesses, discover trends, and make well-informed choices. Data gathering, data integration, data modelling, analysis of information, and data visualization are all part of business intelligence.
Data flows from the current data platform to the destination. The necessary access is granted so data flows without issue (e.g., SQL Server Agent jobs). Transformations: Transformations can be a part of data ingestion (ETL pattern) or can take place at a later stage, after data has been landed (ELT pattern).
Summary: Business Intelligence Analysts transform raw data into actionable insights. They use tools and techniques to analyse data, create reports, and support strategic decisions. Key skills include SQL, data visualization, and business acumen. Introduction: We are living in an era defined by data.
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Data Visualization: Matplotlib, Seaborn, Tableau, etc.
Summary: The fundamentals of Data Engineering encompass essential practices like data modelling, warehousing, pipelines, and integration. Understanding these concepts enables professionals to build robust systems that facilitate effective data management and insightful analysis. What is Data Engineering?
In contrast, data warehouses and relational databases adhere to the schema-on-write model, where data must be structured and conform to predefined schemas before being loaded into the database. Schema Enforcement: Data warehouses use a schema-on-write approach. This ensures data consistency and integrity.
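A toy way to see the schema-on-write idea: validate every record against a declared schema before it is ever written. This sketch is illustrative only, with an invented schema; real warehouses enforce this inside the storage engine.

```python
# Declared schema: column name -> required Python type.
SCHEMA = {"order_id": int, "customer": str, "total": float}

def validate(row: dict) -> dict:
    """Reject rows at write time if they don't conform to the schema."""
    if set(row) != set(SCHEMA):
        raise ValueError(f"Unexpected columns: {set(row) ^ set(SCHEMA)}")
    for col, expected in SCHEMA.items():
        if not isinstance(row[col], expected):
            raise ValueError(f"{col!r} must be {expected.__name__}")
    return row

validate({"order_id": 7, "customer": "Acme", "total": 99.5})    # accepted
# validate({"order_id": "7", "customer": "Acme", "total": 99.5})  # raises
```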
That said, dbt provides the ability to generate data vault models and also allows you to write your data transformations using SQL and code-reusable macros powered by Jinja2, to run your data pipelines in a clean and efficient way. The most important reason for using dbt in Data Vault 2.0
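The Jinja-powered reuse that dbt relies on can be previewed in plain Python with the jinja2 package: one template, many rendered SQL statements. The "macro", table names, and Snowflake-style QUALIFY clause below are illustrative, not dbt's own API.

```python
from jinja2 import Template  # pip install jinja2

# A reusable "macro": one template renders a deduplication query
# for any table and key column you pass in.
dedupe = Template("""
SELECT *
FROM {{ table }}
QUALIFY ROW_NUMBER() OVER (PARTITION BY {{ key }} ORDER BY loaded_at DESC) = 1
""")

print(dedupe.render(table="raw.customers", key="customer_id"))
print(dedupe.render(table="raw.orders", key="order_id"))
```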
Data Integration: Once data is collected from various sources, it needs to be integrated into a cohesive format. Data Quality Management: Ensures that the integrated data is accurate, consistent, and reliable for analysis. They are useful for big data analytics where flexibility is needed.
Understand the fundamentals of data engineering: To become an Azure Data Engineer, you must first understand the concepts and principles of data engineering. Knowledge of data modeling, warehousing, integration, pipelines, and transformation is required. Hands-on experience working with SQL DW and SQL DB.
Hierarchies align data modelling with business processes, making it easier to analyse data in a context that reflects real-world operations. Designing Hierarchies: Designing effective hierarchies requires careful consideration of the business requirements and the data model.
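To show how a modelled hierarchy supports analysis, the sketch below walks a tiny parent-child category table with a recursive CTE in SQLite; the table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (id INTEGER, parent_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, None, "All Products"), (2, 1, "Electronics"), (3, 2, "Phones")],
)

# Recursive CTE: start at the root, then repeatedly join in children,
# carrying a depth counter so each row knows its level in the hierarchy.
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM category WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, t.depth + 1
        FROM category c JOIN tree t ON c.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth
""").fetchall()
print(rows)  # [('All Products', 0), ('Electronics', 1), ('Phones', 2)]
```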
Power BI Datamarts provides a low/no-code experience directly within Power BI Service that allows developers to ingest data from disparate sources, perform ETL tasks with Power Query, and load data into a fully managed Azure SQL database. Blog: Data Modeling Fundamentals in Power BI.
Using SQL-centric transformations to model data to be deployed. dbt is also great for data lineage and documentation to empower business analysts to make informed decisions on their data. Is dbt an Ideal Fit for YOUR Organization’s Data Stack? It is a compiler and a runner. Proceed as you see fit.
In order to fully leverage this vast quantity of collected data, companies need a robust and scalable data infrastructure to manage it. This is where Fivetran and the Modern Data Stack come in. Snowflake Data Cloud Replication: Transferring data from a source system to a cloud data warehouse.
If you ask data professionals what the most challenging part of their day-to-day work is, you will likely discover their concerns around managing different aspects of data before they graduate to the data modeling stage. Uses secure protocols for data security.
Some of the common career opportunities in BI include: Entry-level roles Data analyst: A data analyst is responsible for collecting and analyzing data, creating reports, and presenting insights to stakeholders. They may also be involved in datamodeling and database design.
Apache Airflow: Airflow is an open-source ETL tool that is very useful when paired with Snowflake. dbt offers a SQL-first transformation workflow that lets teams build data transformation pipelines while following software engineering best practices like CI/CD, modularity, and documentation.
We document these custom models in Alation Data Catalog and publish common queries that other teams can use for operational use cases or reporting needs. Contact title mappings, which are built into some of our data models, are documented within our data catalog. Jason: How do you use these models?
The right data architecture can help your organization improve data quality because it provides the framework that determines how data is collected, transported, stored, secured, used and shared for business intelligence and data science use cases. Perform data quality monitoring based on pre-configured rules.
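As a sketch of that rule-based monitoring, the snippet below applies a few pre-configured checks to incoming rows and reports violations; the rules and the sample data are invented.

```python
# Pre-configured rules: (rule name, predicate a valid row must satisfy).
RULES = [
    ("email_present", lambda r: bool(r.get("email"))),
    ("age_in_range", lambda r: 0 <= r.get("age", -1) <= 120),
]

rows = [
    {"email": "a@example.com", "age": 34},
    {"email": "", "age": 200},
]

# Evaluate every rule against every row and collect the failures.
for i, row in enumerate(rows):
    failed = [name for name, check in RULES if not check(row)]
    if failed:
        print(f"row {i} failed: {failed}")
# row 1 failed: ['email_present', 'age_in_range']
```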
Data professionals such as data scientists want to use the power of Apache Spark, Hive, and Presto running on Amazon EMR for fast data preparation; however, the learning curve is steep. Solution overview: We demonstrate this solution with an end-to-end use case using a sample dataset, the TPC data model.
Knowledge of Core Data Engineering Concepts: Ensure you possess a strong foundation in core data engineering concepts, which include data structures, algorithms, database management systems, data modeling, data warehousing, ETL (Extract, Transform, Load) processes, and distributed computing frameworks (e.g.,
Power BI Dataflows provide vital functionalities that effectively empower users to cleanse and reshape data from various sources. These Dataflows are crucial in fostering consistency and reducing the duplication of repetitive ETL (Extract, Transform, Load) steps, achieved by reusing transformations.
These tools enable effective data structuring, transformation, and analysis, supporting best practices for dimensional modelling and ensuring high-quality, consistent business metrics. These tools help streamline the design process and ensure consistency.
With structured data, you can use query languages like SQL to extract and interpret information. In contrast, such traditional query languages struggle to interpret unstructured data. is similar to the traditional Extract, Transform, Load (ETL) process.
Comparison with Traditional Relational Databases: Traditional relational databases (RDBMS) like MySQL or PostgreSQL store data in structured tables with predefined schemas. In contrast, MongoDB’s document-based model allows for a more flexible and scalable approach. MongoDB is a NoSQL database that uses a document-oriented data model.
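To see the contrast in code, here is a pymongo sketch: the nested address and tag list below would typically need extra tables and joins in an RDBMS, but fit in a single document here. It assumes a MongoDB server on localhost, and the database, collection, and field names are invented.

```python
from pymongo import MongoClient  # pip install pymongo

# Assumes a MongoDB instance running locally on the default port.
client = MongoClient("mongodb://localhost:27017")
customers = client["shop"]["customers"]

# One document holds nested structure that a relational schema
# would normally split across multiple tables.
customers.insert_one({
    "name": "Acme Corp",
    "address": {"city": "Austin", "state": "TX"},
    "tags": ["wholesale", "priority"],
})

# Query directly on a nested field using dot notation.
print(customers.find_one({"address.city": "Austin"}))
```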
An example directed acyclic graph (DAG) might automate data ingestion, processing, model training, and deployment tasks, ensuring that each step is run in the correct order and at the right time. Though it’s worth mentioning that Airflow isn’t used at runtime as is usual for extract, transform, and load (ETL) tasks.
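A minimal Airflow sketch of such a DAG, assuming Airflow 2.4+ is installed; the task bodies are placeholders for real ingestion, processing, training, and deployment logic.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    print("ingest data")

def process():
    print("process data")

def train():
    print("train model")

def deploy():
    print("deploy model")

# Airflow parses this file to register the DAG; the heavy lifting
# happens inside each task when the scheduler runs it.
with DAG(
    dag_id="ml_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # argument name used by Airflow 2.4+
    catchup=False,
) as dag:
    ingest_t = PythonOperator(task_id="ingest", python_callable=ingest)
    process_t = PythonOperator(task_id="process", python_callable=process)
    train_t = PythonOperator(task_id="train", python_callable=train)
    deploy_t = PythonOperator(task_id="deploy", python_callable=deploy)

    # Dependency arrows enforce ingest -> process -> train -> deploy.
    ingest_t >> process_t >> train_t >> deploy_t
```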
Variant columns can be used to store data that doesn’t fit neatly into traditional columns, such as nested data structures, arrays, or key-value pairs. Using variant columns in data vault satellites in Snowflake can provide several benefits. If data is present, a Task runs SQL to push it to the raw data vault objects.
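A hedged sketch of the VARIANT pattern, issuing Snowflake SQL through the snowflake-connector-python package; the connection parameters, satellite table, and payload fields are all placeholders.

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder credentials; substitute your own account details.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    database="DV", schema="RAW",
)
cur = conn.cursor()

# A satellite with a VARIANT column can absorb nested payloads
# without a schema change when the source adds new attributes.
cur.execute("CREATE TABLE IF NOT EXISTS sat_customer (hk STRING, payload VARIANT)")
cur.execute("""
    INSERT INTO sat_customer
    SELECT 'abc123', PARSE_JSON('{"tier": "gold", "prefs": {"email": true}}')
""")

# Colon paths query into the semi-structured payload directly.
cur.execute("SELECT payload:prefs:email FROM sat_customer")
print(cur.fetchall())
```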
Slow Response to New Information: Legacy data systems often lack the computational power necessary to run efficiently and can be cost-inefficient to scale. This typically results in long-running ETL pipelines that cause decisions to be made on stale data.
Summary: Data engineering tools streamline data collection, storage, and processing. Tools like Python, SQL, Apache Spark, and Snowflake help engineers automate workflows and improve efficiency. Learning these tools is crucial for building scalable data pipelines.