Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines: a game-changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. They transform raw data into a consistent format for users to consume.
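As a hedged illustration of what CI/CD for data pipelines buys you, here is the kind of lightweight check a CI job might run with pytest before a deployment; the transform_records function and field names are hypothetical, not from any specific product:

```python
def transform_records(rows):
    """Normalize raw records into a consistent schema (lowercase keys)."""
    return [{k.lower(): v for k, v in row.items()} for row in rows]

def test_transform_produces_consistent_keys():
    # raw records arrive with inconsistent casing from different sources
    raw = [{"ID": 1, "Name": "a"}, {"id": 2, "name": "b"}]
    out = transform_records(raw)
    # every record should expose the same lowercase schema after transformation
    assert all(set(r) == {"id", "name"} for r in out)
```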
Data engineering tools are software applications or frameworks specifically designed to facilitate the process of managing, processing, and transforming large volumes of data. Essential data engineering tools for 2023: the top 10 data engineering tools to watch out for in 2023.
Data engineering is a crucial field that plays a vital role in the data pipeline of any organization. It is the process of collecting, storing, managing, and analyzing large amounts of data, and data engineers are responsible for designing and implementing the systems and infrastructure that make this possible.
Summary: The fundamentals of Data Engineering encompass essential practices like data modelling, warehousing, pipelines, and integration. Understanding these concepts enables professionals to build robust systems that facilitate effective data management and insightful analysis. What is Data Engineering?
Unfolding the difference between data engineer, data scientist, and data analyst: data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. Read on to learn more.
Welcome to Beyond the Data, a series that investigates the people behind the talent of phData. This edition features a Senior Data Engineer at phData. As a Senior Data Engineer, I wear many hats. On the technical side, I clean and organize data, design storage solutions, and build transformation pipelines.
Enrich data engineering skills by building problem-solving ability with real-world projects, teaming with peers, participating in coding challenges, and more. Globally, organizations are hiring data engineers to extract, process, and analyze the information available in vast volumes of data sets.
Introduction: The Customer Data Modeling Dilemma. You know, that thing we’ve been doing for years, trying to capture the essence of our customers in neat little profile boxes? Yeah, that one. For years, we’ve been obsessed with creating these grand, top-down customer data models.
Data engineering is a fascinating and fulfilling career – you are at the helm of every business operation that requires data, and as long as users generate data, businesses will always need data engineers. The journey to becoming a successful data engineer […].
This is where Fivetran and the Modern Data Stack come in. Fivetran is a fully automated, zero-maintenance data pipeline tool that automates the ETL process from data sources to your cloud warehouse. Snowflake Data Cloud replication transfers data from a source system to a cloud data warehouse.
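As a rough sketch of what that automation looks like operationally, the snippet below triggers a connector sync through Fivetran's REST API; the connector ID and credentials are placeholders, and the endpoint is an assumption to verify against Fivetran's current API documentation:

```python
import requests

# Placeholders: substitute your own Fivetran API key/secret and connector ID.
API_KEY, API_SECRET = "key", "secret"
CONNECTOR_ID = "my_connector_id"

# Assumed endpoint: POST /v1/connectors/{id}/sync kicks off a sync run.
resp = requests.post(
    f"https://api.fivetran.com/v1/connectors/{CONNECTOR_ID}/sync",
    auth=(API_KEY, API_SECRET),  # Fivetran uses HTTP basic auth
)
resp.raise_for_status()
print(resp.json())  # sync request acknowledgement
```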
Alignment with other tools in the organization’s tech stack: consider how well the MLOps tool integrates with your existing tools and workflows, such as data sources, data engineering platforms, code repositories, CI/CD pipelines, monitoring systems, etc. (for example, neptune.ai). Can the tool render audio/video?
Elementl / Dagster Labs: Elementl, now known as Dagster Labs, is the company behind Dagster, a platform for building and managing data pipelines aimed at both data engineers and data scientists. ArangoDB is designed to be scalable, reliable, and easy to use.
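For a flavor of what a Dagster pipeline looks like, here is a minimal sketch using software-defined assets; the asset names and data are illustrative, not from Dagster's own examples:

```python
from dagster import asset, materialize

@asset
def raw_orders():
    """Pretend extract step; in practice this would pull from a source system."""
    return [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}]

@asset
def order_totals(raw_orders):
    """Transform step: depends on raw_orders via its parameter name."""
    return sum(o["amount"] for o in raw_orders)

if __name__ == "__main__":
    # materialize both assets in dependency order
    result = materialize([raw_orders, order_totals])
    print(result.success)
```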
By analyzing datasets, data scientists can better understand their potential use in an algorithm or machine learning model. The data science lifecycle: data science is iterative, meaning data scientists form hypotheses and experiment to see if a desired outcome can be achieved using available data.
Many mistakenly equate tabular data with business intelligence rather than AI, leading to a dismissive attitude toward its sophistication. Standard data science practices could also be contributing to this issue. Making data engineering more systematic through principles and tools will be key to making AI algorithms work.
Managing data pipelines efficiently is paramount for any organization. The Snowflake Data Cloud has introduced a groundbreaking feature that promises to simplify and supercharge this process: Snowflake Dynamic Tables. Flexibility: dynamic tables allow batch and streaming pipelines to be specified in the same way.
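A minimal sketch of what defining a dynamic table might look like from Python, assuming the snowflake-connector-python package; the account, credentials, warehouse, and table names are placeholders:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholders: substitute your own Snowflake account and credentials.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...", warehouse="MY_WH"
)

# A dynamic table declares its freshness target (TARGET_LAG); Snowflake
# handles the incremental refresh, whether the source is batch or streaming.
conn.cursor().execute("""
    CREATE OR REPLACE DYNAMIC TABLE order_totals
      TARGET_LAG = '1 minute'
      WAREHOUSE = MY_WH
      AS SELECT customer_id, SUM(amount) AS total
         FROM raw_orders
         GROUP BY customer_id
""")
```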
DagsHub is a centralized platform to host and manage machine learning projects, including code, data, models, experiments, annotations, model registry, and more! Intermediate Data Pipeline: build data pipelines using DVC for automation and versioning of open-source machine learning projects.
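As a small hedged example of that DVC-backed versioning, the snippet below reads one revision of a tracked dataset through DVC's Python API; the repo URL, file path, and tag are placeholders:

```python
import dvc.api  # pip install dvc

# Open a DVC-tracked file at a specific revision (tag, branch, or commit).
# All three arguments below are placeholders for your own project.
with dvc.api.open(
    "data/train.csv",
    repo="https://github.com/example/ml-project",
    rev="v1.0",
) as f:
    header = f.readline()
    print(header)  # inspect the versioned dataset's header row
```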
That said, dbt provides the ability to generate data vault models and also allows you to write your data transformations using SQL and code-reusable macros powered by Jinja2 to run your data pipelines in a clean and efficient way. The most important reason for using dbt in Data Vault 2.0 […].
Model Your Data Appropriately: once you have chosen the method to connect to your data (Import, DirectQuery, Composite), you will need to make sure that you create an efficient and optimized data model. Here are some of our best practices for building data models in Power BI to optimize your Snowflake experience.
Integration: Airflow integrates seamlessly with other data engineering and data science tools like Apache Spark and Pandas. IBM InfoSphere DataStage: an enterprise-level ETL tool that enables users to design, develop, and run data pipelines. Read further: Azure Data Engineer jobs.
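Here is a minimal Airflow DAG sketch showing that integration style; the dag_id, schedule, and task bodies are illustrative, and Airflow 2.x is assumed:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull data from the source system")  # placeholder task body

def transform():
    print("clean and reshape, e.g. with Pandas")  # placeholder task body

# Minimal daily DAG; `schedule` is the Airflow 2.4+ spelling of the parameter.
with DAG(dag_id="example_etl", start_date=datetime(2023, 1, 1),
         schedule="@daily", catchup=False) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2  # run extract before transform
```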
It includes processes that trace and document the origin of data, models, and associated metadata, as well as pipelines for audits. A data store lets a business connect existing data with new data and discover new insights with real-time analytics and business intelligence.
The first generation of data architectures, represented by enterprise data warehouse and business intelligence platforms, was characterized by thousands of ETL jobs, tables, and reports that only a small group of specialized data engineers understood, resulting in an under-realized positive impact on the business.
Data Modeling: dbt has gradually emerged as a powerful tool that largely simplifies the process of building and handling data pipelines. dbt is an open-source command-line tool that allows data engineers to transform, test, and document data in one single hub, following the best practices of software engineering.
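As a hedged sketch, dbt can also be driven programmatically rather than from the shell; the snippet below assumes dbt-core 1.5+ (which introduced dbtRunner) and a dbt project in the working directory:

```python
from dbt.cli.main import dbtRunner  # assumes dbt-core >= 1.5

# Run the project's models, then its tests, reporting success for each step.
runner = dbtRunner()
for args in (["run"], ["test"]):
    result = runner.invoke(args)
    print(args[0], "succeeded" if result.success else "failed")
```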
Tools such as those mentioned are critical for anyone interested in becoming a machine learning engineer. Data Engineer: data engineers are the authors of the infrastructure that stores, processes, and manages the large volumes of data an organization has.
With proper unstructured data management, you can write validation checks to detect multiple entries of the same data. Continuous learning: in a properly managed unstructured data pipeline, you can use new entries to train a production ML model, keeping the model up to date.
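A minimal sketch of such a validation check, using a content hash to flag re-ingested duplicates; the fingerprinting scheme is illustrative, not from any specific tool:

```python
import hashlib

def record_fingerprint(blob: bytes) -> str:
    """Content hash used to detect re-ingested duplicates."""
    return hashlib.sha256(blob).hexdigest()

seen: set[str] = set()

def is_duplicate(blob: bytes) -> bool:
    """Return True if this exact content has been ingested before."""
    fp = record_fingerprint(blob)
    if fp in seen:
        return True
    seen.add(fp)
    return False

# Two identical documents: the second one is flagged as a duplicate.
assert not is_duplicate(b"contract scan #1")
assert is_duplicate(b"contract scan #1")
```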
Data engineers, data scientists, and other data professionals have been racing to implement gen AI into their engineering efforts. Data Pipeline: manages and processes various data sources. ML Pipeline: focuses on training, validation, and deployment. LLMOps is MLOps for LLMs.
Thus, the solution allows for scaling data workloads independently from one another and seamlessly handling data warehousing, data lakes, data sharing, and engineering. Therefore, you’ll be empowered to truncate and reprocess data if bugs are detected and provide an excellent raw data source for data scientists.
By normalising dimension tables, the schema ensures data is stored efficiently, eliminating duplicate entries and minimising the need for repetitive data storage. This normalisation helps conserve storage space and maintain a cleaner data model. Explore more: Build Data Pipelines: A Comprehensive Step-by-Step Guide.
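To make the idea concrete, here is a small pandas sketch that splits a repeated product attribute out of a denormalized fact table into its own dimension; the column names and data are made up:

```python
import pandas as pd

# Denormalized fact rows repeating a dimension attribute (product_name).
sales = pd.DataFrame({
    "order_id": [1, 2, 3],
    "product_id": [10, 10, 20],
    "product_name": ["Widget", "Widget", "Gadget"],
    "amount": [9.99, 9.99, 24.50],
})

# Split out a product dimension so each product is stored exactly once...
dim_product = sales[["product_id", "product_name"]].drop_duplicates()
# ...and keep only the foreign key on the fact table.
fact_sales = sales[["order_id", "product_id", "amount"]]
print(dim_product)
print(fact_sales)
```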
The reason is that most teams do not have access to a robust data ecosystem for ML development. Billions are lost by Fortune 500 companies because of broken data pipelines and communications. Standards for publishing data, and for governing that data, are either missing or far from ideal.
It simplifies feature access for model training and inference, significantly reducing the time and complexity involved in managing data pipelines. Additionally, Feast promotes feature reuse, so the time spent on data preparation is greatly reduced. Saurabh Gupta is a Principal Engineer at Zeta Global.
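A hedged sketch of that feature-access pattern with Feast's Python API; the repo path, entity dataframe, and feature names below are placeholders:

```python
import pandas as pd
from feast import FeatureStore  # assumes a configured Feast feature repo

# Point at the feature repository (placeholder path).
store = FeatureStore(repo_path=".")

# Entities and timestamps for which we want point-in-time-correct features.
entity_df = pd.DataFrame({
    "user_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2023-01-01", "2023-01-02"]),
})

# Feature references ("view:feature") are hypothetical examples.
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["user_stats:purchase_count", "user_stats:avg_order_value"],
).to_df()
print(training_df.head())
```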
Game changer: ChatGPT in Software Engineering: A Glimpse Into the Future | HackerNoon
Generative AI for DevOps: A Practical View | DZone
ChatGPT for DevOps: Best Practices, Use Cases, and Warnings
GPT-4 Data Pipelines: Transform JSON to SQL Schema Instantly
Blockstream’s public Bitcoin API
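Riffing on the "Transform JSON to SQL Schema" idea in the list above, here is a naive, self-contained sketch of inferring a CREATE TABLE statement from a single JSON record; the type mapping is deliberately simplistic and illustrative:

```python
import json

# Map Python types (from parsed JSON) to generic SQL column types.
TYPE_MAP = {bool: "BOOLEAN", int: "BIGINT", float: "DOUBLE PRECISION", str: "TEXT"}

def json_to_ddl(table: str, record: dict) -> str:
    """Derive a CREATE TABLE statement from one representative JSON record."""
    cols = ", ".join(
        f"{k} {TYPE_MAP.get(type(v), 'TEXT')}" for k, v in record.items()
    )
    return f"CREATE TABLE {table} ({cols});"

sample = json.loads('{"id": 1, "name": "widget", "price": 9.99, "active": true}')
print(json_to_ddl("products", sample))
# CREATE TABLE products (id BIGINT, name TEXT, price DOUBLE PRECISION, active BOOLEAN);
```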
Team composition: the team comprises domain experts, data engineers, data scientists, and ML engineers. Team composition: the team comprises data pipeline engineers, ML engineers, full-stack engineers, and data scientists.
Enter dbt: dbt provides SQL-centric transformations for your data modeling, which is efficient for scrubbing and transforming your data while being an easy skill set to hire for and develop within your teams. It should also enable easy sharing of insights across the organization.
I suggest building out a RACI framework that assigns core activities across these key roles: (1) Data Owner; (2) Business Data Steward; (3) Technical (IT) Data Steward; (4) Enterprise Data Steward; (5) Data Engineer; and (6) Data Consumer. Communication is essential. This is a very good thing.