Blog, Data Governance and Data Lakes

Data lakes vs. data warehouses: Decoding the data storage debate

Data Science Dojo

JANUARY 12, 2023

When it comes to storing data, there are two main options: data lakes and data warehouses. Which one is right for your business? What is a data lake? A data lake stores an enormous amount of raw data in its original format until it is required for analytics applications.

Data Lakes, Data Warehouse, Hadoop, Machine Learning

CI/CD for Data Pipelines: A Game-Changer with AnalyticsCreator

Data Science Blog

MAY 20, 2024

AnalyticsCreator offers full BI-stack automation, from source to data warehouse through to frontend. It supports a holistic data model, allowing for rapid prototyping of various models. It also supports a wide range of data warehouses, analytical databases, data lakes, frontends, and pipelines/ETL. […] pipelines, Azure Databricks.

Data Pipeline, Data Warehouse, Azure, Data Lakes

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Trending Sources

Evaluating Data Lakes vs. Data Warehouses

Dataversity

MARCH 21, 2022

While data lakes and data warehouses are both important Data Management tools, they serve very different purposes. If you’re trying to determine whether you need a data lake, a data warehouse, or possibly even both, you’ll want to understand the functionality of each tool and their differences.

Data Lakes, Data Warehouse, Data Governance, Data Quality

Thank You Snowflake for Naming Alation the Data Governance Partner of the Year

Alation

JUNE 17, 2021

Leading companies like Cisco, Nielsen, and Finnair turn to Alation + Snowflake for data governance and analytics. By joining forces, we can build more potent, tailored solutions that leverage data governance as a competitive asset. Lastly, active data governance simplifies stewardship tasks of all kinds.

Data Governance, Data Lakes, Cloud Data, Data Quality

How to modernize data lakes with a data lakehouse architecture

IBM Journey to AI blog

JULY 5, 2023

Data lakes have been around for well over a decade now, supporting the analytic operations of some of the world's largest corporations. Such data volumes are not easy to move, migrate or modernize. The challenges of a monolithic data lake architecture: Data lakes are, at a high level, single repositories of data at scale.

Data Lakes, Data Warehouse, Data Governance, Analytics

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

AWS Machine Learning Blog

AUGUST 21, 2024

Amazon DataZone is a data management service that makes it quick and convenient to catalog, discover, share, and govern data stored in AWS, on-premises, and third-party sources. The data lake environment is required to configure an AWS Glue database table, which is used to publish an asset in the Amazon DataZone catalog.

Machine Learning, Data Governance, ML

A Bridge Between Data Lakes and Data Warehouses

Dataversity

JANUARY 28, 2021

It has been ten years since Pentaho Chief Technology Officer James Dixon coined the term “data lake.” While data warehouse (DWH) systems have had longer existence and recognition, the data industry has embraced the more […]

Data Lakes, Data Warehouse, Data Quality, Data Governance

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lakes, Data Modeling, Data Models, Data Warehouse

Data Lakes Vs. Data Warehouse: Its significance and relevance in the data world

Pickl AI

NOVEMBER 15, 2023

Discover the nuanced dissimilarities between Data Lakes and Data Warehouses. Data management in the digital age has become a crucial aspect of businesses, and two prominent concepts in this realm are Data Lakes and Data Warehouses. It acts as a repository for storing all the data.

Data Lakes, Data Warehouse, Database, ETL

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

Data governance challenges: Maintaining consistent data governance across different systems is crucial but complex. Amazon AppFlow was used to facilitate the smooth and secure transfer of data from various sources into ODAP. The following diagram shows a basic layout of how the solution works.

AWS, Data Governance, Data Silos, SQL

How to Leverage Machine Learning to Identify Data Errors in a Data Lake

Dataversity

MAY 26, 2022

A data lake becomes a data swamp in the absence of comprehensive data quality validation and does not offer a clear link to value creation. Organizations are rapidly adopting the cloud data lake as the data lake of choice, and the need for validating data in real time has become critical.

Data Lakes, Machine Learning, Data Quality

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Alation

FEBRUARY 20, 2020

For many enterprises, a hybrid cloud data lake is no longer a trend, but is becoming a reality. With an on-premises deployment, enterprises have full control over data security, data access, and data governance. Data that needs to be tightly controlled (e.g. […] The Problem with Hybrid Cloud Environments.

Data Lakes, Cloud Data, AWS, Tableau

How Data Governance Supports Analytics

Alation

JANUARY 27, 2022

People might not understand the data, the data they chose might not be ideal for their application, or there might be better, more current, or more accurate data available. An effective data governance program ensures data consistency and trustworthiness. It can also help prevent data misuse.

Data Governance, Analytics, Data Quality

Introducing Agile Data Governance – Alation TrustCheck

Alation

FEBRUARY 20, 2020

The rise of data lakes, IoT analytics, and big data pipelines has introduced a new world of fast, big data. How Data Catalogs Can Help. Data catalogs evolved as a key component of the data governance revolution by creating a bridge between the new world and old world of data governance.

Data Governance, Tableau, Analytics

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 20, 2023

Data and governance foundations – This function uses a data mesh architecture for setting up and operating the data lake, central feature store, and data governance foundations to enable fine-grained data access.

ML, AWS, Data Lakes

Data Governance for Dummies: Your Questions, Answered

Alation

FEBRUARY 17, 2023

This past week, I had the pleasure of hosting Data Governance for Dummies author Jonathan Reichental for a fireside chat, along with Denise Swanson, Data Governance lead at Alation. Can you have proper data management without establishing a formal data governance program?

Data Governance, Data Quality, Data Analyst, Data Pipeline

5 Ways Data Engineers Can Support Data Governance

Alation

JANUARY 26, 2023

These data requirements could be satisfied with a strong data governance strategy. Governance can — and should — be the responsibility of every data user, though how that’s achieved will depend on the role within the organization. How can data engineers address these challenges directly?

Data Governance, Data Engineering, Data Engineer

Scaling Data Access Governance

Dataversity

OCTOBER 4, 2022

The rise of data lakes and adjacent patterns such as the data lakehouse has given data teams increased agility and the ability to leverage major amounts of data. Constantly evolving data privacy legislation and the impact of major cybersecurity breaches have led to the call for responsible data […].

Data Lakes, Data Governance, Data Quality

The Role of the Data Catalog in Data Security

Alation

JUNE 14, 2021

And third is what factors CIOs and CISOs should consider when evaluating a catalog – especially one used for data governance. The Role of the CISO in Data Governance and Security. They want CISOs putting in place the data governance needed to actively protect data. So CISOs must protect data.

Data Governance, Data Lakes, Data Classification, Data Quality

Why Graph Databases Are an Essential Choice for Master Data Management

Dataversity

APRIL 23, 2021

Within the Data Management industry, it’s becoming clear that the old model of rounding up massive amounts of data, dumping it into a data lake, and building an API to extract needed information isn’t working. Click to learn more about author Brian Platz.

Database, Data Lakes, Data Silos, Data Governance

What is the Snowflake Data Cloud and How Much Does it Cost?

phData

NOVEMBER 9, 2023

This blog was originally written by Keith Smith and updated for 2024 by Justin Delisi. Snowflake’s Data Cloud has emerged as a leader in cloud data warehousing. As they grow in both their complexity and data production/consumption, a data governance strategy needs to be designed as part of your information architecture.

Data Warehouse, Data Lakes, Clustering, Cloud Data

Why Easier Governance Is Superior Governance

Alation

FEBRUARY 1, 2022

A new research report by Ventana Research, Embracing Modern Data Governance, shows that modern data governance programs can drive a significantly higher ROI in a much shorter time span. Historically, data governance has been a manual and restrictive process, making it almost impossible for these programs to succeed.

Data Lakes, Data Governance, ML

Why Spreadsheets Are Your Secret Weapon for Efficient Data Governance

Alation

APRIL 6, 2023

Data governance is traditionally applied to structured data assets that are most often found in databases and information systems. This blog focuses on governing spreadsheets that contain data, information, and metadata, and must themselves be governed.

Data Governance, Database, Data Lakes, Data Warehouse

Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

AWS Machine Learning Blog

JUNE 5, 2023

Many teams are turning to Athena to enable interactive querying and analyze their data in the respective data stores without creating multiple data copies. Athena allows applications to use standard SQL to query massive amounts of data on an S3 data lake. Create a data lake with Lake Formation.

Machine Learning, AWS, Data Lakes

The disruptive potential of open data lakehouse architectures and IBM watsonx.data

IBM Journey to AI blog

JUNE 15, 2023

It comprises commodity cloud object storage, open data and open table formats, and high-performance open-source query engines. To help organizations scale AI workloads, we recently announced IBM watsonx.data, a data store built on an open data lakehouse architecture and part of the watsonx AI and data platform.

Data Warehouse, Data Lakes, Cloud Data, Analytics

Data Mesh vs. Data Fabric: A Love Story

Alation

JANUARY 13, 2022

Spoiler alert: data fabric and data mesh are independent design concepts that are, in fact, quite complementary. 1. Thoughtworks says data mesh is key to moving beyond a monolithic data lake. 2. Gartner on Data Fabric.

Data Lakes, Data Governance, Data Quality, Data Warehouse

Data democratization: How data architecture can drive business decisions and AI initiatives

IBM Journey to AI blog

AUGUST 4, 2023

Data democratization instead refers to the simplification of all processes related to data, from storage architecture to data management to data security. It also requires an organization-wide data governance approach, from adopting new types of employee training to creating new policies for data storage.

Data Lakes, AI, Data Governance

An Introduction to Metadata Management

Dataversity

DECEMBER 16, 2020

According to IDC, the size of the global datasphere is projected to reach 163 ZB by 2025, leading to the disparate data sources in legacy systems, new system deployments, and the creation of data lakes and data warehouses. Most organizations do not utilize the entirety of the data […].

Data Warehouse, Data Lakes, Data Profiling, Data Quality

Alation Announces 2021.4 Release: Interview on Column-Level Lineage with Jason Ma, Senior Director of Product Management

Alation

NOVEMBER 18, 2021

External Tables Create a Shared View of the Data Lake. We’ve seen external tables become popular with our customers, who use them to provide a normalized relational schema on top of their data lake. Essentially, external tables create a shared view of the data lake, a single pane of glass everyone can reference.

Data Lakes, Data Governance, SQL, AWS

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

AUGUST 21, 2023

You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, purification, exploration, visualization, and processing at scale) within a single visual interface.

AWS, Data Lakes, Clustering, Data Preparation

Taking the Chill Out of Selecting the Appropriate Iceberg Data Catalog

Dataversity

JULY 25, 2024

Over the past few years, the industry has increasingly recognized the need to adopt a data lakehouse architecture because of the inherent benefits. This approach improves data infrastructure costs and reduces time-to-insight by consolidating more data workloads into a single source of truth on the organization’s data lake.

Data Lakes, Data Governance, Data Quality

How data stores and governance impact your AI initiatives

IBM Journey to AI blog

OCTOBER 12, 2023

Accounting for the complexities of the AI lifecycle: Unfortunately, typical data storage and data governance tools fall short in the AI arena when it comes to helping an organization perform the tasks that underlie efficient and responsible AI lifecycle management. And that makes sense.

AI, Data Scientist, Data Governance

Data Mesh or Data Mess?

Dataversity

SEPTEMBER 12, 2022

The ways in which we store and manage data have grown exponentially over recent years – and continue to evolve into new paradigms. For much of IT history, though, enterprise data architecture has existed as monolithic, centralized “data lakes.”

Data Lakes, Data Quality, Data Governance, Cloud Data

The First Pillar of Data Culture: Data Search & Discovery

Alation

JUNE 9, 2021

According to Gartner, data culture is a top priority for chief data officers (CDOs) and chief data & analytics officers (CDAOs). This post focuses on the role of data search & discovery within a data culture. The third and fourth posts take a deeper look at data literacy and data governance respectively.

Data Governance, Database, Cloud Data, Machine Learning

Data Lakehouses: The Future Of Data Migration

Dataversity

FEBRUARY 10, 2023

For many of these organizations, the path toward becoming more data-driven lies in the power of data lakehouses, which combine elements of data warehouse architecture with data lakes.

Data Lakes, Data Warehouse, Data Quality, Data Governance

What Is a Data Catalog?

Alation

FEBRUARY 13, 2020

Figure 1 illustrates the typical metadata subjects contained in a data catalog. Figure 1 – Data Catalog Metadata Subjects. Datasets are the files and tables that data workers need to find and access. They may reside in a data lake, warehouse, master data repository, or any other shared data resource.

Data Lakes, Data Analysis, Big Data

What Is Data Curation?

Alation

FEBRUARY 13, 2020

Data curation is important in today’s world of data sharing and self-service analytics, but I think it is a frequently misused term. When speaking and consulting, I often hear people refer to data in their data lakes and data warehouses as curated data, believing that it is curated because it is stored as shareable data.

Data Warehouse, Data Lakes, Data Governance, Analytics

Why Invest Now? Three Investors Share the Story Behind Alation’s Series E

Alation

NOVEMBER 2, 2022

We had not seen that in the broader intelligence & data governance market.” “At Databricks, we’re focused on enabling customers to adopt the data lakehouse, and that’s an open data architecture that combines the best of the data warehouse and the data lake into one platform,” Ferguson says. “[The

Data Governance, Data Lakes, Data Warehouse, Analytics

What is Snowflake Horizon?

phData

AUGUST 5, 2024

How can my analysts discover where data is located? All of these questions describe a concept known as data governance. The Snowflake AI Data Cloud has built an entire blanket of features called Horizon, which tackles all of these questions and more. We will begin with compliance.

Data Governance, Data Quality, Data Lakes, ML

Alation Cloud Service: Data Intelligence Just Got Simpler

Alation

APRIL 7, 2021

Business and technical users have always found Alation Data Catalog simple to use and manage. Enterprises can use the data catalog without any administrative overhead. Deliver data intelligence, as a service. Alation possesses three unique capabilities: intelligence, active data governance, and broad, deep connectivity.

Data Governance, AWS, Data Silos, Data Lakes

What Is Data Modernization? 5 Benefits Worth Knowing

Alation

APRIL 19, 2022

In that sense, data modernization is synonymous with cloud migration. Modern data architectures, like cloud data warehouses and cloud data lakes, empower more people to leverage analytics for insights more efficiently. What Is the Role of the Cloud in Data Modernization? How to Modernize Data with Alation.

Data Governance, Cloud Data, Database, Data Silos

Alation Earns 8 Top Rankings in BARC’s The Data Management Survey 23

Alation

OCTOBER 19, 2022

Alation outpaced its rivals by achieving 8 top rankings and 11 leading positions across two separate peer groups of Data Intelligence Platforms and Data Governance Products. In addition, 83 percent of surveyed users would recommend — and 90 percent are satisfied with — Alation Data Catalog.

Data Governance, Data Quality, Data Lakes, Data Observability

Where Do Data Catalogs Fit in Metadata Management?

Alation

FEBRUARY 13, 2020

In an earlier blog, I defined a data catalog as “a collection of metadata, combined with data management and search tools, that helps analysts and other data users to find the data that they need, serves as an inventory of available data, and provides information to evaluate fitness data for intended uses.”

Data Lakes, Data Governance, Data Science, Data Analyst

Data architecture strategy for data quality

IBM Journey to AI blog

JANUARY 5, 2023

The first generation of data architectures represented by enterprise data warehouse and business intelligence platforms were characterized by thousands of ETL jobs, tables, and reports that only a small group of specialized data engineers understood, resulting in an under-realized positive impact on the business.

Data Quality, Data Lakes, Data Warehouse, Big Data
