Data Science, Data Warehouse and Database

The Need for Data Warehouse and Its Alternatives

Analytics Vidhya

OCTOBER 15, 2022

This article was published as a part of the Data Science Blogathon. Introduction Data from different sources are brought to a single location and then converted into a format that the data warehouse can process and store. A boss may […].

Data Warehouse

Data Warehouse Data Science Analytics Analytics

What are Schemas in Data Warehouse Modeling?

Analytics Vidhya

JUNE 6, 2022

This article was published as a part of the Data Science Blogathon. Introduction Do you think you can derive insights from raw data? Wouldn’t the process be much easier if the raw data were more organized and clean? Here’s when Data […]. The post What are Schemas in Data Warehouse Modeling?

Data Warehouse

Data Warehouse Data Science Analytics Analytics

Data Lake or Data Warehouse- Which is Better?

Analytics Vidhya

OCTOBER 28, 2022

This article was published as a part of the Data Science Blogathon. Introduction Data is defined as information that has been organized in a meaningful way. Data collection is critical for businesses to make informed decisions, understand customers’ […]. The post Data Lake or Data Warehouse- Which is Better?

Data Warehouse

Data Warehouse Data Lakes Data Science Analytics

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

Data Warehouse for the Beginners!

Analytics Vidhya

SEPTEMBER 28, 2022

This article was published as a part of the Data Science Blogathon. Introduction The concept of data warehousing dates to the 1980s. IBM is one name that easily enters the picture whenever long history in computer science is involved. The post Data Warehouse for the Beginners!

Data Warehouse

Data Warehouse Computer Science Computer Science Data Science

Understanding Key Concepts on Data Warehouses

Analytics Vidhya

MAY 3, 2022

This article was published as a part of the Data Science Blogathon. Introduction on Data Warehouses During one of the technical webinars, it was highlighted where the transactional database was rendered no-operational bringing day to day operations to a standstill.

Data Warehouse

Data Warehouse Data Science Database Analytics

AWS Redshift: Cloud Data Warehouse Service

Analytics Vidhya

APRIL 25, 2022

This article was published as a part of the Data Science Blogathon. Introduction Amazon’s Redshift Database is a cloud-based large data warehousing solution. Companies may store petabytes of data in easy-to-access “clusters” that can be searched in parallel using the platform’s storage system.

Data Warehouse

Data Warehouse Cloud Data AWS Clustering

Beginners Guide to Data Warehouse Using Hive Query Language

Analytics Vidhya

APRIL 29, 2022

This article was published as a part of the Data Science Blogathon. Introduction Have you ever wondered how big IT giants store and process huge amounts of data? storing the data […]. storing the data […].

Data Warehouse

Data Warehouse Database Data Science Analytics

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

Data Science Blog

SEPTEMBER 19, 2023

In the contemporary age of Big Data, Data Warehouse Systems and Data Science Analytics Infrastructures have become an essential component for organizations to store, analyze, and make data-driven decisions. So why using IaC for Cloud Data Infrastructures?

Data Warehouse

Data Warehouse Azure SQL Database

A Complete Guide on Building an ETL Pipeline for Beginners

Analytics Vidhya

JUNE 13, 2022

This article was published as a part of the Data Science Blogathon. Introduction on ETL Pipeline ETL pipelines are a set of processes used to transfer data from one or more sources to a database, like a data warehouse.

ETL

ETL Data Warehouse Database Data Science

How a Delta Lake is Process with Azure Synapse Analytics

Analytics Vidhya

JULY 29, 2022

This article was published as a part of the Data Science Blogathon. The post How a Delta Lake is Process with Azure Synapse Analytics appeared first on Analytics Vidhya.

Azure

Azure Data Warehouse Data Lakes Analytics

How to Normalize Relational Databases With SQL Code?

Analytics Vidhya

FEBRUARY 27, 2023

Introduction Data is the new oil in this century. The database is the major element of a data science project. To generate actionable insights, the database must be centralized and organized efficiently. So, we are […] The post How to Normalize Relational Databases With SQL Code?

Database

Database SQL Data Science Analytics

Introduction to Partitioned hive table and PySpark

Analytics Vidhya

OCTOBER 28, 2021

This article was published as a part of the Data Science Blogathon What is the need for Hive? The official description of Hive is- ‘Apache Hive data warehouse software project built on top of Apache Hadoop for providing data query and analysis.

Apache Hadoop

Apache Hadoop Data Warehouse Hadoop SQL

Data lakes vs. data warehouses: Decoding the data storage debate

Data Science Dojo

JANUARY 12, 2023

When it comes to data, there are two main types: data lakes and data warehouses. What is a data lake? An enormous amount of raw data is stored in its original format in a data lake until it is required for analytics applications. Some NoSQL databases are also utilized as platforms for data lakes.

Data Lakes

Data Lakes Data Warehouse Hadoop Machine Learning

Data Warehousing with Snowflake and Other Alternatives

Analytics Vidhya

SEPTEMBER 27, 2022

This article was published as a part of the Data Science Blogathon. Businesses have adopted Snowflake as migration from on-premise enterprise data warehouses (such as Teradata) or a more flexibly scalable and easier-to-manage alternative to […].

Data Warehouse

Data Warehouse Data Science Analytics Analytics

Basic Introduction to Data Science Pipeline

Analytics Vidhya

AUGUST 16, 2022

This article was published as a part of the Data Science Blogathon. Introduction The Data science pipeline is the procedure and equipment used to compile raw data from many sources, evaluate it, and display the findings in a clear and concise manner.

Data Science

Data Science Analytics Analytics Data Warehouse

Apache Airflow used for Performing ETL

Analytics Vidhya

JULY 18, 2022

This article was published as a part of the Data Science Blogathon. Introduction Organizations with a separate transactional database and data warehouse typically have many data engineering activities. For example, they extract, transform and load data from various sources into their data warehouse.

ETL

ETL Data Warehouse Data Engineer Data Engineering

Intro to Rapidminer: A No-Code Development Platform for Data Mining (with Case Study)

Analytics Vidhya

OCTOBER 4, 2021

This article was published as a part of the Data Science Blogathon Image 1 What is data mining? Data mining is the process of finding interesting patterns and knowledge from large amounts of data. This analysis […].

Data Mining

Data Mining Data Mining Data Mining Data Warehouse

Mastering Data Normalization: A Comprehensive Guide

Data Science Dojo

MARCH 27, 2025

It powers business decisions, drives AI models, and keeps databases running efficiently. But heres the problem: raw data is often messy. Without proper organization, databases become bloated, slow, and unreliable. Thats where data normalization comes in. Thats where data normalization comes in.

Database

Database Data Warehouse Machine Learning Machine Learning

CI/CD for Data Pipelines: A Game-Changer with AnalyticsCreator

Data Science Blog

MAY 20, 2024

Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines: It is a Game-Changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. It offers full BI-Stack Automation, from source to data warehouse through to frontend.

Data Pipeline

Data Pipeline Data Warehouse Azure Data Lakes

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

AWS Machine Learning Blog

OCTOBER 24, 2024

Conventional ML development cycles take weeks to many months and requires sparse data science understanding and ML development skills. Business analysts’ ideas to use ML models often sit in prolonged backlogs because of data engineering and data science team’s bandwidth and data preparation activities.

Data Warehouse

Data Warehouse Machine Learning Machine Learning Cloud Data

Apache Sqoop: Features, Architecture and Operations

Analytics Vidhya

SEPTEMBER 18, 2022

This article was published as a part of the Data Science Blogathon. Introduction Apache SQOOP is a tool designed to aid in the large-scale export and import of data into HDFS from structured data repositories. Relational databases, enterprise data warehouses, and NoSQL systems are all examples of data storage.

Data Warehouse

Data Warehouse Data Science Database Analytics

Cookiecutter Data Science V2

DrivenData Labs

MAY 21, 2024

The original Cookiecutter Data Science (CCDS) was published over 8 years ago. The goal was, as the tagline states “a logical, reasonably standardized but flexible project structure for data science.” That said, in the past 5 years, a lot has changed in data science tooling and MLOps. Badges are delightful.

Data Science

Data Science Python Data Scientist Data Warehouse

10 essential SQL concepts for data scientists: Tips and examples

Data Science Dojo

APRIL 25, 2023

SQL (Structured Query Language) is an important tool for data scientists. It is a programming language used to manipulate data stored in relational databases. Mastering SQL concepts allows a data scientist to quickly analyze large amounts of data and make decisions based on their findings.

Data Scientist

Data Scientist SQL Machine Learning Machine Learning

AWS Glue: Simplifying ETL Data Processing

Analytics Vidhya

DECEMBER 28, 2022

This article was published as a part of the Data Science Blogathon. Source: [link] Introduction If you are familiar with databases, or data warehouses, you have probably heard the term “ETL.” As the amount of data at organizations grow, making use of that data in analytics to derive business insights grows as well.

ETL

ETL AWS Data Warehouse Data Science

Differentiating Between Data Lakes and Data Warehouses

Smart Data Collective

SEPTEMBER 23, 2020

The market for data warehouses is booming. While there is a lot of discussion about the merits of data warehouses, not enough discussion centers around data lakes. We talked about enterprise data warehouses in the past, so let’s contrast them with data lakes. Data Warehouse.

Data Warehouse

Data Warehouse Data Lakes Big Data Big Data

Partitioning and Bucketing in Hive

Analytics Vidhya

JUNE 30, 2022

This article was published as a part of the Data Science Blogathon. Introduction Hive is a popular data warehouse built on top of Hadoop that is used by companies like Walmart, Tiktok, and AT&T. It is an important technology for data engineers to learn and master.

Data Warehouse

Data Warehouse Hadoop Data Engineer Data Engineering

Comparison between Online Processing Systems: OLTP Vs OLAP

Analytics Vidhya

JULY 1, 2022

This article was published as a part of the Data Science Blogathon. Introduction In the field of Data Science main types of online processing systems are Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP), which are used in most companies for transaction-oriented applications and analytical work.

Data Science

Data Science Database Analytics Analytics

Data warehouse architecture

Dataconomy

OCTOBER 17, 2023

Want to create a robust data warehouse architecture for your business? The sheer volume of data that companies are now gathering is incredible, and understanding how best to store and use this information to extract top performance can be incredibly overwhelming.

Data Warehouse

Data Warehouse Big Data Big Data ETL

Is web3 data storage ushering in a new era of privacy?

Dataconomy

MAY 27, 2024

The main solutions on the market are decentralized file storage networks (DSFN) like Filecoin and Arweave, and decentralized data warehouses like Space and Time (SxT). Built to seamlessly integrate with existing enterprise systems, the data warehouse lets businesses tap into blockchain data while publishing query results back on-chain.

Data Warehouse

Data Warehouse Database SQL Analytics

ETL Pipeline with Google DataFlow and Apache Beam

Analytics Vidhya

JULY 29, 2022

This article was published as a part of the Data Science Blogathon. Introduction Processing large amounts of raw data from various sources requires appropriate tools and solutions for effective data integration. Building an ETL pipeline using Apache […].

ETL

ETL Data Science Analytics Analytics

Data Science News from Microsoft Ignite 2019

Data Science 101

NOVEMBER 7, 2019

Microsoft just held one of its largest conferences of the year, and a few major announcements were made which pertain to the cloud data science world. Azure Synapse Analytics can be seen as a merge of Azure SQL Data Warehouse and Azure Data Lake. Those are the big data science announcements of the week.

Data Science

Data Science Azure SQL Machine Learning

Exploring the fundamentals of online transaction processing databases

Dataconomy

APRIL 27, 2023

What is an online transaction processing database (OLTP)? OLTP is the backbone of modern data processing, a critical component in managing large volumes of transactions quickly and efficiently. This approach allows businesses to efficiently manage large amounts of data and leverage it to their advantage in a highly competitive market.

Database

Database Data Scientist Data Mining Data Mining

Developing an End-to-End Automated Data Pipeline

Analytics Vidhya

JULY 20, 2022

This article was published as a part of the Data Science Blogathon. Introduction Data acclimates to countless shapes and sizes to complete its journey from a source to a destination. The post Developing an End-to-End Automated Data Pipeline appeared first on Analytics Vidhya.

Data Pipeline

Data Pipeline ETL Data Science Analytics

Database vs Data Warehouse

Pickl AI

FEBRUARY 23, 2023

Organisations must store data in a safe and secure place for which Databases and Data warehouses are essential. You must be familiar with the terms, but Database and Data Warehouse have some significant differences while being equally crucial for businesses. What is a Database?

Data Warehouse

Data Warehouse Database Data Analysis Data Analysis

Advantages of Using Cloud Data Platform Snowflake

Analytics Vidhya

NOVEMBER 19, 2022

This article was published as a part of the Data Science Blogathon. Introduction The rate of data expansion in this decade is rapid. The requirement to process and store these data has also become problematic.

Cloud Data

Cloud Data Data Science Analytics Analytics

Getting Started with Data Pipeline

Analytics Vidhya

JULY 25, 2022

This article was published as a part of the Data Science Blogathon. Introduction These days companies seem to seek ways to integrate data from multiple sources to earn a competitive advantage over other businesses.

Data Pipeline

Data Pipeline Data Science Analytics Analytics

How Will The Cloud Impact Data Warehousing Technologies?

Smart Data Collective

APRIL 8, 2020

Dating back to the 1970s, the data warehousing market emerged when computer scientist Bill Inmon first coined the term ‘data warehouse’. Created as on-premise servers, the early data warehouses were built to perform on just a gigabyte scale. Big data and data warehousing.

Data Warehouse

Data Warehouse Big Data Big Data Big Data Analytics

Cloud Data Warehouse Migration 101: Expert Tips

Alation

JULY 28, 2022

There was a time when most CIOs would never consider putting their crown jewels — AKA customer data and associated analytics — into the cloud. But today, there is a magic quadrant for cloud databases and warehouses comprising more than 20 vendors. Yet the cloud, according to Sacolick, doesn’t come cheap. “A Migrate What Matters.

Data Warehouse

Data Warehouse Cloud Data Data Governance Database

Data science vs data analytics: Unpacking the differences

IBM Journey to AI blog

SEPTEMBER 19, 2023

Though you may encounter the terms “data science” and “data analytics” being used interchangeably in conversations or online, they refer to two distinctly different concepts. Meanwhile, data analytics is the act of examining datasets to extract value and find answers to specific questions.

Data Science

Data Science Analytics Analytics Data Scientist

Dynamic SQL Queries to Transform Data

Analytics Vidhya

JUNE 28, 2022

This article was published as a part of the Data Science Blogathon. “Preponderance data opens doorways to complex and Avant analytics.” ” Introduction to SQL Queries Data is the premium product of the 21st century.

SQL

SQL Data Science Analytics Analytics

Data Version Control for Data Lakes: Handling the Changes in Large Scale

ODSC - Open Data Science

SEPTEMBER 27, 2023

In this article, we will delve into the concept of data lakes, explore their differences from data warehouses and relational databases, and discuss the significance of data version control in the context of large-scale data management. Before we address the questions, ‘ What is data version control ?’

Data Lakes

Data Lakes Data Warehouse Database Big Data

Exploring the Power of Data Warehouse Functionality

Pickl AI

JUNE 11, 2024

Summary: A data warehouse is a central information hub that stores and organizes vast amounts of data from different sources within an organization. Unlike operational databases focused on daily tasks, data warehouses are designed for analysis, enabling historical trend exploration and informed decision-making.

Data Warehouse

Data Warehouse ETL Data Mining Data Mining

Serverless High Volume ETL data processing on Code Engine

IBM Data Science in Practice

JANUARY 13, 2025

The blog post explains how the Internal Cloud Analytics team leveraged cloud resources like Code-Engine to improve, refine, and scale the data pipelines. Background One of the Analytics teams tasks is to load data from multiple sources and unify it into a data warehouse. Database size limits of 10GB.

ETL

ETL Data Pipeline Database Data Warehouse

How Dataiku and Snowflake Strengthen the Modern Data Stack

phData

NOVEMBER 4, 2024

With its decoupled compute and storage resources, Snowflake is a cloud-native data platform optimized to scale with the business. Dataiku is an advanced analytics and machine learning platform designed to democratize data science and foster collaboration across technical and non-technical teams.

Machine Learning

Machine Learning Machine Learning Data Science ML

The Need for Data Warehouse and Its Alternatives

What are Schemas in Data Warehouse Modeling?

Webinars

Trending Sources

Data Lake or Data Warehouse- Which is Better?

Webinars

Data Warehouse for the Beginners!

Understanding Key Concepts on Data Warehouses

AWS Redshift: Cloud Data Warehouse Service

Beginners Guide to Data Warehouse Using Hive Query Language

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

A Complete Guide on Building an ETL Pipeline for Beginners

How a Delta Lake is Process with Azure Synapse Analytics

How to Normalize Relational Databases With SQL Code?

Introduction to Partitioned hive table and PySpark

Data lakes vs. data warehouses: Decoding the data storage debate

Data Warehousing with Snowflake and Other Alternatives

Basic Introduction to Data Science Pipeline

Apache Airflow used for Performing ETL

Intro to Rapidminer: A No-Code Development Platform for Data Mining (with Case Study)

Mastering Data Normalization: A Comprehensive Guide

CI/CD for Data Pipelines: A Game-Changer with AnalyticsCreator

Enhance your Amazon Redshift cloud data warehouse with easier, simpler, and faster machine learning using Amazon SageMaker Canvas

Apache Sqoop: Features, Architecture and Operations

Cookiecutter Data Science V2

10 essential SQL concepts for data scientists: Tips and examples

AWS Glue: Simplifying ETL Data Processing

Differentiating Between Data Lakes and Data Warehouses

Partitioning and Bucketing in Hive

Comparison between Online Processing Systems: OLTP Vs OLAP

Data warehouse architecture

Is web3 data storage ushering in a new era of privacy?

ETL Pipeline with Google DataFlow and Apache Beam

Data Science News from Microsoft Ignite 2019

Exploring the fundamentals of online transaction processing databases

Developing an End-to-End Automated Data Pipeline

Database vs Data Warehouse

Advantages of Using Cloud Data Platform Snowflake

Getting Started with Data Pipeline

How Will The Cloud Impact Data Warehousing Technologies?

Cloud Data Warehouse Migration 101: Expert Tips

Data science vs data analytics: Unpacking the differences

Dynamic SQL Queries to Transform Data

Data Version Control for Data Lakes: Handling the Changes in Large Scale

Exploring the Power of Data Warehouse Functionality

Serverless High Volume ETL data processing on Code Engine

How Dataiku and Snowflake Strengthen the Modern Data Stack

Stay Connected