Article, Data Engineering and Database

Neo4j vs. Amazon Neptune: Graph Databases in Data Engineering

Analytics Vidhya

AUGUST 4, 2024

Traditional databases, while still valuable, often falter when it comes to handling highly connected data. Enter the unsung heroes of the data world: graph databases. These powerful tools are designed to manage and query intricate data relationships effortlessly.

Data Engineer

Data Engineer Data Engineering

Introduction to SQL for Data Engineering

Analytics Vidhya

APRIL 23, 2022

This article was published as a part of the Data Science Blogathon. In this article, we will look at a very common yet very important topic: SQL, also pronounced “ess-cue-ell”.

SQL

SQL Data Engineer Data Engineering

Interacting with Remote Databases – PostgreSQL and DBAPIs

Analytics Vidhya

SEPTEMBER 22, 2022

This article was published as a part of the Data Science Blogathon. When creating data pipelines, software engineers and data engineers frequently work with databases using database management systems like PostgreSQL.

Database

Database Data Pipeline Data Engineer Data Engineering

Data Abstraction for Data Engineering with its Different Levels

Analytics Vidhya

OCTOBER 10, 2022

This article was published as a part of the Data Science Blogathon. A data model is an abstraction of real-world events that we use to create, capture, and store data in a database that user applications require, omitting unnecessary details.

Data Engineer

Data Engineer Data Engineering

Web Scrapping- Tool for Data Engineering

Analytics Vidhya

SEPTEMBER 26, 2022

This article was published as a part of the Data Science Blogathon. Have you ever thought of a means to get new data? The topic is useful well beyond its own field and readily helps other disciplines.

Data Engineer

Data Engineer Data Engineering

Understand the ACID and BASE in Morden Data Engineering

Analytics Vidhya

DECEMBER 12, 2022

This article was published as a part of the Data Science Blogathon. Dear Data Engineers, this article covers a very interesting topic. A brief flashback: a few years ago, someone in a discussion brought up the ACID and BASE properties of data. Everyone started […].

Data Engineer

Data Engineer Data Engineering

What is relational about Relational Databases?

Analytics Vidhya

AUGUST 14, 2021

This article was published as a part of the Data Science Blogathon. Pretty much everything or all sorts of information available online is […].

Database

Database Data Science Analytics

A beginner’s Guide to Database: Part 1

Analytics Vidhya

JULY 7, 2021

This article was published as a part of the Data Science Blogathon. Prerequisites: a basic understanding of databases. Here I am going […].

Database

Database Data Science Analytics

How is AWS Athena Different from other Databases

Analytics Vidhya

JULY 23, 2022

This article was published as a part of the Data Science Blogathon. Amazon Athena is an interactive query service, based on the open-source Presto engine, that allows you to analyze data stored in Amazon S3 directly using ANSI SQL.

AWS

AWS Database SQL Data Science

Apache Cassandra Data Model(CQL) – Schema and Database Design

Analytics Vidhya

SEPTEMBER 11, 2021

This article was published as a part of the Data Science Blogathon. When Apache Cassandra first came out, it included a command-line interface for dealing with Thrift. Manipulating data in this manner was inconvenient and required knowing the API’s intricacies.

Data Modeling

Data Modeling Data Models Database SQL

Introduction to Apache Sqoop

Analytics Vidhya

JULY 25, 2022

This article was published as a part of the Data Science Blogathon. Apache Sqoop is a big data engine for transferring data between Hadoop and relational database servers. Sqoop can also be […].

Hadoop

Hadoop Big Data Data Engineer

Database Errors: Dark Side and Lessons Learned

Analytics Vidhya

DECEMBER 24, 2022

This article was published as a part of the Data Science Blogathon. Have you ever wondered about the dark side of databases? As software developers, we rely on databases to store and manage important data for our applications.

Database

Database Data Science Analytics

Database Normalization- A Step-by-Step Guide with Examples

Analytics Vidhya

AUGUST 16, 2022

This article was published as a part of the Data Science Blogathon. As an SQL developer, you regularly work with enormous amounts of data stored in different tables inside databases.

Database

Database SQL Data Science Analytics

Database Design Mistakes and Ways to Avoid Them

Analytics Vidhya

DECEMBER 24, 2022

This article was published as a part of the Data Science Blogathon. As a database (DB) designer, getting the design right from the start is important.

Database

Database Data Science Analytics

An Introduction to MongoDB

Analytics Vidhya

NOVEMBER 1, 2022

This article was published as a part of the Data Science Blogathon. When we hear the word “DATABASE”, the first thought that comes to mind is SQL! No doubt, SQL and relational databases are widely popular and used extensively for storing data.

SQL

SQL Database Data Science Analytics

Apache Airflow used for Performing ETL

Analytics Vidhya

JULY 18, 2022

This article was published as a part of the Data Science Blogathon. Organizations with a separate transactional database and data warehouse typically have many data engineering activities. For example, they extract, transform, and load data from various sources into their data warehouse.

ETL

ETL Data Warehouse Data Engineer Data Engineering

How to connect MongoDB database with Django

Analytics Vidhya

JUNE 16, 2021

This article was published as a part of the Data Science Blogathon. Let’s consider a scenario where you are working on a […].

Database

Database Data Science Analytics

Getting Started with MongoDB database for Data Science

Analytics Vidhya

APRIL 26, 2021

This article was published as a part of the Data Science Blogathon. Data Science without data is similar to fishing without fish.

Data Science

Data Science Database Analytics

Introduction to Apache CouchDB using Python

Analytics Vidhya

JULY 23, 2022

This article was published as a part of the Data Science Blogathon. Apache CouchDB is an open-source, document-based NoSQL database developed by the Apache Software Foundation and used by big companies like Apple, GenCorp Technologies, and Wells Fargo.

Python

Python Database Data Science Analytics

How to screw SQL to anything with Apache Calcite

Analytics Vidhya

OCTOBER 1, 2021

This article was published as a part of the Data Science Blogathon. Making your own SQL database or running SQL queries against a NoSQL database seems to be a very daunting task. And if we are talking about a distributed database, the complexity increases many times over.

SQL

SQL Database Data Science Analytics

A brief introduction to SQL Alchemy

Analytics Vidhya

JULY 30, 2022

This article was published as a part of the Data Science Blogathon. The structured data we generally deal with is stored in a tabular format in relational databases. Data stored in these databases can be accessed with a query language called SQL, often pronounced “sequel”. But, it is […].

SQL

SQL Database Data Science Analytics

MongoDB Replication and Sharding- A Complete Introduction

Analytics Vidhya

DECEMBER 26, 2022

This article was published as a part of the Data Science Blogathon. A NoSQL database is a non-relational database that does not use the traditional table-based schema of a relational database. NoSQL databases are often used for big data and real-time web applications.

Database

Database Big Data Data Science

Amazon S3: Everything You Need to Know

Analytics Vidhya

NOVEMBER 2, 2022

This article was published as a part of the Data Science Blogathon. Amazon Web Services (AWS) is a cloud computing platform offering a wide range of services across domains such as networking, storage, computing, security, databases, and machine learning.

Cloud Computing

Cloud Computing AWS Machine Learning

An Introduction to Normalization Theory

Analytics Vidhya

FEBRUARY 11, 2021

This article was published as a part of the Data Science Blogathon. One of the main concepts of Relational Database Management Systems […].

Data Science

Data Science Database Analytics

SQL and PL/SQL – An Unmissable Comparison

Analytics Vidhya

OCTOBER 12, 2022

This article was published as a part of the Data Science Blogathon. Data is the essential element of any organization’s operation, and it is becoming more significant by the day. Such large amounts of data therefore need to be stored carefully.

SQL

SQL Data Science Database Analytics

Most Frequently Asked Apache HBase Interview Questions

Analytics Vidhya

AUGUST 1, 2022

This article was published as a part of the Data Science Blogathon. HBase is a column-oriented, non-relational database management system that runs on top of the Hadoop Distributed File System (HDFS). HBase provides a fault-tolerant way of storing sparse data sets, which are prevalent in several big data use cases.

Hadoop

Hadoop Big Data Data Science

A Detailed Guide for Data Handling Techniques in Data Science

Analytics Vidhya

JANUARY 28, 2022

This article was published as a part of the Data Science Blogathon. Data engineers and data scientists need data for their day-to-day jobs, whether for data analytics, data prediction, data mining, building machine learning models, and so on.

Data Science

Data Science Data Mining

A Brief Introduction to Apache HBase and it’s Architecture

Analytics Vidhya

OCTOBER 12, 2022

This article was published as a part of the Data Science Blogathon. Since the 1970s, relational database management systems have solved the problems of storing and maintaining large volumes of structured data.

Hadoop

Hadoop Big Data Data Science

Introduction to Partitioned hive table and PySpark

Analytics Vidhya

OCTOBER 28, 2021

This article was published as a part of the Data Science Blogathon. What is the need for Hive? The official description of Hive is: ‘Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis.’

Apache Hadoop

Apache Hadoop Data Warehouse Hadoop SQL

How To Create An Aggregation Pipeline In MongoDB

Analytics Vidhya

APRIL 12, 2021

This article was published as a part of the Data Science Blogathon. MongoDB is a free, open-source NoSQL document database.

SQL

SQL Data Science Database Analytics

A Beginner’s Guide to MySQL: Part 2

Analytics Vidhya

JULY 10, 2021

This article was published as a part of the Data Science Blogathon. Prerequisites: basic knowledge of any database; basic understanding of […].

Database

Database Data Science Analytics

Article: The Wonders of Postgres Logical Decoding Messages

Hacker News

MARCH 6, 2023

In this article, author Gunnar Morling discusses the Postgres logical decoding feature, which retrieves messages from the write-ahead log, processes them, and relays them to external consumers, with the help of use cases such as outbox, audit logs, and replication slots.

Database

Database Data Engineer Data Engineering

Introduction to Apache Spark and its Datasets

Analytics Vidhya

AUGUST 17, 2022

This article was published as a part of the Data Science Blogathon. In this article, we will introduce you to the big data ecosystem and the role of Apache Spark in big data. We will also cover distributed database systems, the backbone of big data. In today’s world, data is the fuel.

Big Data

Big Data Data Science Database

Google OAuth for MongoDB User Authentication Sign-in

Analytics Vidhya

OCTOBER 26, 2021

This article was published as a part of the Data Science Blogathon. Because of its outstanding performance, extensive developer support, and generous free tier, MongoDB has rapidly become my non-relational database platform of choice.

Database

Database Data Science Analytics

Understanding the need for DBMS

Analytics Vidhya

AUGUST 20, 2022

This article was published as a part of the Data Science Blogathon. A database is a collection of interrelated data, and a database management system (DBMS) is a set of programs that helps users create and maintain this data. A DBMS is a computer-based record-keeping system.

Database

Database Data Science Analytics

Introduction to Google Firebase Cloud Storage using Python

Analytics Vidhya

JULY 16, 2022

This article was published as a part of the Data Science Blogathon. Firebase aims to replace conventional backend servers for web and mobile applications by offering multiple services on the same platform, such as authentication, a real-time database, Firestore (a NoSQL database), cloud functions, […].

Python

Python Database Data Science Analytics

Partitioning and Bucketing in Hive

Analytics Vidhya

JUNE 30, 2022

This article was published as a part of the Data Science Blogathon. Hive is a popular data warehouse built on top of Hadoop that is used by companies like Walmart, TikTok, and AT&T. It is an important technology for data engineers to learn and master.

Data Warehouse

Data Warehouse Hadoop Data Engineer Data Engineering

NoSQL Data Modeling Technique

Analytics Vidhya

JULY 20, 2022

This article was published as a part of the Data Science Blogathon. NoSQL databases allow us to store vast amounts of data and access them anytime, from any location and device. However, deciding which data modelling technique best suits your needs is complex.

Data Modeling

Data Modeling Data Models Database Data Science

Library Management System using MYSQL

Analytics Vidhya

JULY 31, 2022

This article was published as a part of the Data Science Blogathon. In this article, we will build a Library Management System using MySQL. We will build the database, which includes tables.

Data Science

Data Science Database Analytics

One-stop-shop for Connecting Snowflake to Python!

Analytics Vidhya

MAY 25, 2021

This article was published as a part of the Data Science Blogathon. In this article, we will learn to connect the Snowflake database.

Python

Python Data Science Database Analytics

Beginners Guide to Data Warehouse Using Hive Query Language

Analytics Vidhya

APRIL 29, 2022

This article was published as a part of the Data Science Blogathon. Have you ever wondered how big IT giants store and process huge amounts of data? Storing the data […].

Data Warehouse

Data Warehouse Database Data Science Analytics

Exploring the fundamentals of online transaction processing databases

Dataconomy

APRIL 27, 2023

What is an online transaction processing (OLTP) database? OLTP is the backbone of modern data processing, a critical component in managing large volumes of transactions quickly and efficiently. This approach allows businesses to manage large amounts of data and leverage it to their advantage in a highly competitive market.

Database

Database Data Scientist Data Mining

What is Apache Impala- Features and Architecture

Analytics Vidhya

AUGUST 17, 2022

This article was published as a part of the Data Science Blogathon. Impala is an open-source, native analytics database for Hadoop. Vendors such as Cloudera, Oracle, MapR, and Amazon have shipped Impala. If you want to learn all things Impala, you’ve come to the right place.

Hadoop

Hadoop Data Science Database Analytics
