Building a Scalable ETL with SQL + Python
KDnuggets
APRIL 21, 2022
This post will look at building a modular ETL pipeline that transforms data with SQL and visualizes it with Python and R.
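The pattern the post describes, with the transform step pushed into SQL and the glue written in Python, can be sketched minimally with the built-in sqlite3 module standing in for a warehouse (the table and data here are hypothetical):

```python
import sqlite3

def extract(rows):
    """Extract step: in a real pipeline this would read from an API or file."""
    return rows

def load(conn, rows):
    """Load step: land the raw rows in a staging table."""
    conn.execute("CREATE TABLE sales (city TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

def transform(conn):
    """Transform step: push the aggregation into SQL."""
    return conn.execute(
        "SELECT city, AVG(amount) AS avg_amount FROM sales GROUP BY city ORDER BY city"
    ).fetchall()

conn = sqlite3.connect(":memory:")
raw = extract([("Austin", 10.0), ("Austin", 20.0), ("Boston", 5.0)])
load(conn, raw)
result = transform(conn)
print(result)  # [('Austin', 15.0), ('Boston', 5.0)]
```

Keeping extract, load, and transform as separate functions is what makes the pipeline modular: each step can be swapped or tested in isolation.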
Data Science Dojo
OCTOBER 31, 2024
Remote work quickly transitioned from a perk to a necessity, and data science—already digital at heart—was poised for this change. For data scientists, this shift has opened up a global market of remote data science jobs, with top employers now prioritizing skills that allow remote professionals to thrive.
Analytics Vidhya
APRIL 29, 2022
This article was published as a part of the Data Science Blogathon. Introduction: Azure Data Factory (ADF) is a cloud-based ETL (Extract, Transform, Load) and data integration service that lets you create data-driven workflows. In this article, I'll show […].
KDnuggets
APRIL 27, 2022
A Brief Introduction to Papers With Code; Machine Learning Books You Need To Read In 2022; Building a Scalable ETL with SQL + Python; 7 Steps to Mastering SQL for Data Science; Top Data Science Projects to Build Your Skills.
Data Science Dojo
MAY 10, 2023
Navigating the realm of data science careers is no longer a tedious task. In the current landscape, data science has emerged as the lifeblood of organizations seeking to gain a competitive edge.
Data Science Blog
MAY 20, 2024
Continuous Integration and Continuous Delivery (CI/CD) for data pipelines is a game-changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. Data Lakes: supports MS Azure Blob Storage, pipelines, and Azure Databricks.
IBM Data Science in Practice
JANUARY 13, 2025
By Santhosh Kumar Neerumalla , Niels Korschinsky & Christian Hoeboer Introduction This blogpost describes how to manage and orchestrate high volume Extract-Transform-Load (ETL) loads using a serverless process based on Code Engine. The source data is unstructured JSON, while the target is a structured, relational database.
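The JSON-to-relational step described here can be sketched with the standard library alone (the schema and documents below are invented for illustration; the post's actual pipeline runs serverlessly on Code Engine):

```python
import json
import sqlite3

# Hypothetical unstructured source records -- the field set varies per record
raw_docs = [
    '{"id": 1, "user": {"name": "Ada"}, "amount": 12.5}',
    '{"id": 2, "user": {"name": "Grace"}}',
]

def flatten(doc: str) -> tuple:
    """Map a nested JSON document onto a fixed relational schema."""
    d = json.loads(doc)
    return (d["id"], d.get("user", {}).get("name"), d.get("amount"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_name TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [flatten(d) for d in raw_docs])
rows = conn.execute("SELECT id, user_name, amount FROM orders ORDER BY id").fetchall()
print(rows)  # [(1, 'Ada', 12.5), (2, 'Grace', None)]
```

Missing fields land as NULL, which is the usual compromise when forcing variable-shape documents into a fixed relational schema.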
Analytics Vidhya
AUGUST 29, 2022
This article was published as a part of the Data Science Blogathon. Introduction Data scientists, engineers, and BI analysts often need to analyze, process, or query different data sources.
Data Science Dojo
JULY 3, 2024
Data science bootcamps are intensive short-term educational programs designed to equip individuals with the skills needed to enter or advance in the field of data science. They cover a wide range of topics, ranging from Python, R, and statistics to machine learning and data visualization.
AWS Machine Learning Blog
FEBRUARY 21, 2025
Rocket's legacy data science environment challenges: Rocket's previous data science solution was built around Apache Spark and combined a legacy version of the Hadoop environment with vendor-provided Data Science Experience development tools.
Data Science Blog
SEPTEMBER 19, 2023
In the contemporary age of Big Data, Data Warehouse Systems and Data Science Analytics Infrastructures have become essential components for organizations to store, analyze, and make data-driven decisions. So why use IaC for Cloud Data Infrastructures?
Applied Data Science
AUGUST 2, 2021
This post is a bitesize walk-through of the 2021 Executive Guide to Data Science and AI — a white paper packed with up-to-date advice for any CIO or CDO looking to deliver real value through data. Team: Building the right data science team is complex. Download the free, unabridged version here.
Data Science Blog
JULY 20, 2024
The importance of efficient and reliable data pipelines in data science and data engineering is enormous. Automation: generates SQL code, DACPAC files, SSIS packages, Data Factory ARM templates, and XMLA files. Data Lakes: supports MS Azure Blob Storage.
AWS Machine Learning Blog
APRIL 16, 2024
In the process of working on their ML tasks, data scientists typically start their workflow by discovering relevant data sources and connecting to them. They then use SQL to explore, analyze, visualize, and integrate data from various sources before using it in their ML training and inference.
ODSC - Open Data Science
APRIL 6, 2023
Though both are great to learn, what gets left out of the conversation is a simple yet powerful programming language that everyone in the data science world can agree on: SQL. But why is SQL, or Structured Query Language, so important to learn? Finally, there are SQL's window functions. Let's briefly dive into each bit.
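To make the window-function point concrete, here is a running total per sales rep, computed without collapsing the detail rows; that ability is exactly what distinguishes window functions from plain GROUP BY. This sketch uses Python's sqlite3 (window functions require SQLite 3.25+), and the table is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, month INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Ada", 1, 100.0), ("Ada", 2, 150.0), ("Grace", 1, 200.0),
])

# Running total per rep: every detail row survives, plus a cumulative column
rows = conn.execute("""
    SELECT rep, month, amount,
           SUM(amount) OVER (PARTITION BY rep ORDER BY month) AS running_total
    FROM sales
    ORDER BY rep, month
""").fetchall()
print(rows)
# [('Ada', 1, 100.0, 100.0), ('Ada', 2, 150.0, 250.0), ('Grace', 1, 200.0, 200.0)]
```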
Data Science Dojo
JULY 6, 2023
Data engineering tools are software applications or frameworks specifically designed to facilitate the process of managing, processing, and transforming large volumes of data. A workflow orchestrator, for example, allows data engineers to define and manage complex workflows as directed acyclic graphs (DAGs).
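The DAG idea can be sketched without any orchestrator at all: Python's stdlib graphlib resolves a valid execution order from declared task dependencies. The task names below are hypothetical; real tools like Airflow layer scheduling, retries, and state tracking on top of this same graph model:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each task maps to the set of upstream tasks it depends on
dag = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"transform", "validate"},
}

# Resolve an execution order that respects every dependency
order = list(TopologicalSorter(dag).static_order())
print(order)
```

Here `extract` must come first and `load` last; `TopologicalSorter` also raises `CycleError` if the graph is not actually acyclic, which is the same validation an orchestrator performs when parsing a DAG.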
Pickl AI
OCTOBER 17, 2024
Summary: The ETL process, which consists of data extraction, transformation, and loading, is vital for effective data management. Following best practices and using suitable tools enhances data integrity and quality, supporting informed decision-making. Introduction The ETL process is crucial in modern data management.
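One integrity practice the summary alludes to is validating records before loading them. A minimal quality gate, with hypothetical rules invented for illustration, might look like:

```python
def check_quality(rows):
    """Split records into loadable and rejected, based on simple integrity rules."""
    good, bad = [], []
    for r in rows:
        # Hypothetical rules: an id is required and amounts must be non-negative
        if r.get("id") is not None and r.get("amount") is not None and r["amount"] >= 0:
            good.append(r)
        else:
            bad.append(r)
    return good, bad

rows = [{"id": 1, "amount": 9.5}, {"id": 2, "amount": -3.0}, {"id": None, "amount": 4.0}]
good, bad = check_quality(rows)
print(len(good), len(bad))  # 1 2
```

Keeping rejected rows (rather than silently dropping them) makes the quality step auditable, which is where the "informed decision-making" claim comes from.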
Pickl AI
OCTOBER 17, 2024
Summary: This article explores the significance of ETL Data in Data Management. It highlights key components of the ETL process, best practices for efficiency, and future trends like AI integration and real-time processing, ensuring organisations can leverage their data effectively for strategic decision-making.
Pickl AI
JUNE 7, 2024
Summary: Choosing the right ETL tool is crucial for seamless data integration. Top contenders like Apache Airflow and AWS Glue offer unique features, empowering businesses with efficient workflows, high data quality, and informed decision-making capabilities. Choosing the right ETL tool is crucial for smooth data management.
Pickl AI
JULY 25, 2023
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Big Data Technologies: Hadoop, Spark, etc.
Pickl AI
NOVEMBER 4, 2024
Each database type requires its specific driver, which interprets the application's SQL queries and translates them into a format the database can understand. The driver manages the connection to the database, processes SQL commands, and retrieves the resulting data. INSERT: adds new records to a table.
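Python's DB-API shows the same driver pattern: connect, send SQL through the driver, retrieve results. A sketch using the bundled sqlite3 driver (the table is hypothetical; other databases need their own DB-API driver package):

```python
import sqlite3  # sqlite3 ships its own driver; e.g. PostgreSQL would need a separate one

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# INSERT: add a new record -- the ? placeholder lets the driver handle escaping
cur.execute("INSERT INTO users (name) VALUES (?)", ("Ada",))
conn.commit()

cur.execute("SELECT id, name FROM users")
users = cur.fetchall()
print(users)  # [(1, 'Ada')]
```

Using placeholders instead of string formatting is what keeps the driver, not the application, responsible for quoting, which also closes off SQL injection.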
ODSC - Open Data Science
JANUARY 18, 2024
These professionals will work with their colleagues to ensure that data is accessible to those with the proper access. So let's go through each step one by one, and help you build a roadmap toward becoming a data engineer. Identify your existing data science strengths. Stay on top of data engineering trends.
Pickl AI
NOVEMBER 5, 2024
It allows developers to easily connect to databases, execute SQL queries, and retrieve data. It operates as an intermediary, translating Java calls into SQL commands the database understands. For instance, reporting and analytics tools commonly use it to pull data from various database systems. from 2023 to 2030.
ODSC - Open Data Science
APRIL 3, 2023
As the sibling of data science, data analytics is still a hot field that garners significant interest. Companies have plenty of data at their disposal and are looking for people who can make sense of it and make deductions quickly and efficiently. Cloud Services: Google Cloud Platform, AWS, Azure.
ODSC - Open Data Science
SEPTEMBER 27, 2023
Data Warehouses and Relational Databases: It is essential to distinguish data lakes from data warehouses and relational databases, as each serves different purposes and has distinct characteristics. Schema Enforcement: Data warehouses use a "schema-on-write" approach. This ensures data consistency and integrity.
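Schema-on-write can be demonstrated directly: declare the structure first, and the database rejects non-conforming rows at write time. A sketch with sqlite3 and a NOT NULL constraint (the table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema-on-write: structure is declared (and enforced) before any data lands
conn.execute("CREATE TABLE orders (id INTEGER NOT NULL, amount REAL NOT NULL)")

conn.execute("INSERT INTO orders VALUES (1, 9.99)")  # conforming row is accepted
try:
    conn.execute("INSERT INTO orders VALUES (2, NULL)")  # violates the declared schema
except sqlite3.IntegrityError as e:
    rejected = str(e)
print(rejected)  # NOT NULL constraint failed: orders.amount
```

A schema-on-read data lake would happily store the malformed record and defer the failure to query time; catching it at write time is the consistency guarantee warehouses trade flexibility for.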
ODSC - Open Data Science
JUNE 12, 2023
Then we have some other ETL processes to constantly land the past 5 years of data into the Datamarts. No-code/low-code experience using a diagram view in the data preparation layer, similar to Dataflows.
Mlearning.ai
APRIL 24, 2023
Over the past few years, data science has migrated from individual computers to cloud service platforms. I just finished learning Azure's cloud service platform using Coursera and the Microsoft Learning Path for Data Science. It will take a couple of months, but it is worth it!
Smart Data Collective
MAY 20, 2019
By 2020, over 40 percent of all data science tasks will be automated. The popular tools, on the other hand, include Power BI, ETL, IBM Db2, and Teradata. This means that data professionals must be able to effectively communicate complex subjects to non-technical professionals. Machine Learning Experience is a Must.
The MLOps Blog
SEPTEMBER 7, 2023
Data scientists and ML engineers typically write lots and lots of code: exploratory analysis code, experimentation code for modeling, ETLs for creating training datasets, Airflow (or similar) code to generate DAGs, REST APIs, streaming jobs, monitoring jobs, and more.
Alation
JANUARY 17, 2023
The modern data stack is known for its robustness, speed, and scalability in handling data. A typical modern data stack consists of the following: a data warehouse, data ingestion/integration services, reverse ETL tools, and data orchestration tools. A Note on the Shift from ETL to ELT.
phData
SEPTEMBER 26, 2023
Data flows from the current data platform to the destination. The necessary access is granted so data flows without issue. SQL Server Agent jobs). Transformations Transformations can be a part of data ingestion (ETL pattern) or can take place at a later stage after data has been landed (ELT pattern).
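The ELT pattern mentioned here (land raw data first, transform in-database later) can be sketched with sqlite3. This assumes SQLite's JSON1 functions are available, as they are in most modern builds; the events are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ELT step 1: land the raw payloads untouched in a staging table
conn.execute("CREATE TABLE raw_events (payload TEXT)")
conn.executemany("INSERT INTO raw_events VALUES (?)", [
    ('{"kind": "click"}',), ('{"kind": "view"}',), ('{"kind": "click"}',),
])

# ELT step 2: transform in-database with SQL, after the data has landed
rows = conn.execute("""
    SELECT json_extract(payload, '$.kind') AS kind, COUNT(*) AS n
    FROM raw_events
    GROUP BY kind
    ORDER BY kind
""").fetchall()
print(rows)  # [('click', 2), ('view', 1)]
```

In an ETL pattern the same parsing and aggregation would happen before the insert; ELT defers it so the raw data stays available for re-transformation.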
Pickl AI
NOVEMBER 15, 2023
What Is a Data Warehouse? On the other hand, a Data Warehouse is a structured storage system designed for efficient querying and analysis. It involves the extraction, transformation, and loading (ETL) process to organize data for business intelligence purposes. It often serves as a source for Data Warehouses.
The MLOps Blog
OCTOBER 20, 2023
Jupyter notebooks have been one of the most controversial tools in the data science community. Nevertheless, many data scientists will agree that they can be really valuable if used well. I'll show you best practices for using Jupyter Notebooks for exploratory data analysis and documentation. For one, Git diffs within .py
Pickl AI
NOVEMBER 4, 2024
Additionally, Data Engineers implement quality checks, monitor performance, and optimise systems to handle large volumes of data efficiently. Differences Between Data Engineering and Data Science While Data Engineering and Data Science are closely related, they focus on different aspects of data.
Mlearning.ai
JULY 10, 2023
In my 7 years of Data Science journey, I’ve been exposed to a number of different databases including but not limited to Oracle Database, MS SQL, MySQL, EDW, and Apache Hadoop. A lot of you who are already in the data science field must be familiar with BigQuery and its advantages.
phData
JULY 17, 2023
In order to fully leverage this vast quantity of collected data, companies need a robust and scalable data infrastructure to manage it. This is where Fivetran and the Modern Data Stack come in. We can also create advanced data science models with this data using AI/ Machine Learning. What is Fivetran?
phData
FEBRUARY 7, 2024
Snowpark is the set of libraries and runtimes in Snowflake that securely deploy and process non-SQL code, including Python, Java, and Scala. A DataFrame is like a query that must be evaluated to retrieve data. An action causes the DataFrame to be evaluated and sends the corresponding SQL statement to the server for execution.
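The lazy-evaluation behavior described here can be illustrated with a toy class. This is a concept sketch, not the Snowpark API: transformations only record work, and an action triggers execution.

```python
class LazyFrame:
    """Toy lazily-evaluated DataFrame: transformations build a plan, actions run it."""

    def __init__(self, rows, ops=()):
        self._rows, self._ops = rows, ops

    def filter(self, pred):
        # Transformation: record the operation; nothing is evaluated yet
        return LazyFrame(self._rows, self._ops + (("filter", pred),))

    def collect(self):
        # Action: replay the recorded plan against the data
        rows = self._rows
        for kind, f in self._ops:
            if kind == "filter":
                rows = [r for r in rows if f(r)]
        return rows

df = LazyFrame([1, 2, 3, 4]).filter(lambda x: x % 2 == 0)  # no work done yet
print(df.collect())  # [2, 4]
```

In Snowpark the recorded plan is compiled to a SQL statement and shipped to the server when an action runs, which is why chained transformations cost nothing until `collect()` (or another action) is called.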
Pickl AI
NOVEMBER 14, 2023
Whether you’re working on Data Analysis, Machine Learning, or any other data-related task, having a well-organized Importing Data in Python Cheat Sheet for importing data in Python is invaluable. So, let me present to you an Importing Data in Python Cheat Sheet which will make your life easier.
Pickl AI
APRIL 26, 2024
This comprehensive blog outlines vital aspects of Data Analyst interviews, offering insights into technical, behavioural, and industry-specific questions. It covers essential topics such as SQL queries, data visualization, statistical analysis, machine learning concepts, and data manipulation techniques.
Pickl AI
FEBRUARY 4, 2024
Furthermore, Alteryx provides an array of tools and connectors tailored for different data sources, spanning Excel spreadsheets, databases, and social media platforms. Data Analytics automation Alteryx’s standout feature lies in its capability to automate data analytics workflows. Is Alteryx an ETL tool?
ODSC - Open Data Science
SEPTEMBER 29, 2023
It truly is an all-in-one data lake solution. HPCC Systems and Spark also differ in that they work with distinct parts of the big data pipeline. Spark is more focused on data science, ingestion, and ETL, while HPCC Systems focuses on ETL and data delivery and governance. Tell me more about ECL.
IBM Journey to AI blog
JULY 17, 2023
By supporting open-source frameworks and tools for code-based, automated and visual data science capabilities — all in a secure, trusted studio environment — we’re already seeing excitement from companies ready to use both foundation models and machine learning to accomplish key tasks.
phData
OCTOBER 17, 2024
Apache Airflow is open-source ETL software that is very useful when paired with Snowflake. dbt offers a SQL-first transformation workflow that lets teams build data transformation pipelines while following software engineering best practices like CI/CD, modularity, and documentation.
IBM Journey to AI blog
JANUARY 5, 2023
The right data architecture can help your organization improve data quality because it provides the framework that determines how data is collected, transported, stored, secured, used and shared for business intelligence and data science use cases.