Introduction SQL is easily one of the most important languages in the computer world. It serves as the primary means of communicating with relational databases, where most organizations store crucial data. SQL plays a significant role in tasks such as analyzing complex data, creating data pipelines, and efficiently managing data warehouses.
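To make that concrete, here is a minimal, self-contained sketch of an analytical SQL query run from Python's built-in sqlite3 module; the orders table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; the orders table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 200.0)],
)

# A typical analytical query: total revenue per region.
for region, total in conn.execute(
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY total DESC"
):
    print(region, total)
```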
Continuous Integration and Continuous Delivery (CI/CD) for data pipelines is a game-changer with AnalyticsCreator! The need for efficient and reliable data pipelines is paramount in data science and data engineering. They transform data into a consistent format for users to consume.
“A preponderance of data opens doorways to complex and avant-garde analytics.” Introduction to SQL Queries: Data is the premium product of the 21st century. The post Dynamic SQL Queries to Transform Data appeared first on Analytics Vidhya.
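The article's own code isn't reproduced here; as a hedged sketch of the dynamic-SQL idea, the snippet below assembles a query at runtime from a whitelisted column name while values still go through bound parameters, using the SQLite driver from the Python standard library. Table and column names are hypothetical.

```python
import sqlite3

ALLOWED_COLUMNS = {"region", "amount"}  # whitelist guards against SQL injection

def dynamic_filter(conn, column, value):
    """Build a query whose WHERE column is chosen at runtime."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    # The identifier is interpolated only after validation; the value stays bound.
    sql = f"SELECT * FROM orders WHERE {column} = ?"
    return conn.execute(sql, (value,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.execute("INSERT INTO orders VALUES ('EMEA', 120.0)")
print(dynamic_filter(conn, "region", "EMEA"))
```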
While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.
This article was published as a part of the Data Science Blogathon. Introduction When creating data pipelines, Software Engineers and Data Engineers frequently work with databases using Database Management Systems like PostgreSQL.
Knowledge-intensive analytical applications retrieve context from both structured tabular data and unstructured, free-text documents for effective decision-making. Large language models (LLMs) have made it significantly easier to prototype such retrieval and reasoning data pipelines.
Microsoft Fabric aims to reduce unnecessary data replication, centralize storage, and create a unified environment with its unique data fabric method. Microsoft Fabric is a cutting-edge analytics platform that helps data experts and companies work together on data projects. What is Microsoft Fabric?
Introduction Managing a data pipeline, such as transferring data from CSV to PostgreSQL, is like orchestrating a well-timed process where each step relies on the previous one. Apache Airflow streamlines this process by automating the workflow, making it easy to manage complex data tasks.
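As a hedged sketch of that orchestration (not the article's code), the DAG below wires a single pandas-based load task into Airflow, assuming Airflow 2.4+ with pandas and SQLAlchemy installed; the CSV path, connection string, and table name are hypothetical placeholders.

```python
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from sqlalchemy import create_engine

def load_csv_to_postgres():
    # Hypothetical input path and connection string.
    df = pd.read_csv("/data/input/sales.csv")
    engine = create_engine("postgresql+psycopg2://user:pass@db:5432/analytics")
    df.to_sql("sales_raw", engine, if_exists="append", index=False)

with DAG(
    dag_id="csv_to_postgres",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="load_csv", python_callable=load_csv_to_postgres)
```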
An ELT pipeline is a data pipeline that extracts (E) data from a source, loads (L) the data into a destination, and then transforms (T) data after it has been stored in the destination. If you can’t import all your data, you may only have a partial picture of your business.
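A minimal sketch of that ELT shape, using SQLite as a stand-in for the destination warehouse; the raw rows and table names are invented for the example.

```python
import sqlite3

# Extract + Load: copy raw source rows into the destination untouched.
raw_rows = [("2024-01-01", "EMEA", "120.0"), ("2024-01-02", "APAC", "200.0")]
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE raw_orders (order_date TEXT, region TEXT, amount TEXT)")
dest.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

# Transform: performed inside the destination, after loading (the T in ELT).
dest.execute(
    """
    CREATE TABLE orders_clean AS
    SELECT order_date, region, CAST(amount AS REAL) AS amount
    FROM raw_orders
    """
)
print(dest.execute("SELECT * FROM orders_clean").fetchall())
```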
Data engineering tools are software applications or frameworks specifically designed to facilitate the process of managing, processing, and transforming large volumes of data. Google BigQuery: Google BigQuery is a serverless, cloud-based data warehouse designed for big data analytics.
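For illustration, a hedged sketch of running a query against BigQuery with the official google-cloud-bigquery client; the project, dataset, and table names are hypothetical, and credentials are assumed to come from the environment (e.g., application default credentials).

```python
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT region, SUM(amount) AS total
    FROM `my_project.sales.orders`  -- hypothetical table
    GROUP BY region
"""
# The query runs serverlessly in BigQuery; we just iterate over the results.
for row in client.query(query).result():
    print(row.region, row.total)
```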
Data must be combined and harmonized from multiple sources into a unified, coherent format before being used with AI models. Unified, governed data can also be put to use for various analytical, operational and decision-making purposes. This process is known as data integration, one of the key components to a strong data fabric.
This is especially true for questions that require analytical reasoning across multiple documents. In this post, we show, in three parts, how to design an intelligent document assistant capable of answering analytical and multi-step reasoning questions.
Managing and retrieving the right information can be complex, especially for data analysts working with large data lakes and complex SQL queries. This tool converts questions that data analysts ask in natural language (such as “Which table contains customer address information?”)
They have structured data such as sales transactions and revenue metrics stored in databases, alongside unstructured data such as customer reviews and marketing reports collected from various channels. This will provision the backend infrastructure and services that the sales analytics application will rely on.
However, most organizations struggle to become data driven. Data is stuck in siloes, infrastructure can’t scale to meet growing data needs, and analytics is still too hard for most people to use. Google's Cloud Platform is the enterprise solution of choice for many organizations with large and complex data problems.
The blog post explains how the Internal Cloud Analytics team leveraged cloud resources like Code Engine to improve, refine, and scale the data pipelines. Background: One of the Analytics team's tasks is to load data from multiple sources and unify it into a data warehouse.
Amazon QuickSight powers data-driven organizations with unified business intelligence (BI) at hyperscale. With QuickSight, all users can meet varying analytic needs from the same source of truth through modern interactive dashboards, paginated reports, embedded analytics, and natural language queries. Basic knowledge of a SQL query editor is assumed.
At the heart of this transformation is the OMRON Data & Analytics Platform (ODAP), an innovative initiative designed to revolutionize how the company harnesses its data assets. The robust security features provided by Amazon S3, including encryption and durability, were used to provide data protection.
Summary: This blog explains how to build efficient data pipelines, detailing each step from data collection to final delivery. Introduction Data pipelines play a pivotal role in modern data architecture by seamlessly transporting and transforming raw data into valuable insights.
Azure Data Factory helps organizations across the globe make critical business decisions by collecting data from various sources such as e-commerce websites, supply chains, logistics, […] The post Most Frequently Asked Azure Data Factory Interview Questions appeared first on Analytics Vidhya.
Though you may encounter the terms “data science” and “data analytics” being used interchangeably in conversations or online, they refer to two distinctly different concepts. Meanwhile, data analytics is the act of examining datasets to extract value and find answers to specific questions.
As today’s world keeps progressing towards data-driven decisions, organizations must have quality data created from efficient and effective data pipelines. For Snowflake customers, Snowpark is a powerful tool for building these effective and scalable data pipelines.
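A minimal Snowpark sketch of such a pipeline, assuming the snowflake-snowpark-python package; the connection parameters and table names are hypothetical placeholders.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Hypothetical connection parameters.
connection_parameters = {
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "<wh>", "database": "RAW", "schema": "SALES",
}
session = Session.builder.configs(connection_parameters).create()

# Transformations are expressed as DataFrame operations and pushed down to
# Snowflake; nothing is pulled to the client until explicitly requested.
orders = session.table("ORDERS")
summary = (
    orders.filter(col("AMOUNT") > 0)
          .group_by("REGION")
          .agg(sum_("AMOUNT").alias("TOTAL_AMOUNT"))
)
summary.write.save_as_table("ANALYTICS.ORDER_SUMMARY", mode="overwrite")
```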
Data is initially extracted from a vast array of sources and then transformed and converted into a specific format based on business requirements. ETL is one of the most integral processes for Business Intelligence and Analytics use cases, since they rely on the data stored in Data Warehouses to build reports and visualizations.
The modern data stack is defined by its ability to handle large datasets, support complex analytical workflows, and scale effortlessly as data and business needs grow. Two key technologies that have become foundational for this type of architecture are the Snowflake AI Data Cloud and Dataiku.
Using structured data to answer questions requires a way to effectively extract data that’s relevant to a user’s query. We formulated a text-to-SQL approach whereby a user’s natural language query is converted to a SQL statement using an LLM. The SQL is run by Amazon Athena to return the relevant data.
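A hedged sketch of that flow: the LLM step is stubbed out as a placeholder function, and the generated SQL is executed with Amazon Athena through boto3; the database name, output location, and query are hypothetical.

```python
import time

import boto3

athena = boto3.client("athena")

def generate_sql(question: str) -> str:
    """Placeholder for the LLM call that turns a question into SQL."""
    return "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"

def run_athena_query(sql: str) -> list:
    # Start the query and poll until Athena finishes executing it.
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "sales_db"},       # hypothetical
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]

rows = run_athena_query(generate_sql("Total sales per region?"))
```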
How to Optimize Power BI and Snowflake for Advanced Analytics (Spencer Baucke, May 25, 2023). The world of business intelligence and data modernization has never been more competitive than it is today. Much of what is discussed in this guide will assume some level of analytics strategy has been considered and/or defined. No problem!
Domain experts, for example, feel they are still overly reliant on core IT to access the data assets they need to make effective business decisions. In all of these conversations there is a sense of inertia: Data warehouses and data lakes feel cumbersome and data pipelines just aren't agile enough.
Automation: Automating data pipelines and models. The most common data science languages are Python and R; SQL is also a must-have skill for acquiring and manipulating data. The Data Engineer: Not everyone working on a data science project is a data scientist.
Alteryx and the Snowflake Data Cloud offer a potential solution to this issue and can speed up your path to analytics. In this blog post, we will explore how Alteryx and Snowflake can accelerate your journey to analytics by sharing use cases and best practices. What is Alteryx? What is Snowflake?
Skills and qualifications required for the role To excel as a machine learning engineer, individuals need a combination of technical skills, analytical thinking, and problem-solving abilities. They work with raw data, transform it into a usable format, and apply various analytical techniques to extract actionable insights.
Being able to discover connections between variables and to make quick insights will allow any practitioner to make the most out of the data. Analytics and Data Analysis Coming in as the 4th most sought-after skill is data analytics, as many data scientists will be expected to do some analysis in their careers.
There are many well-known libraries and platforms for data analysis, such as Pandas and Tableau, in addition to analytical databases like ClickHouse, MariaDB, Apache Druid, Apache Pinot, Google BigQuery, Amazon Redshift, etc. VisiData works with CSV files, Excel spreadsheets, SQL databases, and many other data sources.
Data Processing and Analysis: Techniques for data cleaning, manipulation, and analysis using libraries such as Pandas and NumPy in Python. Databases and SQL: Managing and querying relational databases using SQL, as well as working with NoSQL databases like MongoDB.
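For example, a small self-contained illustration of cleaning and aggregating data with Pandas and NumPy; the DataFrame contents are made up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": ["a", "b", "b", None],
    "spend": [10.0, np.nan, 25.0, 40.0],
})

df = df.dropna(subset=["customer"]).drop_duplicates()   # basic cleaning
df["spend"] = df["spend"].fillna(df["spend"].median())  # impute missing values
print(df.groupby("customer")["spend"].sum())            # simple aggregation
```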
Amazon Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the best price-performance at any scale. You can use query_string to filter your dataset by SQL and unload it to Amazon S3.
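Setting the SageMaker-specific query_string parameter aside, the underlying idea can be sketched with a plain Redshift UNLOAD statement issued from Python via the redshift_connector driver; the cluster endpoint, credentials, table, bucket, and IAM role ARN are hypothetical placeholders.

```python
import redshift_connector

# Connect to the cluster (hypothetical endpoint and credentials).
conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="analytics",
    user="awsuser",
    password="<password>",
)

# Filter with SQL, then unload the result set to S3 as Parquet files.
query_string = "SELECT * FROM orders WHERE amount > 100"
unload_sql = f"""
    UNLOAD ('{query_string}')
    TO 's3://my-bucket/exports/orders_'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload'
    FORMAT AS PARQUET
"""
cursor = conn.cursor()
cursor.execute(unload_sql)
conn.commit()
```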
In order to train a model using data stored outside of the three supported storage services, the data first needs to be ingested into one of these services (typically Amazon S3). This requires building a data pipeline (using tools such as Amazon SageMaker Data Wrangler) to move data into Amazon S3.
Google Analytics 4 (GA4) is a powerful tool for collecting and analyzing website and app data that many businesses rely heavily on to make informed business decisions. However, there might be instances where you need to migrate the raw event data from GA4 to Snowflake for more in-depth analysis and business intelligence purposes.
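One hedged sketch of that migration path: GA4's built-in raw export lands events in BigQuery, from where they can be pulled into a DataFrame and written to Snowflake. It assumes the google-cloud-bigquery and snowflake-connector-python packages; the property ID, date suffix, and connection details are hypothetical.

```python
from google.cloud import bigquery
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

# Read one day of raw GA4 events from the BigQuery export (hypothetical IDs).
bq = bigquery.Client()
df = bq.query(
    "SELECT event_date, event_name, user_pseudo_id "
    "FROM `my_project.analytics_123456789.events_20240101`"
).to_dataframe()

# Load the DataFrame into a Snowflake table for downstream analysis.
sf = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="<wh>", database="ANALYTICS", schema="GA4",
)
write_pandas(sf, df, "GA4_EVENTS", auto_create_table=True)
```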
In this post, you will learn about the 10 best data pipeline tools, their pros, cons, and pricing. A typical data pipeline involves the following steps or processes through which the data passes before being consumed by a downstream process, such as an ML model training process.
Putting the T for Transformation in ELT (ETL) is essential to any data pipeline. After extracting and loading your data into the Snowflake AI Data Cloud , you may wonder how best to transform it. Luckily, Snowflake answers this question with many features designed to transform your data for all your analytic use cases.
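As one hedged illustration of putting the T inside Snowflake (which also offers features such as tasks, streams, and dynamic tables for the same job), here is a CTAS transformation issued through the official snowflake-connector-python package; connection details and the SQL itself are hypothetical examples.

```python
import snowflake.connector

# Hypothetical connection details.
conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="<wh>", database="ANALYTICS", schema="PUBLIC",
)

# Transform in place: cast amounts and filter bad rows after loading raw data.
conn.cursor().execute(
    """
    CREATE OR REPLACE TABLE ORDERS_CLEAN AS
    SELECT ORDER_ID, REGION, TRY_TO_NUMBER(AMOUNT) AS AMOUNT
    FROM RAW.SALES.ORDERS
    WHERE ORDER_ID IS NOT NULL
    """
)
```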
With over 160 data connectors available, Fivetran makes it easy to move data out of, into, and across any cloud data platform in the market. Building data pipelines manually is an expensive and time-consuming process. Why Use Fivetran?
Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Data Visualization: Matplotlib, Seaborn, Tableau, etc.
It consolidates data from various systems, such as transactional databases, CRM platforms, and external data sources, enabling organizations to perform complex queries and derive insights. By maintaining historical data from disparate locations, a data warehouse creates a foundation for trend analysis and strategic decision-making.
Data Engineering is the practice of designing, constructing, and managing systems that enable data collection, storage, and analysis. It involves developing data pipelines that efficiently transport data from various sources to storage solutions and analytical tools. ETL is vital for ensuring data quality and integrity.
In our previous blog, Top 5 Fivetran Connectors for Financial Services , we explored Fivetran’s capabilities that address the data integration needs of the finance industry. Now, let’s cover the healthcare industry, which also has a surging demand for data and analytics, along with the underlying processes to make it happen.