Artificial Intelligence, Data Warehouse and SQL

10 essential SQL concepts for data scientists: Tips and examples

Data Science Dojo

APRIL 25, 2023

SQL (Structured Query Language) is an important tool for data scientists. It is a programming language used to manipulate data stored in relational databases. Mastering SQL concepts allows a data scientist to quickly analyze large amounts of data and make decisions based on their findings.

Data Scientist

Data Scientist SQL Machine Learning Machine Learning

Explore data with ease: Use SQL and Text-to-SQL in Amazon SageMaker Studio JupyterLab notebooks

AWS Machine Learning Blog

APRIL 16, 2024

In the process of working on their ML tasks, data scientists typically start their workflow by discovering relevant data sources and connecting to them. They then use SQL to explore, analyze, visualize, and integrate data from various sources before using it in their ML training and inference.

SQL

SQL AWS Database Data Scientist

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Flipboard

DECEMBER 18, 2023

Amazon Redshift powers data-driven decisions for tens of thousands of customers every day with a fully managed, AI-powered cloud data warehouse, delivering the best price-performance for your analytics workloads. Learn more about the AWS zero-ETL future with newly launched AWS database integrations with Amazon Redshift.

AWS

AWS Data Warehouse ETL SQL

Why companies need to accelerate data warehousing solution modernization

IBM Journey to AI blog

APRIL 24, 2023

Data is reported from one central repository, enabling management to draw more meaningful business insights and make faster, better decisions. By running reports on historical data, a data warehouse can clarify what systems and processes are working and what methods need improvement.

Data Warehouse

Data Warehouse Data Lakes Database Big Data

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

OMRON’s data strategy, represented on ODAP, also allowed the organization to unlock generative AI use cases focused on tangible business outcomes and enhanced productivity. This tool democratizes data access across the organization, enabling even nontechnical users to gain valuable insights.

AWS

AWS Data Governance Data Silos SQL

How Q4 Inc. used Amazon Bedrock, RAG, and SQLDatabaseChain to address numerical and structured dataset challenges building their Q&A chatbot

Flipboard

DECEMBER 6, 2023

In this post, we discuss a Q&A bot use case that Q4 has implemented, the challenges that numerical and structured datasets presented, and how Q4 concluded that using SQL may be a viable solution. RAG with semantic search – Conventional RAG with semantic search was the last step before moving to SQL generation.

SQL

SQL Database AWS Machine Learning

Securing the data pipeline, from blockchain to AI

Dataconomy

OCTOBER 8, 2024

Generative artificial intelligence is the talk of the town in the technology world today. Space and Time’s creator SxT Labs has created three technologies that underpin its verifiable compute layer, including a blockchain indexer, a distributed data warehouse and a zero-knowledge coprocessor.

Data Pipeline

Data Pipeline AI AI Data Warehouse

Data Version Control for Data Lakes: Handling the Changes in Large Scale

ODSC - Open Data Science

SEPTEMBER 27, 2023

In this article, we will delve into the concept of data lakes, explore their differences from data warehouses and relational databases, and discuss the significance of data version control in the context of large-scale data management. Schema Enforcement: Data warehouses use a “schema-on-write” approach.

Data Lakes

Data Lakes Data Warehouse Database Big Data

Unlock the power of structured data for enterprises using natural language with Amazon Q Business

AWS Machine Learning Blog

AUGUST 20, 2024

One of the most common applications of generative artificial intelligence (AI) and large language models (LLMs) in an enterprise environment is answering questions based on the enterprise’s knowledge corpus. Natural language is ambiguous and imprecise, whereas data adheres to rigid schemas.

SQL

SQL AWS Database Natural Language Processing

Sneak peek at Microsoft Fabric price and its promising features

Dataconomy

JUNE 1, 2023

Unified data storage: Fabric’s centralized data lake, Microsoft OneLake, eliminates data silos and provides a unified storage system, simplifying data access and retrieval. Flexible compute capacity: One of the key advantages of Microsoft Fabric is its ability to optimize compute capacity across different workloads.

Power BI

Power BI Data Lakes Azure Data Silos

How Dataiku and Snowflake Strengthen the Modern Data Stack

phData

NOVEMBER 4, 2024

Its cloud-native architecture, combined with robust data-sharing capabilities, allows businesses to easily leverage cutting-edge tools from partners like Dataiku, fostering innovation and driving more insightful, data-driven outcomes. Dataiku and Snowflake: A Good Combo?

Machine Learning

Machine Learning Machine Learning Data Science ML

Reinventing the data experience: Use generative AI and modern data architecture to unlock insights

AWS Machine Learning Blog

JUNE 13, 2023

The natural language capabilities allow non-technical users to query data through conversational English rather than complex SQL. The AI and language models must identify the appropriate data sources, generate effective SQL queries, and produce coherent responses with embedded results at scale.

Database

Database SQL AWS AI

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snorkel AI

MAY 26, 2023

Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake, gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022. Welcome everybody.

SQL

SQL ML ML Python

Microsoft secures your place in the world of business

Dataconomy

SEPTEMBER 27, 2023

The Microsoft Certified Solutions Associate and Microsoft Certified Solutions Expert certifications cover a wide range of topics related to Microsoft’s technology suite, including Windows operating systems, Azure cloud computing, Office productivity software, Visual Studio programming tools, and SQL Server databases.

Database Administration

Database Administration Database Azure Cloud Computing

Celebrating 40 years of Db2: Running the world’s mission critical workloads

IBM Journey to AI blog

SEPTEMBER 11, 2023

Codd published his famous paper “A Relational Model of Data for Large Shared Data Banks.” Boyce to create Structured Query Language (SQL). Developers can leverage features like REST APIs, JSON support and enhanced SQL compatibility to easily build cloud-native applications. Chamberlin and Raymond F.

Database

Database SQL Data Warehouse Machine Learning

IBM to help businesses scale AI workloads, for all data, anywhere

IBM Journey to AI blog

MAY 9, 2023

Watsonx.data will allow users to access their data through a single point of entry and run multiple fit-for-purpose query engines across IT environments. Through workload optimization an organization can reduce data warehouse costs by up to 50 percent by augmenting with this solution. [1]

Data Warehouse

Data Warehouse AWS AI AI

A Primer to Scaling Pandas

ODSC - Open Data Science

AUGUST 23, 2023

Run pandas at scale on your data warehouse: Most enterprise data teams store their data in a database or data warehouse, such as Snowflake, BigQuery, or DuckDB. Ponder solves this problem by translating your pandas code to SQL that can be understood by your data warehouse.

Data Warehouse

Data Warehouse Data Science Database SQL

Introduction to Power BI Datamarts

ODSC - Open Data Science

JUNE 12, 2023

They all agree that a Datamart is a subject-oriented subset of a data warehouse focusing on a particular business unit, department, subject area, or business functionality. The Datamart’s data is usually stored in databases containing a moving frame required for data analysis, not the full history of data.

Power BI

Power BI Data Warehouse ETL Data Preparation

Building an efficient MLOps platform with OSS tools on Amazon ECS with AWS Fargate

AWS Machine Learning Blog

SEPTEMBER 18, 2024

The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Additionally, Feast promotes feature reuse, so the time spent on data preparation is reduced greatly.

AWS

AWS Machine Learning Machine Learning ML

11 Open Source Data Exploration Tools You Need to Know in 2023

ODSC - Open Data Science

FEBRUARY 24, 2023

With Great Expectations, data teams can express what they “expect” from their data using simple assertions. Great Expectations provides support for different data backends such as flat file formats, SQL databases, Pandas dataframes and Spark, and comes with built-in notification and data documentation functionality.

Exploratory Data Analysis

Exploratory Data Analysis Data Visualization Data Analysis Data Analysis

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is known to have benefits in handling data due to its robustness, speed, and scalability. A typical modern data stack consists of the following: A data warehouse.

Data Warehouse

Data Warehouse ETL Tableau Cloud Data

How Thomson Reuters delivers personalized content subscription plans at scale using Amazon Personalize

AWS Machine Learning Blog

JANUARY 6, 2023

The rules in this engine were predefined and written in SQL, which, aside from posing a challenge to manage, also struggled to cope with the proliferation of data from TR’s various integrated data sources. TR customer data is changing at a faster rate than the business rules can evolve to reflect changing customer needs.

AWS

AWS Data Warehouse ML ML

Data science vs data analytics: Unpacking the differences

IBM Journey to AI blog

SEPTEMBER 19, 2023

Overview: Data science vs data analytics Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models and develop artificial intelligence (AI) applications.

Data Science

Data Science Analytics Analytics Data Scientist

Unleashing the power of Presto: The Uber case study

IBM Journey to AI blog

SEPTEMBER 25, 2023

Every day, millions of riders use the Uber app, unwittingly contributing to a complex web of data-driven decisions. This blog takes you on a journey into the world of Uber’s analytics and the critical role that Presto, the open source SQL query engine, plays in driving their success. What is Presto?

Data Lakes

Data Lakes Analytics Analytics Clustering

How to modernize data lakes with a data lakehouse architecture

IBM Journey to AI blog

JULY 5, 2023

Another unexpected challenge was the introduction of Spark as a processing framework for big data. It gained rapid popularity given its support for data transformations, streaming and SQL. But it never co-existed amicably within existing data lake environments.

Data Lakes

Data Lakes Data Warehouse Data Governance Analytics

Maximising Efficiency with ETL Data: Future Trends and Best Practices

Pickl AI

OCTOBER 17, 2024

Introduction: ETL plays a crucial role in Data Management. This process enables organisations to gather data from various sources, transform it into a usable format, and load it into data warehouses or databases for analysis. Loading: The transformed data is loaded into the target destination, such as a data warehouse.

ETL

ETL Data Warehouse Data Quality Data Governance

ETL Process Explained: Essential Steps for Effective Data Management

Pickl AI

OCTOBER 17, 2024

It is a data integration process that involves extracting data from various sources, transforming it into a suitable format, and loading it into a target system, typically a data warehouse. ETL is the backbone of effective data management, ensuring organisations can leverage their data for informed decision-making.

ETL

ETL Data Warehouse SQL Data Quality

Data platform trinity: Competitive or complementary?

IBM Journey to AI blog

JANUARY 18, 2023

They defined it as: “A data lakehouse is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data.”

Data Lakes

Data Lakes Data Warehouse Azure Apache Hadoop

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

It is supported by querying, governance, and open data formats to access and share data across the hybrid cloud. Through workload optimization across multiple query engines and storage tiers, organizations can reduce data warehouse costs by up to 50 percent.

AI

AI AI Machine Learning Machine Learning

The Ultimate Modern Data Stack Migration Guide

phData

JULY 18, 2023

With the birth of cloud data warehouses, data applications, and generative AI, processing large volumes of data faster and cheaper is more approachable and desired than ever. First up, let’s dive into the foundation of every Modern Data Stack, a cloud-based data warehouse.

Data Warehouse

Data Warehouse Analytics Analytics SQL

5 Benefits of BigQuery for Marketers

ODSC - Open Data Science

FEBRUARY 8, 2023

Common databases appear unable to cope with the immense increase in data volumes. This is where the BigQuery data warehouse comes into play. Building a data center on your own can be expensive, time-consuming, and difficult to scale. BigQuery for Marketing: What Makes it Special?

Database

Database Data Science Big Data Big Data

Db2 Warehouse delivers 4x faster query performance than previously, while cutting storage costs by 34x

IBM Journey to AI blog

JULY 11, 2023

Data warehouses are a critical component of any organization’s technology ecosystem. They provide the backbone for a range of use cases such as business intelligence (BI) reporting, dashboarding, and machine-learning (ML)-based predictive analytics that enable faster decision making and insights.

Data Warehouse

Data Warehouse Database Cloud Data Big Data

Top 50+ Data Analyst Interview Questions & Answers

Pickl AI

APRIL 26, 2024

This comprehensive blog outlines vital aspects of Data Analyst interviews, offering insights into technical, behavioural, and industry-specific questions. It covers essential topics such as SQL queries, data visualization, statistical analysis, machine learning concepts, and data manipulation techniques.

Data Analyst

Data Analyst Data Analysis Data Analysis Machine Learning

The Data Dilemma: Exploring the Key Differences Between Data Science and Data Engineering

Pickl AI

JULY 25, 2023

Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Data Visualization: Matplotlib, Seaborn, Tableau, etc.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Transitioning off Amazon Lookout for Metrics

AWS Machine Learning Blog

OCTOBER 9, 2024

To start using CloudWatch anomaly detection, you first must ingest data into CloudWatch and then enable anomaly detection on the log group. Using Amazon Redshift ML for anomaly detection: Amazon Redshift ML makes it easy to create, train, and apply machine learning models using familiar SQL commands in Amazon Redshift data warehouses.

AWS

AWS ML ML Data Quality

The Big Lie about Data

DataSeries

JANUARY 18, 2023

Classical data systems are founded on this story. Nonetheless, the truth is slowly starting to emerge… The value of data is not in insights: Most dashboards fail to provide useful insights and quickly become derelict. We increasingly refer to these technologies collectively as Artificial Intelligence (AI).

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Data Warehouse Deep Learning

Data democratization: How data architecture can drive business decisions and AI initiatives

IBM Journey to AI blog

AUGUST 4, 2023

It’s distributed both in the cloud and on-premises, allowing extensive use and movement across clouds, apps and networks, as well as stores of data at rest. An architecture designed for data democratization aims to be flexible, integrated, agile and secure to enable the use of data and artificial intelligence (AI) at scale.

Data Lakes

Data Lakes AI AI Data Governance

Data science vs. machine learning: What’s the difference?

IBM Journey to AI blog

JULY 6, 2023

Data science solves a business problem by understanding the problem, knowing the data that’s required, and analyzing the data to help solve the real-world problem. Machine learning (ML) is a subset of artificial intelligence (AI) that focuses on learning from what the data science comes up with.

Machine Learning

Machine Learning Machine Learning Data Science Big Data

Build an automated insight extraction framework for customer feedback analysis with Amazon Bedrock and Amazon QuickSight

AWS Machine Learning Blog

JUNE 25, 2024

Amazon Bedrock, a fully managed service designed to facilitate the integration of LLMs into enterprise applications, offers a choice of high-performing LLMs from leading artificial intelligence (AI) companies like Anthropic, Mistral AI, Meta, and Amazon through a single API. The LLM generates output based on the user prompt.

AWS

AWS Natural Language Processing Machine Learning Machine Learning

AI that’s ready for business starts with data that’s ready for AI

IBM Journey to AI blog

JULY 3, 2024

This includes integration with your data warehouse engines, which now must balance real-time data processing and decision-making with cost-effective object storage, open source technologies and a shared metadata layer to share data seamlessly with your data lakehouse.

AI

AI AI Data Quality Database

Tackling AI’s data challenges with IBM databases on AWS

IBM Journey to AI blog

MARCH 14, 2024

Businesses face significant hurdles when preparing data for artificial intelligence (AI) applications. The existence of data silos and duplication, alongside apprehensions regarding data quality, presents a multifaceted environment for organizations to manage.

AWS

AWS Database ETL AI

Big Data Syllabus: A Comprehensive Overview

Pickl AI

AUGUST 9, 2024

Understanding the differences between SQL and NoSQL databases is crucial for students. Data Warehousing Solutions: Tools like Amazon Redshift, Google BigQuery, and Snowflake enable organisations to store and analyse large volumes of data efficiently. Understanding the benefits and challenges of cloud storage is crucial.

Big Data

Big Data Big Data Big Data Analytics Big Data Analytics

Your Complete Roadmap to Become an Azure Data Scientist

Pickl AI

SEPTEMBER 5, 2024

However, a master’s degree or specialised Data Science or Machine Learning courses can give you a competitive edge, offering advanced knowledge and practical experience. Essential Technical Skills: Technical proficiency is at the heart of an Azure Data Scientist’s role.

Azure

Azure Data Scientist Data Science Machine Learning
