AWS, Data Lakes and SQL - Data Science Current

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning Blog

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases: AWS-specific knowledge search With Amazon Q Business, weve made internal data sources as well as public AWS content available in Field Advisors index.

AWS

AWS Database AI AI

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

At the heart of this transformation is the OMRON Data & Analytics Platform (ODAP), an innovative initiative designed to revolutionize how the company harnesses its data assets. Amazon AppFlow was used to facilitate the smooth and secure transfer of data from various sources into ODAP.

AWS

AWS Data Governance Data Silos SQL

AWS re:Invent 2024 Highlights: Top takeaways from Swami Sivasubramanian to help customers manage generative AI at scale

AWS Machine Learning Blog

DECEMBER 16, 2024

We spoke with Dr. Swami Sivasubramanian, Vice President of Data and AI, shortly after AWS re:Invent 2024 to hear his impressionsand to get insights on how the latest AWS innovations help meet the real-world needs of customers as they build and scale transformative generative AI applications. Canva uses AWS to power 1.2

AWS

AWS AI AI Data Warehouse

Webinars

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

How Rocket Companies modernized their data science solution on AWS

AWS Machine Learning Blog

FEBRUARY 21, 2025

Communication between the two systems was established through Kerberized Apache Livy (HTTPS) connections over AWS PrivateLink. Data exploration and model development were conducted using well-known machine learning (ML) tools such as Jupyter or Apache Zeppelin notebooks. This also led to a backlog of data that needed to be ingested.

Data Science

Data Science AWS Hadoop Data Scientist

Generate financial industry-specific insights using generative AI and in-context fine-tuning

AWS Machine Learning Blog

NOVEMBER 12, 2024

You may check out additional reference notebooks on aws-samples for how to use Meta’s Llama models hosted on Amazon Bedrock. You can implement these steps either from the AWS Management Console or using the latest version of the AWS Command Line Interface (AWS CLI).

SQL

SQL AWS AI AI

How Twilio generated SQL using Looker Modeling Language data with Amazon Bedrock

AWS Machine Learning Blog

AUGUST 8, 2024

As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications.

SQL

SQL Data Lakes Data Analyst AWS

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Flipboard

DECEMBER 18, 2023

Customers use Amazon Redshift as a key component of their data architecture to drive use cases from typical dashboarding to self-service analytics, real-time analytics, machine learning (ML), data sharing and monetization, and more.

AWS

AWS Data Warehouse ETL SQL

Integrating AWS Data Lake and RDS MS SQL: A Guide to Writing and Retrieving Data Securely

Dataversity

MARCH 26, 2024

Writing data to an AWS data lake and retrieving it to populate an AWS RDS MS SQL database involves several AWS services and a sequence of steps for data transfer and transformation.

Data Lakes

Data Lakes SQL AWS ETL

Build a robust text-to-SQL solution generating complex queries, self-correcting, and querying diverse data sources

AWS Machine Learning Blog

FEBRUARY 28, 2024

Structured Query Language (SQL) is a complex language that requires an understanding of databases and metadata. Today, generative AI can enable people without SQL knowledge. This generative AI task is called text-to-SQL, which generates SQL queries from natural language processing (NLP) and converts text into semantically correct SQL.

SQL

SQL AWS Database ML

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Flipboard

DECEMBER 11, 2024

Many of these applications are complex to build because they require collaboration across teams and the integration of data, tools, and services. Data engineers use data warehouses, data lakes, and analytics tools to load, transform, clean, and aggregate data. Choose Create VPC.

SQL

SQL AWS Data Lakes AI

Imperva optimizes SQL generation from natural language using Amazon Bedrock

AWS Machine Learning Blog

JUNE 20, 2024

Our goal was to improve the user experience of an existing application used to explore the counters and insights data. The data is stored in a data lake and retrieved by SQL using Amazon Athena. The following figure shows a search query that was translated to SQL and run. The challenge is to assure quality.

SQL

SQL Database AWS Machine Learning

Harmonize data using AWS Glue and AWS Lake Formation FindMatches ML to build a customer 360 view

Flipboard

JUNE 26, 2023

Companies are faced with the daunting task of ingesting all this data, cleansing it, and using it to provide outstanding customer experience. Typically, companies ingest data from multiple sources into their data lake to derive valuable insights from the data. Run the AWS Glue ML transform job.

AWS

AWS ML ML ETL

How Northpower used computer vision with AWS to automate safety inspection risk assessments

AWS Machine Learning Blog

SEPTEMBER 27, 2024

Amazon Simple Storage Service (Amazon S3) stores the model artifacts and creates a data lake to host the inference output, document analysis output, and other datasets in CSV format. About the authors Scott Patterson is a Senior Solutions Architect at AWS. The sunburst graph below is a visualization of this classification.

AWS

AWS Data Lakes ML ML

Data-Centric Firms Address Athena Shortcomings with Smart Indexing

Smart Data Collective

FEBRUARY 23, 2022

Traditional relational databases provide certain benefits, but they are not suitable to handle big and various data. That is when data lake products started gaining popularity, and since then, more companies introduced lake solutions as part of their data infrastructure. AWS Athena and S3. Limits of Athena.

Data Lakes

Data Lakes AWS SQL Big Data

Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

AWS Machine Learning Blog

JUNE 5, 2023

To simplify access to Parquet files, Amazon SageMaker Canvas has added data import capabilities from over 40 data sources , including Amazon Athena , which supports Apache Parquet. Canvas provides connectors to AWS data sources such as Amazon Simple Storage Service (Amazon S3), Athena, and Amazon Redshift.

Machine Learning

Machine Learning Machine Learning AWS Data Lakes

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

AUGUST 21, 2023

You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, purification, exploration, visualization, and processing at scale) within a single visual interface.

AWS

AWS Data Lakes Clustering Data Preparation

Generating value from enterprise data: Best practices for Text2SQL and generative AI

AWS Machine Learning Blog

JANUARY 4, 2024

One such area that is evolving is using natural language processing (NLP) to unlock new opportunities for accessing data through intuitive SQL queries. Instead of dealing with complex technical code, business users and data analysts can ask questions related to data and insights in plain language.

SQL

SQL Database AI AI

Cloud Data Science News Beta #1

Data Science 101

NOVEMBER 11, 2019

Azure Synapse Analytics This is the future of data warehousing. It combines data warehousing and data lakes into a simple query interface for a simple and fast analytics service. SQL Server 2019 SQL Server 2019 went Generally Available. It can be used to do distributed Machine Learning on AWS.

Cloud Data

Cloud Data Data Science Azure Clustering

Essential data engineering tools for 2023: Empowering for management and analysis

Data Science Dojo

JULY 6, 2023

Amazon Redshift: Amazon Redshift is a cloud-based data warehousing service provided by Amazon Web Services (AWS). Amazon Redshift allows data engineers to analyze large datasets quickly using massively parallel processing (MPP) architecture. It offers extensibility and integration with various data engineering tools.

Data Engineering

Data Engineering Data Engineer Data Engineering Data Engineering

Approaches to migrating your VMware workloads to AWS

IBM Journey to AI blog

JUNE 11, 2024

Further to the acquisition, Broadcom decided to discontinue (link resides outside ibm.com) its AWS authorization to resell VMware Cloud on AWS as of 30 April 2024. As a result, AWS will no longer be able to offer new subscriptions or additional services.

AWS

AWS AI AI Data Lakes

How Marubeni is optimizing market decisions using AWS machine learning and analytics

AWS Machine Learning Blog

MARCH 8, 2023

This solution helps market analysts design and perform data-driven bidding strategies optimized for power asset profitability. In this post, you will learn how Marubeni is optimizing market decisions by using the broad set of AWS analytics and ML services, to build a robust and cost-effective Power Bid Optimization solution.

AWS

AWS Machine Learning Machine Learning Analytics

Tackling AI’s data challenges with IBM databases on AWS

IBM Journey to AI blog

MARCH 14, 2024

The solution: IBM databases on AWS To solve for these challenges, IBM’s portfolio of SaaS database solutions on Amazon Web Services (AWS), enables enterprises to scale applications, analytics and AI across the hybrid cloud landscape. Let’s delve into the database portfolio from IBM available on AWS. 

AWS

AWS Database ETL AI

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Alation

FEBRUARY 20, 2020

For many enterprises, a hybrid cloud data lake is no longer a trend, but becoming reality. Due to these needs, hybrid cloud data lakes emerged as a logical middle ground between the two consumption models. AWS Propelling Hybrid Cloud Environments. The Problem with Hybrid Cloud Environments.

Data Lakes

Data Lakes Cloud Data AWS Tableau

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lakes

Data Lakes Data Modeling Data Models Data Warehouse

How Q4 Inc. used Amazon Bedrock, RAG, and SQLDatabaseChain to address numerical and structured dataset challenges building their Q&A chatbot

Flipboard

DECEMBER 6, 2023

needed to address some of these challenges in one of their many AI use cases built on AWS. In this post, we discuss a Q&A bot use case that Q4 has implemented, the challenges that numerical and structured datasets presented, and how Q4 concluded that using SQL may be a viable solution.

SQL

SQL Database AWS Machine Learning

Data Science News from Microsoft Ignite 2019

Data Science 101

NOVEMBER 7, 2019

Azure Synapse Analytics can be seen as a merge of Azure SQL Data Warehouse and Azure Data Lake. Synapse allows one to use SQL to query petabytes of data, both relational and non-relational, with amazing speed. Azure Synapse. I think this announcement will have a very large and immediate impact.

Data Science

Data Science Azure SQL Machine Learning

Building an Effective OSS Management Layer for Your Data Lake

ODSC - Open Data Science

OCTOBER 13, 2024

Be sure to check out her talk, “ Don’t Go Over the Deep End: Building an Effective OSS Management Layer for Your Data Lake ,” there! Managing a data lake can often feel like being lost at sea — especially when dealing with both structured and unstructured data.

Data Lakes

Data Lakes Database Data Pipeline SQL

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Flipboard

AUGUST 17, 2023

Amazon Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the best price-performance at any scale. If you want to do the process in a low-code/no-code way, you can follow option C.

ML

ML ML AWS Data Warehouse

Reinventing the data experience: Use generative AI and modern data architecture to unlock insights

AWS Machine Learning Blog

JUNE 13, 2023

The natural language capabilities allow non-technical users to query data through conversational English rather than complex SQL. The AI and language models must identify the appropriate data sources, generate effective SQL queries, and produce coherent responses with embedded results at scale.

Database

Database SQL AWS AI

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

AWS Machine Learning Blog

MAY 31, 2024

With the Amazon Bedrock serverless experience, you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using the Amazon Web Services (AWS) tools without having to manage infrastructure. The following screenshot shows what the upload looks like when it’s complete.

AWS

AWS Machine Learning Machine Learning Database

Build an automated insight extraction framework for customer feedback analysis with Amazon Bedrock and Amazon QuickSight

AWS Machine Learning Blog

JUNE 25, 2024

In addition, the generative business intelligence (BI) capabilities of QuickSight allow you to ask questions about customer feedback using natural language, without the need to write SQL queries or learn a BI tool. In the first step, an AWS Lambda function reads and validates the file, and extracts the raw data.

AWS

AWS Natural Language Processing Machine Learning Machine Learning

Why Open Table Format Architecture is Essential for Modern Data Systems

phData

NOVEMBER 8, 2024

Note : Cloud Data warehouses like Snowflake and Big Query already have a default time travel feature. However, this feature becomes an absolute must-have if you are operating your analytics on top of your data lake or lakehouse. It can also be integrated into major data platforms like Snowflake. Contact phData Today!

Data Lakes

Data Lakes Data Warehouse Database Azure

Democratize ML on Salesforce Data Cloud with no-code Amazon SageMaker Canvas

AWS Machine Learning Blog

NOVEMBER 27, 2023

Configure the following scopes on your connected app: Manage user data via APIs ( api ). Perform ANSI SQL queries on Salesforce Data Cloud data (Data Cloud_query_api ). Manage Data Cloud profile data ( Data Cloud_profile_api ). Drag and drop the file, then choose Edit in SQL.

ML

ML ML AWS SQL

Generate actionable insights for predictive maintenance management with Amazon Monitron and Amazon Kinesis

AWS Machine Learning Blog

APRIL 18, 2023

It includes sensor devices to capture vibration and temperature data, a gateway device to securely transfer data to the AWS Cloud, the Amazon Monitron service that analyzes the data for anomalies with ML, and a companion mobile app to track potential failures in your machinery.

AWS

AWS ML ML Database

Pathways to optimize, migrate, and modernize your VMware workloads on AWS

IBM Journey to AI blog

JULY 15, 2024

Explore VMware Modernization Assessment In this blog, we will share how IBM Consulting can help organizations with a preference for AWS-based cloud-native technologies, leveraging the contemporary tools and modern cloud services that AWS has to offer.

AWS

AWS AI AI Data Lakes

Accelerating query performance with watsonx.data Presto C++ and Intel Sapphire Rapid Processor on AWS

IBM Journey to AI blog

OCTOBER 15, 2024

IBM watsonx.data is a hybrid, governed data lake house optimized for data, analytics and AI workloads. Additionally, watsonx.data provides a flexible approach and a unified view of your data across hybrid cloud environments. Key highlights include driving business analytics with engines like Presto and Spark.

AWS

AWS Data Lakes EDA Database

Top 5 Data Warehouses to Supercharge Your Big Data Strategy

Women in Big Data

NOVEMBER 27, 2024

How to Choose a Data Warehouse for Your Big Data Choosing a data warehouse for big data storage necessitates a thorough assessment of your unique requirements. Begin by determining your data volume, variety, and the performance expectations for querying and reporting.

Data Warehouse

Data Warehouse Big Data Big Data Azure

40 Must-Know Data Science Skills and Frameworks for 2023

ODSC - Open Data Science

FEBRUARY 2, 2023

Computer Science and Computer Engineering Similar to knowing statistics and math, a data scientist should know the fundamentals of computer science as well. While knowing Python, R, and SQL are expected, you’ll need to go beyond that. Big Data As datasets become larger and more complex, knowing how to work with them will be key.

Data Science

Data Science Data Scientist Computer Science Computer Science

Adopting & Scaling AI, a Beginner’s Guide to Prompt Engineering, and Pretraining Large Language…

ODSC - Open Data Science

JULY 27, 2023

Choosing a Data Lake Format: What to Actually Look For The differences between many data lake products today might not matter as much as you think. When choosing a data lake, here’s something else to consider. When choosing a data lake, here’s something else to consider.

Data Lakes

Data Lakes SQL AI AI

Use the Amazon SageMaker and Salesforce Data Cloud integration to power your Salesforce apps with AI/ML

AWS Machine Learning Blog

AUGUST 4, 2023

The following steps give an overview of how to use the new capabilities launched in SageMaker for Salesforce to enable the overall integration: Set up the Amazon SageMaker Studio domain and OAuth between Salesforce and the AWS account s. The endpoint will be exposed to Salesforce Data Cloud as an API through API Gateway.

ML

ML ML AWS AI

Alation Announces 2021.4 Release: Interview on Column-Level Lineage with Jason Ma, Senior Director of Product Management

Alation

NOVEMBER 18, 2021

External Tables Create a Shared View of the Data Lake. We’ve seen external tables become popular with our customers, who use them to provide a normalized relational schema on top of their data lake. Essentially, external tables create a shared view of the data lake, a single pane of glass everyone can reference.

Data Lakes

Data Lakes Data Governance SQL AWS

How to Create Iceberg Tables in Snowflake

phData

MARCH 22, 2024

Snowflake-managed Iceberg table’s performance is at par with Snowflake native tables while storing the data in public cloud storage. They are Ideal for situations where the data is already stored in data lakes and do not intend to load into Snowflake but need to use the features and performance of Snowflake.

SQL

SQL AWS Database Data Lakes

Big Data vs. Data Science: Demystifying the Buzzwords

Pickl AI

APRIL 21, 2025

Key Skills for Data Science: A data scientist typically needs a blend of skills: Mathematics and Statistics: To understand the theoretical underpinnings of models. Programming: Often in languages like Python or R, using libraries for data manipulation, analysis, and machine learning.

Big Data

Big Data Big Data Data Science Machine Learning

Data platform trinity: Competitive or complementary?

IBM Journey to AI blog

JANUARY 18, 2023

In another decade, the internet and mobile started the generate data of unforeseen volume, variety and velocity. It required a different data platform solution. Hence, Data Lake emerged, which handles unstructured and structured data with huge volume. Data fabric: A mostly new architecture.

Data Lakes

Data Lakes Data Warehouse Azure Apache Hadoop

How AWS sales uses Amazon Q Business for customer engagement

Shaping the future: OMRON’s data-driven journey with AWS

Webinars

Trending Sources

AWS re:Invent 2024 Highlights: Top takeaways from Swami Sivasubramanian to help customers manage generative AI at scale

Webinars

How Rocket Companies modernized their data science solution on AWS

Generate financial industry-specific insights using generative AI and in-context fine-tuning

How Twilio generated SQL using Looker Modeling Language data with Amazon Bedrock

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Integrating AWS Data Lake and RDS MS SQL: A Guide to Writing and Retrieving Data Securely

Build a robust text-to-SQL solution generating complex queries, self-correcting, and querying diverse data sources

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Imperva optimizes SQL generation from natural language using Amazon Bedrock

Harmonize data using AWS Glue and AWS Lake Formation FindMatches ML to build a customer 360 view

How Northpower used computer vision with AWS to automate safety inspection risk assessments

Data-Centric Firms Address Athena Shortcomings with Smart Indexing

Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

Generating value from enterprise data: Best practices for Text2SQL and generative AI

Cloud Data Science News Beta #1

Essential data engineering tools for 2023: Empowering for management and analysis

Approaches to migrating your VMware workloads to AWS

How Marubeni is optimizing market decisions using AWS machine learning and analytics

Tackling AI’s data challenges with IBM databases on AWS

Driving Business Value and ROI from a Hybrid Cloud Data Lake

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

How Q4 Inc. used Amazon Bedrock, RAG, and SQLDatabaseChain to address numerical and structured dataset challenges building their Q&A chatbot

Data Science News from Microsoft Ignite 2019

Building an Effective OSS Management Layer for Your Data Lake

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Reinventing the data experience: Use generative AI and modern data architecture to unlock insights

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

Build an automated insight extraction framework for customer feedback analysis with Amazon Bedrock and Amazon QuickSight

Why Open Table Format Architecture is Essential for Modern Data Systems

Democratize ML on Salesforce Data Cloud with no-code Amazon SageMaker Canvas

Generate actionable insights for predictive maintenance management with Amazon Monitron and Amazon Kinesis

Pathways to optimize, migrate, and modernize your VMware workloads on AWS

Accelerating query performance with watsonx.data Presto C++ and Intel Sapphire Rapid Processor on AWS

Top 5 Data Warehouses to Supercharge Your Big Data Strategy

40 Must-Know Data Science Skills and Frameworks for 2023

Adopting & Scaling AI, a Beginner’s Guide to Prompt Engineering, and Pretraining Large Language…

Use the Amazon SageMaker and Salesforce Data Cloud integration to power your Salesforce apps with AI/ML

Alation Announces 2021.4 Release: Interview on Column-Level Lineage with Jason Ma, Senior Director of Product Management

How to Create Iceberg Tables in Snowflake

Big Data vs. Data Science: Demystifying the Buzzwords

Data platform trinity: Competitive or complementary?

Stay Connected