AWS, Data Lakes and Database - Data Science Current

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning Blog

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases: AWS-specific knowledge search: With Amazon Q Business, we've made internal data sources as well as public AWS content available in Field Advisor's index.

AWS

AWS Database AI AI

Architect a mature generative AI foundation on AWS

Flipboard

MAY 30, 2025

Scaling and load balancing: The gateway can handle load balancing across different servers, model instances, or AWS Regions so that applications remain responsive. The AWS Solutions Library offers solution guidance to set up a multi-provider generative AI gateway.

AWS

AWS AI AI Database
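The load-balancing idea in the excerpt above can be illustrated with a minimal sketch. It assumes plain round-robin rotation, which the excerpt does not specify; the class name, the `route` method, and the Region names used as endpoints are hypothetical and are not taken from the AWS Solutions Library guidance.

```python
from itertools import cycle


class RoundRobinGateway:
    """Toy gateway that rotates requests across a fixed set of endpoints."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        # cycle() repeats the endpoints in order, indefinitely.
        self._endpoints = cycle(endpoints)

    def route(self, prompt):
        """Return the (endpoint, prompt) pair chosen for this request."""
        endpoint = next(self._endpoints)
        return endpoint, prompt


# Hypothetical usage: two Regions standing in for model endpoints.
gateway = RoundRobinGateway(["us-east-1", "eu-west-1"])
chosen = [gateway.route("hello")[0] for _ in range(3)]
```

A production gateway would more likely weigh health checks, latency, and quotas; round-robin is used here only because it is the simplest strategy to show.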

Setting up Data Lake on GCP using Cloud Storage and BigQuery

Analytics Vidhya

FEBRUARY 25, 2023

Introduction: A data lake is a centralized and scalable repository storing structured and unstructured data. The need for a data lake arises from the growing volume, variety, and velocity of data companies need to manage and analyze.

Data Lakes

Data Lakes Analytics Analytics Data Warehouse


AWS re:Invent 2024 Highlights: Top takeaways from Swami Sivasubramanian to help customers manage generative AI at scale

AWS Machine Learning Blog

DECEMBER 16, 2024

We spoke with Dr. Swami Sivasubramanian, Vice President of Data and AI, shortly after AWS re:Invent 2024 to hear his impressions and to get insights on how the latest AWS innovations help meet the real-world needs of customers as they build and scale transformative generative AI applications. Canva uses AWS to power 1.2

AWS

AWS AI AI Data Warehouse

Search enterprise data assets using LLMs backed by knowledge graphs

Flipboard

NOVEMBER 27, 2024

In the context of enterprise data asset search powered by a metadata catalog hosted on services such as Amazon DataZone, AWS Glue, and other third-party catalogs, knowledge graphs can help integrate this linked data and also enable a scalable search paradigm that integrates metadata that evolves over time. raw_customer".

AWS

AWS Database ML ML

Build a financial research assistant using Amazon Q Business and Amazon QuickSight for generative AI–powered insights

Flipboard

MAY 14, 2025

Their information is split between two types of data: unstructured data (such as PDFs, HTML pages, and documents) and structured data (such as databases, data lakes, and real-time reports). Different types of data typically require different tools to access them.

AWS

AWS AI AI Database

Governing ML lifecycle at scale: Best practices to set up cost and usage visibility of ML workloads in multi-account environments

AWS Machine Learning Blog

NOVEMBER 14, 2024

For a multi-account environment, you can track costs at an AWS account level to associate expenses. A combination of an AWS account and tags provides the best results. Implement a tagging strategy: A tag is a label you assign to an AWS resource. Tags with the AWS reserved prefix aws: provide additional metadata tracked by AWS.

ML

ML ML AWS Machine Learning
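To make the "account plus tags" idea from the excerpt above concrete, here is a small sketch that groups cost records by AWS account and by the value of one tag. The records, account IDs, the `project` tag key, and the amounts are all made up for illustration; a real setup would read this data from a billing export rather than from a hard-coded list.

```python
from collections import defaultdict

# Hypothetical cost records; none of these values come from the article.
COST_RECORDS = [
    {"account": "111111111111", "tags": {"project": "churn-model"}, "usd": 120.0},
    {"account": "111111111111", "tags": {"project": "forecasting"}, "usd": 80.0},
    {"account": "222222222222", "tags": {"project": "churn-model"}, "usd": 45.5},
    {"account": "222222222222", "tags": {}, "usd": 10.0},
]


def cost_by_account_and_tag(records, tag_key):
    """Sum cost per (account, tag value); untagged resources get their own bucket."""
    totals = defaultdict(float)
    for record in records:
        tag_value = record["tags"].get(tag_key, "untagged")
        totals[(record["account"], tag_value)] += record["usd"]
    return dict(totals)


totals = cost_by_account_and_tag(COST_RECORDS, "project")
```

Keeping an explicit "untagged" bucket makes gaps in the tagging strategy visible instead of silently dropping those costs.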

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Flipboard

DECEMBER 18, 2023

Customers use Amazon Redshift as a key component of their data architecture to drive use cases from typical dashboarding to self-service analytics, real-time analytics, machine learning (ML), data sharing and monetization, and more.

AWS

AWS Data Warehouse ETL SQL

10 Things AWS Can Do for Your SaaS Company

Smart Data Collective

FEBRUARY 20, 2022

AWS (Amazon Web Services), the comprehensive and evolving cloud computing platform provided by Amazon, is comprised of infrastructure as a service (IaaS), platform as a service (PaaS) and packaged software as a service (SaaS). With its wide array of tools and convenience, AWS has already become a popular choice for many SaaS companies.

AWS

AWS Cloud Computing Data Lakes Database

Query structured data from Amazon Q Business using Amazon QuickSight integration

AWS Machine Learning Blog

DECEMBER 3, 2024

Although generative AI is fueling transformative innovations, enterprises may still experience sharply divided data silos when it comes to enterprise knowledge, in particular between unstructured content (such as PDFs, Word documents, and HTML pages), and structured data (real-time data and reports stored in databases or data lakes).

AWS

AWS Database Data Silos Data Lakes

Unstructured data management and governance using AWS AI/ML and analytics services

Flipboard

OCTOBER 25, 2023

However, with the help of AI and machine learning (ML), new software tools are now available to unearth the value of unstructured data. In this post, we discuss how AWS can help you successfully address the challenges of extracting insights from unstructured data. These services write the output to a data lake.

AWS

AWS ML ML Analytics

Tackling AI’s data challenges with IBM databases on AWS

IBM Journey to AI blog

MARCH 14, 2024

The existence of data silos and duplication, alongside apprehensions regarding data quality, presents a multifaceted environment for organizations to manage. Also, traditional database management tasks, including backups, upgrades and routine maintenance drain valuable time and resources, hindering innovation.

AWS

AWS Database ETL AI

Integrating AWS Data Lake and RDS MS SQL: A Guide to Writing and Retrieving Data Securely

Dataversity

MARCH 26, 2024

Writing data to an AWS data lake and retrieving it to populate an AWS RDS MS SQL database involves several AWS services and a sequence of steps for data transfer and transformation.

Data Lakes

Data Lakes SQL AWS ETL

Harmonize data using AWS Glue and AWS Lake Formation FindMatches ML to build a customer 360 view

Flipboard

JUNE 26, 2023

Companies are faced with the daunting task of ingesting all this data, cleansing it, and using it to provide outstanding customer experience. Typically, companies ingest data from multiple sources into their data lake to derive valuable insights from the data. Run the AWS Glue ML transform job.

AWS

AWS ML ML ETL

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Flipboard

DECEMBER 11, 2024

Many of these applications are complex to build because they require collaboration across teams and the integration of data, tools, and services. Data engineers use data warehouses, data lakes, and analytics tools to load, transform, clean, and aggregate data. Choose Create VPC.

SQL

SQL AWS Data Lakes AI

Generative AI operating models in enterprise organizations with Amazon Bedrock

AWS Machine Learning Blog

JANUARY 29, 2025

Large organizations often have many business units with multiple lines of business (LOBs), with a central governing entity, and typically use AWS Organizations with an Amazon Web Services (AWS) multi-account strategy. LOBs have autonomy over their AI workflows, models, and data within their respective AWS accounts.

AWS

AWS AI AI Database

Your guide to generative AI and ML at AWS re:Invent 2023

AWS Machine Learning Blog

NOVEMBER 22, 2023

Yes, the AWS re:Invent season is upon us and as always, the place to be is Las Vegas! are the sessions dedicated to AWS DeepRacer! Generative AI is at the heart of the AWS Village this year. You marked your calendars, you booked your hotel, and you even purchased the airfare. And last but not least (and always fun!)

AWS

AWS ML ML AI

AWS empowers sales teams using generative AI solution built on Amazon Bedrock

AWS Machine Learning Blog

AUGUST 26, 2024

At AWS, we are transforming our seller and customer journeys by using generative artificial intelligence (AI) across the sales lifecycle. It will be able to answer questions, generate content, and facilitate bidirectional interactions, all while continuously using internal AWS and external data to deliver timely, personalized insights.

AWS

AWS AI AI K-nearest Neighbors

Generate financial industry-specific insights using generative AI and in-context fine-tuning

AWS Machine Learning Blog

NOVEMBER 12, 2024

You may check out additional reference notebooks on aws-samples for how to use Meta’s Llama models hosted on Amazon Bedrock. You can implement these steps either from the AWS Management Console or using the latest version of the AWS Command Line Interface (AWS CLI). Solutions Architect at AWS. Varun Mehta is a Sr.

SQL

SQL AWS AI AI

Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

AWS Machine Learning Blog

JUNE 5, 2023

To simplify access to Parquet files, Amazon SageMaker Canvas has added data import capabilities from over 40 data sources, including Amazon Athena, which supports Apache Parquet. Canvas provides connectors to AWS data sources such as Amazon Simple Storage Service (Amazon S3), Athena, and Amazon Redshift.

Machine Learning

Machine Learning Machine Learning AWS Data Lakes

How Twilio generated SQL using Looker Modeling Language data with Amazon Bedrock

AWS Machine Learning Blog

AUGUST 8, 2024

As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications. The following diagram illustrates the solution architecture.

SQL

SQL Data Lakes Data Analyst AWS

Data-Centric Firms Address Athena Shortcomings with Smart Indexing

Smart Data Collective

FEBRUARY 23, 2022

Traditional relational databases provide certain benefits, but they are not well suited to handling large and varied data. That is when data lake products started gaining popularity, and since then, more companies introduced lake solutions as part of their data infrastructure. AWS Athena and S3.

Data Lakes

Data Lakes AWS SQL Big Data

Data Warehouse vs. Data Lake

Precisely

MARCH 9, 2023

Data warehouse vs. data lake: each has its own unique advantages and disadvantages, and it’s helpful to understand their similarities and differences. In this article, we’ll focus on a data lake vs. data warehouse. It is often used as a foundation for enterprise data lakes.

Data Warehouse

Data Warehouse Data Lakes Hadoop Big Data

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

AUGUST 21, 2023

You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, purification, exploration, visualization, and processing at scale) within a single visual interface.

AWS

AWS Data Lakes Clustering Data Preparation

Retrieval-Augmented Generation with LangChain, Amazon SageMaker JumpStart, and MongoDB Atlas semantic search

Flipboard

NOVEMBER 17, 2023

Generative AI models have the potential to revolutionize enterprise operations, but businesses must carefully consider how to harness their power while overcoming challenges such as safeguarding data and ensuring the quality of AI-generated content. Set up the database access and network access. As always, AWS welcomes feedback.

K-nearest Neighbors

K-nearest Neighbors AWS Clustering Database

Principal Financial Group uses AWS Post Call Analytics solution to extract omnichannel customer insights

AWS Machine Learning Blog

NOVEMBER 15, 2023

They are processing data across channels, including recorded contact center interactions, emails, chat and other digital channels. Solution requirements: Principal provides investment services through Genesys Cloud CX, a cloud-based contact center that provides powerful, native integrations with AWS.

AWS

AWS Analytics Analytics ML

Approaches to migrating your VMware workloads to AWS

IBM Journey to AI blog

JUNE 11, 2024

Further to the acquisition, Broadcom decided to discontinue (link resides outside ibm.com) its AWS authorization to resell VMware Cloud on AWS as of 30 April 2024. As a result, AWS will no longer be able to offer new subscriptions or additional services.

AWS

AWS AI AI Data Lakes

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

AWS Machine Learning Blog

AUGUST 21, 2024

Amazon DataZone is a data management service that makes it quick and convenient to catalog, discover, share, and govern data stored in AWS, on-premises, and third-party sources. An Amazon DataZone domain and an associated Amazon DataZone project configured in your AWS account.

Machine Learning

Machine Learning Machine Learning Data Governance ML

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lakes

Data Lakes Data Modeling Data Models Data Warehouse

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

Flipboard

NOVEMBER 24, 2023

In this post, we will talk about how BMW Group, in collaboration with AWS Professional Services, built its Jupyter Managed (JuMa) service to address these challenges. For example, teams using these platforms missed an easy migration of their AI/ML prototypes to the industrialization of the solution running on AWS.

ML

ML ML AWS AI

Build a robust text-to-SQL solution generating complex queries, self-correcting, and querying diverse data sources

AWS Machine Learning Blog

FEBRUARY 28, 2024

Structured Query Language (SQL) is a complex language that requires an understanding of databases and metadata. The solution in this post aims to bring enterprise analytics operations to the next level by shortening the path to your data using natural language. Today, generative AI can enable people without SQL knowledge.

SQL

SQL AWS Database ML

Insights in implementing production-ready solutions with generative AI

AWS Machine Learning Blog

APRIL 30, 2025

This post explores key insights and lessons learned from AWS customers in Europe, Middle East, and Africa (EMEA) who have successfully navigated this transition, providing a roadmap for others looking to follow suit. Il Sole 24 Ore leveraged its vast internal knowledge with a Retrieval Augmented Generation (RAG) solution powered by AWS.

AWS

AWS AI AI Machine Learning

Building an Effective OSS Management Layer for Your Data Lake

ODSC - Open Data Science

OCTOBER 13, 2024

Be sure to check out her talk, “Don’t Go Over the Deep End: Building an Effective OSS Management Layer for Your Data Lake,” there! Managing a data lake can often feel like being lost at sea, especially when dealing with both structured and unstructured data.

Data Lakes

Data Lakes Database Data Pipeline SQL

Why Open Table Format Architecture is Essential for Modern Data Systems

phData

NOVEMBER 8, 2024

Note: Cloud data warehouses like Snowflake and BigQuery already have a default time travel feature. However, this feature becomes an absolute must-have if you are operating your analytics on top of your data lake or lakehouse. It can also be integrated into major data platforms like Snowflake. Contact phData Today!

Data Lakes

Data Lakes Data Warehouse Database Azure

Improving air quality with generative AI

AWS Machine Learning Blog

JUNE 18, 2024

On December 6th-8th, 2023, the non-profit organization Tech to the Rescue, in collaboration with AWS, organized the world’s largest Air Quality Hackathon, aimed at tackling one of the world’s most pressing health and environmental challenges, air pollution. This allows for data to be aggregated for further manufacturer-agnostic analysis.

AWS

AWS AI AI Python

Build an automated insight extraction framework for customer feedback analysis with Amazon Bedrock and Amazon QuickSight

AWS Machine Learning Blog

JUNE 25, 2024

The customer review analysis workflow consists of the following steps: A user uploads a file to a dedicated data repository within your Amazon Simple Storage Service (Amazon S3) data lake, invoking the processing using AWS Step Functions. The raw data is processed by an LLM using a preconfigured user prompt.

AWS

AWS Natural Language Processing Machine Learning Machine Learning

Generating value from enterprise data: Best practices for Text2SQL and generative AI

AWS Machine Learning Blog

JANUARY 4, 2024

To do this, the text input is transformed into a structured representation, and from this representation, a SQL query that can be used to access a database is created. The primary goal of Text2SQL is to make querying databases more accessible to non-technical users, who can provide their queries in natural language. gymnast_id = t2.

SQL

SQL Database AI AI
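The two-stage flow described in the excerpt above (text to a structured representation, then to SQL) can be sketched as follows. This covers only the second stage: the structured representation is hard-coded here, whereas in a Text2SQL system it would be produced from the user's question, typically by a language model. The `gymnast` table, its columns, the sample rows, and the shape of `SPEC` are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical structured representation of "Which gymnasts are from the USA?".
SPEC = {"table": "gymnast", "columns": ["name"], "filters": {"country": "USA"}}


def build_sql(spec):
    """Turn the structured representation into a parameterized SQL query."""
    identifiers = [spec["table"], *spec["columns"], *spec["filters"]]
    for name in identifiers:
        # Identifiers cannot be bound as parameters, so validate them strictly.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f"SELECT {', '.join(spec['columns'])} FROM {spec['table']}"
    if spec["filters"]:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in spec["filters"])
    return sql, list(spec["filters"].values())


def run_query(spec):
    """Run the generated query against a small in-memory demo database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gymnast (name TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO gymnast VALUES (?, ?)",
        [("Ana", "USA"), ("Bea", "ROU"), ("Cy", "USA")],  # made-up rows
    )
    sql, params = build_sql(spec)
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows


rows = run_query(SPEC)
```

Binding filter values as parameters, rather than formatting them into the string, keeps user-supplied values out of the SQL text itself.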

Imperva optimizes SQL generation from natural language using Amazon Bedrock

AWS Machine Learning Blog

JUNE 20, 2024

Our goal was to improve the user experience of an existing application used to explore the counters and insights data. The data is stored in a data lake and retrieved by SQL using Amazon Athena. The problem: Making data accessible to users through applications has always been a challenge.

SQL

SQL Database AWS Machine Learning

eSentire delivers private and secure generative AI interactions to customers with Amazon SageMaker

AWS Machine Learning Blog

JUNE 21, 2024

To accomplish this, eSentire built AI Investigator, a natural language query tool for their customers to access security platform data by using AWS generative artificial intelligence (AI) capabilities. eSentire has over 2 TB of signal data stored in their Amazon Simple Storage Service (Amazon S3) data lake.

AWS

AWS AI AI Natural Language Processing

Reinventing the data experience: Use generative AI and modern data architecture to unlock insights

AWS Machine Learning Blog

JUNE 13, 2023

Overall, implementing a modern data architecture and generative AI techniques with AWS is a promising approach for gleaning and disseminating key insights from diverse, expansive data at an enterprise scale. AWS also offers foundation models through Amazon SageMaker JumpStart as Amazon SageMaker endpoints.

Database

Database SQL AWS AI

From concept to reality: Navigating the Journey of RAG from proof of concept to production

AWS Machine Learning Blog

FEBRUARY 12, 2025

Many organizations store their data in structured formats within data warehouses and data lakes. Amazon Bedrock Knowledge Bases offers a feature that lets you connect your RAG workflow to structured data stores. The key is to choose a solution that can effectively host your database and compute infrastructure.

AWS

AWS Machine Learning Machine Learning AI

Anomaly detection in streaming time series data with online learning using Amazon Managed Service for Apache Flink

AWS Machine Learning Blog

SEPTEMBER 11, 2024

In this post, we demonstrate how to build a robust real-time anomaly detection solution for streaming time series data using Amazon Managed Service for Apache Flink and other AWS managed services. It offers an AWS CloudFormation template for straightforward deployment in an AWS account.

AWS

AWS ML ML Apache Kafka

Accelerating time-to-insight with MongoDB time series collections and Amazon SageMaker Canvas

AWS Machine Learning Blog

DECEMBER 18, 2023

In this post, we will explore the potential of using MongoDB’s time series data and SageMaker Canvas as a comprehensive solution. MongoDB Atlas: MongoDB Atlas is a fully managed developer data platform that simplifies the deployment and scaling of MongoDB databases in the cloud. Set up the database access and network access.

Clustering

Clustering AWS Database ML

AWS re:Invent Recap: The Future of Cloud

Alation

DECEMBER 14, 2021

Alation recently attended AWS re:invent 2021 … in person! AWS Keynote: “Still Early Days” for Cloud. Adam Selipsky, CEO of AWS, brought this energy in his opening keynote, welcoming a packed room and looking back on the progress of AWS. Re:Invent 2021 Keynote by AWS CEO Adam Selipsky. AWS’ Top Cloud Challenges.

AWS

AWS Data Lakes Data Warehouse Machine Learning
