AWS, Data Lakes and Information - Data Science Current

Governing the ML lifecycle at scale, Part 3: Setting up data governance at scale

Flipboard

NOVEMBER 22, 2024

For example, in the bank marketing use case, the management account would be responsible for setting up the organizational structure for the bank’s data and analytics teams, provisioning separate accounts for data governance, data lakes, and data science teams, and maintaining compliance with relevant financial regulations.

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning Blog

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Field Advisor serves four primary use cases: AWS-specific knowledge search With Amazon Q Business, we’ve made internal data sources as well as public AWS content available in Field Advisor’s index.

Precise Software Solutions implements ML as a service on AWS to save time and money for federal agency

Flipboard

JANUARY 6, 2025

Precise Software Solutions, Inc. (Precise), an Amazon Web Services (AWS) Partner, participated in the AWS Think Big for Small Business Program (TBSB) to expand their AWS capabilities and to grow their business in the public sector. The platform helped the agency digitize and process forms, pictures, and other documents.

Unstructured data management and governance using AWS AI/ML and analytics services

Flipboard

OCTOBER 25, 2023

Unstructured data is information that doesn’t conform to a predefined schema or isn’t organized according to a preset data model. Unstructured information may have a little or a lot of structure but in ways that are unexpected or inconsistent. Text, images, audio, and videos are common examples of unstructured data.

10 Things AWS Can Do for Your SaaS Company

Smart Data Collective

FEBRUARY 20, 2022

AWS (Amazon Web Services), the comprehensive and evolving cloud computing platform provided by Amazon, is comprised of infrastructure as a service (IaaS), platform as a service (PaaS) and packaged software as a service (SaaS). With its wide array of tools and convenience, AWS has already become a popular choice for many SaaS companies.

How Rocket Companies modernized their data science solution on AWS

AWS Machine Learning Blog

FEBRUARY 21, 2025

That’s why we use advanced technology and data analytics to streamline every step of the homeownership experience, from application to closing. Communication between the two systems was established through Kerberized Apache Livy (HTTPS) connections over AWS PrivateLink. This also led to a backlog of data that needed to be ingested.

Reducing hallucinations in LLM agents with a verified semantic cache using Amazon Bedrock Knowledge Bases

AWS Machine Learning Blog

FEBRUARY 21, 2025

While these models are trained on vast amounts of generic data, they often lack the organization-specific context and up-to-date information needed for accurate responses in business settings. Let’s assume that the question “What date will AWS re:Invent 2024 occur?” If the question was “What’s the schedule for AWS events in December?”,

Integrating AWS Data Lake and RDS MS SQL: A Guide to Writing and Retrieving Data Securely

Dataversity

MARCH 26, 2024

Writing data to an AWS data lake and retrieving it to populate an AWS RDS MS SQL database involves several AWS services and a sequence of steps for data transfer and transformation. This process leverages AWS S3 for the data lake storage, AWS Glue for ETL operations, and AWS Lambda for orchestration.

Harmonize data using AWS Glue and AWS Lake Formation FindMatches ML to build a customer 360 view

Flipboard

JUNE 26, 2023

Companies are faced with the daunting task of ingesting all this data, cleansing it, and using it to provide outstanding customer experience. Typically, companies ingest data from multiple sources into their data lake to derive valuable insights from the data. Run the AWS Glue ML transform job.

AWS empowers sales teams using generative AI solution built on Amazon Bedrock

AWS Machine Learning Blog

AUGUST 26, 2024

At AWS, we are transforming our seller and customer journeys by using generative artificial intelligence (AI) across the sales lifecycle. It will be able to answer questions, generate content, and facilitate bidirectional interactions, all while continuously using internal AWS and external data to deliver timely, personalized insights.

Vitech uses Amazon Bedrock to revolutionize information access with AI-powered chatbot

AWS Machine Learning Blog

MAY 30, 2024

To serve their customers, Vitech maintains a repository of information that includes product documentation (user guides, standard operating procedures, runbooks), which is currently scattered across multiple internal platforms (for example, Confluence sites and SharePoint folders). langsmith==0.0.43 pgvector==0.2.3 streamlit==1.28.0

Your guide to generative AI and ML at AWS re:Invent 2023

AWS Machine Learning Blog

NOVEMBER 22, 2023

Yes, the AWS re:Invent season is upon us and as always, the place to be is Las Vegas! Generative AI is at the heart of the AWS Village this year. You marked your calendars, you booked your hotel, and you even purchased the airfare. And last but not least (and always fun!) are the sessions dedicated to AWS DeepRacer!

Getir end-to-end workforce management: Amazon Forecast and AWS Step Functions

AWS Machine Learning Blog

DECEMBER 7, 2023

In this post, we describe the end-to-end workforce management system that begins with location-specific demand forecast, followed by courier workforce planning and shift assignment using Amazon Forecast and AWS Step Functions. AWS Step Functions automatically initiate and monitor these workflows by simplifying error handling.

Data Warehouse vs. Data Lake

Precisely

MARCH 9, 2023

As cloud computing platforms make it possible to perform advanced analytics on ever larger and more diverse data sets, new and innovative approaches have emerged for storing, preprocessing, and analyzing information. In this article, we’ll focus on a data lake vs. data warehouse.

How Northpower used computer vision with AWS to automate safety inspection risk assessments

AWS Machine Learning Blog

SEPTEMBER 27, 2024

Amazon Simple Storage Service (Amazon S3) stores the model artifacts and creates a data lake to host the inference output, document analysis output, and other datasets in CSV format. About the authors: Scott Patterson is a Senior Solutions Architect at AWS. Andreas Astrom is the Head of Technology and Innovation at Northpower.

How Getir reduced model training durations by 90% with Amazon SageMaker and AWS Batch

AWS Machine Learning Blog

DECEMBER 4, 2023

In this post, we explain how we built an end-to-end product category prediction pipeline to help commercial teams by using Amazon SageMaker and AWS Batch, reducing model training duration by 90%. An important aspect of our strategy has been the use of SageMaker and AWS Batch to refine pre-trained BERT models for seven different languages.

Principal Financial Group uses AWS Post Call Analytics solution to extract omnichannel customer insights

AWS Machine Learning Blog

NOVEMBER 15, 2023

They are processing data across channels, including recorded contact center interactions, emails, chat and other digital channels. Solution requirements: Principal provides investment services through Genesys Cloud CX, a cloud-based contact center that provides powerful, native integrations with AWS.

How Twilio generated SQL using Looker Modeling Language data with Amazon Bedrock

AWS Machine Learning Blog

AUGUST 8, 2024

As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications.

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

AUGUST 21, 2023

You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, purification, exploration, visualization, and processing at scale) within a single visual interface.

Generate financial industry-specific insights using generative AI and in-context fine-tuning

AWS Machine Learning Blog

NOVEMBER 12, 2024

You may check out additional reference notebooks on aws-samples for how to use Meta’s Llama models hosted on Amazon Bedrock. You can implement these steps either from the AWS Management Console or using the latest version of the AWS Command Line Interface (AWS CLI). As we can see the data retrieval is more accurate.

Generative AI operating models in enterprise organizations with Amazon Bedrock

AWS Machine Learning Blog

JANUARY 29, 2025

Large organizations often have many business units with multiple lines of business (LOBs), with a central governing entity, and typically use AWS Organizations with an Amazon Web Services (AWS) multi-account strategy. LOBs have autonomy over their AI workflows, models, and data within their respective AWS accounts.

AI/ML-driven actionable insights and themes for Amazon third-party sellers using AWS

Flipboard

MARCH 7, 2023

The large volume of contacts creates a challenge for CSBA to extract key information from the transcripts that helps sellers promptly address customer needs and improve customer experience. We use multiple AWS AI/ML services, such as Contact Lens for Amazon Connect and Amazon SageMaker, and utilize a combined architecture.

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 20, 2023

Customers of every size and industry are innovating on AWS by infusing machine learning (ML) into their products and services. However, implementing security, data privacy, and governance controls remains a key challenge for customers running ML workloads at scale.

How Carrier predicts HVAC faults using AWS Glue and Amazon SageMaker

AWS Machine Learning Blog

SEPTEMBER 5, 2023

In this post, we show how the Carrier and AWS teams applied ML to predict faults across large fleets of equipment using a single model. We first highlight how we use AWS Glue for highly parallel data processing. This dramatically reduces the size of data while capturing features that characterize the equipment’s behavior.

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Alation

FEBRUARY 20, 2020

For many enterprises, a hybrid cloud data lake is no longer a trend, but becoming reality. Due to these needs, hybrid cloud data lakes emerged as a logical middle ground between the two consumption models. earthquake, flood, or fire), where the data collected does not need to be as tightly controlled.

Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

AWS Machine Learning Blog

JUNE 5, 2023

To simplify access to Parquet files, Amazon SageMaker Canvas has added data import capabilities from over 40 data sources, including Amazon Athena, which supports Apache Parquet. Canvas provides connectors to AWS data sources such as Amazon Simple Storage Service (Amazon S3), Athena, and Amazon Redshift.

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Hybrid Vs. Multi-Cloud: 5 Key Comparisons in Kafka Architectures

Smart Data Collective

AUGUST 17, 2022

You can safely use an Apache Kafka cluster for seamless data movement from the on-premise hardware solution to the data lake using various cloud services like Amazon’s S3 and others. It will enable you to quickly transform and load the data results into Amazon S3 data lakes or JDBC data stores.

How Light & Wonder built a predictive maintenance solution for gaming machines on AWS

AWS Machine Learning Blog

JUNE 22, 2023

Working with AWS, Light & Wonder recently developed an industry-first secure solution, Light & Wonder Connect (LnW Connect), to stream telemetry and machine health data from roughly half a million electronic gaming machines distributed across its casino customer base globally when LnW Connect reaches its full potential.

Retrieval-Augmented Generation with LangChain, Amazon SageMaker JumpStart, and MongoDB Atlas semantic search

Flipboard

NOVEMBER 17, 2023

The vector field should be represented as an array of numbers (BSON int32, int64, or double data types only). Refer to Review knnVector Type Limitations for more information about the limitations of the knnVector type. Specify the AWS Lambda function that will interact with MongoDB Atlas and the LLM to provide responses.
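As a rough illustration of the vector-field requirement in the excerpt above, the following minimal Python sketch checks a candidate embedding before it is stored. The function name `is_valid_vector`, the `dimensions` parameter, and the decision to reject non-finite floats are assumptions made for this example, not part of the original article or of any MongoDB API. The sketch relies only on the fact that Python `int` values map to BSON int32/int64 and Python `float` values map to BSON double.

```python
import math

# Bounds of a signed 64-bit integer, the widest BSON integer type.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def is_valid_vector(values, dimensions):
    """Return True if `values` is a list of `dimensions` plain numbers.

    Hypothetical helper: it accepts ints that fit in int64 and finite
    floats, and rejects everything else (including bool, which is a
    subclass of int in Python but is stored as a BSON boolean).
    """
    if not isinstance(values, list) or len(values) != dimensions:
        return False
    for v in values:
        if isinstance(v, bool):
            return False
        if isinstance(v, int):
            if not INT64_MIN <= v <= INT64_MAX:
                return False
        elif isinstance(v, float):
            # Assumption: treat NaN and infinity as invalid embeddings.
            if not math.isfinite(v):
                return False
        else:
            return False
    return True
```

A check like this could run inside the Lambda function before a document is written, so that malformed embeddings are caught early rather than at index time.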

Intelligent healthcare forms analysis with Amazon Bedrock

AWS Machine Learning Blog

AUGUST 13, 2024

The healthcare industry generates and collects a significant amount of unstructured textual data, including clinical documentation such as patient information, medical history, and test results, as well as non-clinical documentation like administrative records.

How Marubeni is optimizing market decisions using AWS machine learning and analytics

AWS Machine Learning Blog

MARCH 8, 2023

MPII is using a machine learning (ML) bid optimization engine to inform upstream decision-making processes in power asset management and trading. This solution helps market analysts design and perform data-driven bidding strategies optimized for power asset profitability.

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

AWS Machine Learning Blog

MAY 31, 2024

The General Data Protection Regulation (GDPR) right to be forgotten, also known as the right to erasure, gives individuals the right to request the deletion of their personally identifiable information (PII) data held by organizations. Knowledge Bases for Amazon Bedrock manages the end-to-end RAG workflow for you.

8 Data Lake Vendors to Make Your Data Life Easier in 2023

ODSC - Open Data Science

JUNE 7, 2023

To make your data management processes easier, here’s a primer on data lakes, and our picks for a few data lake vendors worth considering. What is a data lake? First, a data lake is a centralized repository that allows users or an organization to store and analyze large volumes of data.

Fivetran Supports the Automation of the Modern Data Lake on Amazon S3

phData

APRIL 4, 2023

Fivetran today announced support for Amazon Simple Storage Service (Amazon S3) with Apache Iceberg data lake format. Amazon S3 is an object storage service from Amazon Web Services (AWS) that offers industry-leading scalability, data availability, security, and performance.

Introducing the Amazon Comprehend flywheel for MLOps

AWS Machine Learning Blog

MARCH 1, 2023

It helps you extract information by recognizing sentiments, key phrases, entities, and much more, allowing you to take advantage of state-of-the-art models and adapt them for your specific use case. This feature also allows you to automate model retraining after new datasets are ingested and available in the flywheel’s data lake.

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

Flipboard

NOVEMBER 24, 2023

With that, the need for data scientists and machine learning (ML) engineers has grown significantly. These skilled professionals are tasked with building and deploying models that improve the quality and efficiency of BMW’s business processes and enable informed leadership decisions.

Build an automated insight extraction framework for customer feedback analysis with Amazon Bedrock and Amazon QuickSight

AWS Machine Learning Blog

JUNE 25, 2024

For more information, see Customize models in Amazon Bedrock with your own data using fine-tuning and continued pre-training. In the first step, an AWS Lambda function reads and validates the file, and extracts the raw data. The raw data is processed by an LLM using a preconfigured user prompt.

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

AWS Machine Learning Blog

AUGUST 21, 2024

Amazon DataZone is a data management service that makes it quick and convenient to catalog, discover, share, and govern data stored in AWS, on-premises, and third-party sources. An Amazon DataZone domain and an associated Amazon DataZone project configured in your AWS account.

Simplify continuous learning of Amazon Comprehend custom models using Comprehend flywheel

AWS Machine Learning Blog

MARCH 1, 2023

Flywheel creates a data lake (in Amazon S3) in your account where all the training and test data for all versions of the model are managed and stored. Periodically, the new labeled data (to retrain the model) can be made available to flywheel by creating datasets. The data can be accessed from AWS Open Data Registry.

Improving air quality with generative AI

AWS Machine Learning Blog

JUNE 18, 2024

Through evaluations of sensors and informed decision-making support, Afri-SET empowers governments and civil society for effective air quality management. The attempt is disadvantaged by the current focus on data cleaning, diverting valuable skills away from building ML models for sensor calibration.

Bundesliga Match Fact Keeper Efficiency: Comparing keepers’ performances objectively using machine learning on AWS

AWS Machine Learning Blog

MARCH 30, 2023

Therefore, it’s no surprise that determining the proficiency of goalkeepers in preventing the ball from entering the net is considered one of the most difficult tasks in football data analysis. Bundesliga and AWS have collaborated to perform an in-depth examination to study the quantification of achievements of Bundesliga’s keepers.

Beyond data: Cloud analytics mastery for business brilliance

Dataconomy

SEPTEMBER 4, 2023

Cloud analytics is the art and science of mining insights from data stored in cloud-based platforms. By tapping into the power of cloud technology, organizations can efficiently analyze large datasets, uncover hidden patterns, predict future trends, and make informed decisions to drive their businesses forward.

Amazon SageMaker Feature Store now supports cross-account sharing, discovery, and access

AWS Machine Learning Blog

FEBRUARY 13, 2024

SageMaker Feature Store now makes it effortless to share, discover, and access feature groups across AWS accounts. With this launch, account owners can grant access to select feature groups by other accounts using AWS Resource Access Manager (AWS RAM). Review the access policy to understand permissions granted.
