
Enrich your AWS Glue Data Catalog with generative AI metadata using Amazon Bedrock


By harnessing the capabilities of generative AI, you can automate the generation of comprehensive metadata descriptions for your data assets based on their documentation, enhancing discoverability, understanding, and overall data governance within your AWS Cloud environment, using Python and boto3.
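The approach can be sketched with boto3: build a prompt from the table's schema, send it to a Bedrock model, and write the result back to the Glue Data Catalog. This is a minimal sketch, not the post's implementation; the model ID is an illustrative assumption and the AWS calls require credentials.

```python
def build_metadata_prompt(table_name, columns):
    """Build a prompt asking the model for a concise table description."""
    col_list = ", ".join(f"{c['Name']} ({c['Type']})" for c in columns)
    return (
        f"Write a concise description of the data catalog table '{table_name}' "
        f"with columns: {col_list}."
    )

def enrich_table_description(database, table, region="us-east-1"):
    """Generate a description with Amazon Bedrock and write it back to Glue.
    Requires AWS credentials; model ID below is an illustrative assumption."""
    import boto3  # lazy import so the prompt builder runs without boto3 installed
    glue = boto3.client("glue", region_name=region)
    bedrock = boto3.client("bedrock-runtime", region_name=region)

    meta = glue.get_table(DatabaseName=database, Name=table)["Table"]
    prompt = build_metadata_prompt(table, meta["StorageDescriptor"]["Columns"])

    resp = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    description = resp["output"]["message"]["content"][0]["text"]

    meta["Description"] = description
    # update_table accepts only TableInput fields, so drop read-only keys
    for key in ("DatabaseName", "CreateTime", "UpdateTime", "CreatedBy",
                "IsRegisteredWithLakeFormation", "CatalogId", "VersionId"):
        meta.pop(key, None)
    glue.update_table(DatabaseName=database, TableInput=meta)
    return description
```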


Supercharge your LLMs with RAG at scale using AWS Glue for Apache Spark

AWS Machine Learning Blog

Diverse data amplifies the need for customizable cleaning and transformation logic to handle the quirks of different sources. In this post, we explore building a reusable RAG data pipeline on LangChain, an open source framework for building applications based on LLMs, and integrating it with AWS Glue and Amazon OpenSearch Serverless.
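The per-source cleaning-and-chunking step such a pipeline customizes can be sketched in plain, framework-agnostic Python; the chunk size and overlap below are illustrative defaults, not values from the post:

```python
def clean_text(text: str) -> str:
    """Normalize whitespace so chunk boundaries are stable across sources."""
    return " ".join(text.split())

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split cleaned text into overlapping character windows ready for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    text = clean_text(text)
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

In a Glue for Apache Spark job, a function like `chunk_text` would typically run inside a UDF or `mapPartitions` call so each executor chunks its own slice of documents.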


Use AWS PrivateLink to set up private access to Amazon Bedrock

AWS Machine Learning Blog

Amazon Bedrock is a fully managed service provided by AWS that offers developers access to foundation models (FMs) and the tools to customize them for specific applications. Customers are building innovative generative AI applications using Amazon Bedrock APIs using their own proprietary data.
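Keeping those API calls off the public internet typically means creating an interface VPC endpoint for the Bedrock runtime. A minimal boto3 sketch, assuming caller-supplied VPC, subnet, and security group IDs (requires AWS credentials):

```python
def bedrock_endpoint_service_name(region: str) -> str:
    """Interface endpoint service name for the Bedrock runtime API."""
    return f"com.amazonaws.{region}.bedrock-runtime"

def create_private_bedrock_endpoint(vpc_id, subnet_ids, sg_ids, region="us-east-1"):
    """Create an interface VPC endpoint so Bedrock traffic stays on AWS's network.
    IDs are supplied by the caller; this is a sketch, not the post's exact setup."""
    import boto3  # lazy import: only needed when actually calling AWS
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        ServiceName=bedrock_endpoint_service_name(region),
        SubnetIds=subnet_ids,
        SecurityGroupIds=sg_ids,
        PrivateDnsEnabled=True,  # resolve the Bedrock hostname to private IPs in the VPC
    )
    return resp["VpcEndpoint"]["VpcEndpointId"]
```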


Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless

AWS Machine Learning Blog

Harnessing the power of big data has become increasingly critical for businesses looking to gain a competitive edge. However, managing the complex infrastructure required for big data workloads has traditionally been a significant challenge, often requiring specialized expertise.


Securing MLflow in AWS: Fine-grained access control with AWS native services

AWS Machine Learning Blog

In a previous post, we discussed MLflow and how it can run on AWS and be integrated with SageMaker—in particular, when tracking training jobs as experiments and deploying a model registered in MLflow to the SageMaker managed infrastructure. To automate the infrastructure deployment, we use the AWS Cloud Development Kit (AWS CDK).


Set up cross-account Amazon S3 access for Amazon SageMaker notebooks in VPC-only mode using Amazon S3 Access Points

AWS Machine Learning Blog

S3 Access Points simplify the management of access permissions specific to each application accessing a shared dataset. They enable secure, high-speed data copies between same-Region access points over AWS internal networks and VPCs, with the required AWS Identity and Access Management (IAM) permissions and policies configured in Account A.
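The heart of such a setup is the access point policy granting a cross-account role access. A minimal sketch in Python, with hypothetical ARNs supplied by the caller (the post's actual account layout and actions may differ):

```python
import json

def access_point_policy(ap_arn: str, role_arn: str) -> dict:
    """Minimal access point policy granting one cross-account role read access.
    Both ARNs are caller-supplied; actions shown are illustrative."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "CrossAccountRead",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # object ARNs under an access point use the /object/ prefix
            "Resource": [ap_arn, f"{ap_arn}/object/*"],
        }],
    }

def apply_policy(account_id: str, ap_name: str, policy: dict, region="us-east-1"):
    """Attach the policy to the access point; requires AWS credentials."""
    import boto3  # lazy import so the policy builder runs without boto3 installed
    s3control = boto3.client("s3control", region_name=region)
    s3control.put_access_point_policy(
        AccountId=account_id, Name=ap_name, Policy=json.dumps(policy)
    )
```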


Promote pipelines in a multi-environment setup using Amazon SageMaker Model Registry, HashiCorp Terraform, GitHub, and Jenkins CI/CD

AWS Machine Learning Blog

Central model registry – Amazon SageMaker Model Registry is set up in a separate AWS account to track model versions generated across the dev and prod environments. The walkthrough assumes HashiCorp Terraform (version 1.5.5) and an IAM principal with administrative privileges; after the KMS key is provisioned, it should be visible on the AWS KMS console.
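Sharing a central registry across accounts typically hinges on a resource policy attached to the model package group. A hedged sketch with boto3, using illustrative account IDs and a minimal action set (not necessarily the post's exact policy):

```python
import json

def registry_share_policy(group_arn: str, account_ids: list[str]) -> dict:
    """Resource policy letting other accounts read a central registry's
    model package group; actions and principals are illustrative."""
    # model package ARNs swap 'model-package-group' for 'model-package'
    package_arn = group_arn.replace(":model-package-group/", ":model-package/") + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "ShareModelPackageGroup",
            "Effect": "Allow",
            "Principal": {"AWS": [f"arn:aws:iam::{a}:root" for a in account_ids]},
            "Action": [
                "sagemaker:DescribeModelPackage",
                "sagemaker:DescribeModelPackageGroup",
                "sagemaker:ListModelPackages",
            ],
            "Resource": [group_arn, package_arn],
        }],
    }

def attach_policy(group_name: str, policy: dict, region="us-east-1"):
    """Attach the policy in the registry account; requires AWS credentials."""
    import boto3  # lazy import so the policy builder runs without boto3 installed
    sm = boto3.client("sagemaker", region_name=region)
    sm.put_model_package_group_policy(
        ModelPackageGroupName=group_name,
        ResourcePolicy=json.dumps(policy),
    )
```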
