
Enrich your AWS Glue Data Catalog with generative AI metadata using Amazon Bedrock


By harnessing the capabilities of generative AI, you can automate the generation of comprehensive metadata descriptions for your data assets based on their documentation, enhancing discoverability, understanding, and overall data governance within your AWS Cloud environment, using Python and boto3.
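The approach can be sketched with boto3: build a prompt from the table's schema, send it to a Bedrock model, and write the result back to the Glue Data Catalog. This is a minimal sketch, not the post's implementation; the model ID is an illustrative assumption and the AWS calls require credentials.

```python
def build_metadata_prompt(table_name, columns):
    """Build a prompt asking the model for a concise table description."""
    col_list = ", ".join(f"{c['Name']} ({c['Type']})" for c in columns)
    return (
        f"Write a concise description of the data catalog table '{table_name}' "
        f"with columns: {col_list}."
    )

def enrich_table_description(database, table, region="us-east-1"):
    """Generate a description with Amazon Bedrock and write it back to Glue.
    Requires AWS credentials; model ID below is an illustrative assumption."""
    import boto3  # lazy import so the prompt builder runs without boto3 installed
    glue = boto3.client("glue", region_name=region)
    bedrock = boto3.client("bedrock-runtime", region_name=region)

    meta = glue.get_table(DatabaseName=database, Name=table)["Table"]
    prompt = build_metadata_prompt(table, meta["StorageDescriptor"]["Columns"])

    resp = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    description = resp["output"]["message"]["content"][0]["text"]

    meta["Description"] = description
    # update_table accepts only TableInput fields, so drop read-only keys
    for key in ("DatabaseName", "CreateTime", "UpdateTime", "CreatedBy",
                "IsRegisteredWithLakeFormation", "CatalogId", "VersionId"):
        meta.pop(key, None)
    glue.update_table(DatabaseName=database, TableInput=meta)
    return description
```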


Supercharge your LLMs with RAG at scale using AWS Glue for Apache Spark

AWS Machine Learning Blog

Diverse data amplifies the need for customizable cleaning and transformation logic to handle the quirks of different sources. In this post, we explore building a reusable RAG data pipeline on LangChain, an open source framework for building applications based on LLMs, and integrating it with AWS Glue and Amazon OpenSearch Serverless.
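The per-source cleaning-and-chunking step such a pipeline customizes can be sketched in plain, framework-agnostic Python; the chunk size and overlap below are illustrative defaults, not values from the post:

```python
def clean_text(text: str) -> str:
    """Normalize whitespace so chunk boundaries are stable across sources."""
    return " ".join(text.split())

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split cleaned text into overlapping character windows ready for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    text = clean_text(text)
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

In a Glue for Apache Spark job, a function like `chunk_text` would typically run inside a UDF or `mapPartitions` call so each executor chunks its own slice of documents.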


Use AWS PrivateLink to set up private access to Amazon Bedrock

AWS Machine Learning Blog

Amazon Bedrock is a fully managed service provided by AWS that offers developers access to foundation models (FMs) and the tools to customize them for specific applications. Customers are building innovative generative AI applications using Amazon Bedrock APIs using their own proprietary data.
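Keeping those API calls off the public internet typically means creating an interface VPC endpoint for the Bedrock runtime. A minimal boto3 sketch, assuming caller-supplied VPC, subnet, and security group IDs (requires AWS credentials):

```python
def bedrock_endpoint_service_name(region: str) -> str:
    """Interface endpoint service name for the Bedrock runtime API."""
    return f"com.amazonaws.{region}.bedrock-runtime"

def create_private_bedrock_endpoint(vpc_id, subnet_ids, sg_ids, region="us-east-1"):
    """Create an interface VPC endpoint so Bedrock traffic stays on AWS's network.
    IDs are supplied by the caller; this is a sketch, not the post's exact setup."""
    import boto3  # lazy import: only needed when actually calling AWS
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        ServiceName=bedrock_endpoint_service_name(region),
        SubnetIds=subnet_ids,
        SecurityGroupIds=sg_ids,
        PrivateDnsEnabled=True,  # resolve the Bedrock hostname to private IPs in the VPC
    )
    return resp["VpcEndpoint"]["VpcEndpointId"]
```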


Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless

AWS Machine Learning Blog

Harnessing the power of big data has become increasingly critical for businesses looking to gain a competitive edge. However, managing the complex infrastructure required for big data workloads has traditionally been a significant challenge, often requiring specialized expertise.


Securing MLflow in AWS: Fine-grained access control with AWS native services

AWS Machine Learning Blog

In a previous post, we discussed MLflow and how it can run on AWS and be integrated with SageMaker—in particular, when tracking training jobs as experiments and deploying a model registered in MLflow to the SageMaker managed infrastructure. To automate the infrastructure deployment, we use the AWS Cloud Development Kit (AWS CDK).


Set up cross-account Amazon S3 access for Amazon SageMaker notebooks in VPC-only mode using Amazon S3 Access Points

AWS Machine Learning Blog

S3 Access Points simplify the management of access permissions specific to each application accessing a shared dataset. They enable secure, high-speed data copies between same-Region access points over AWS internal networks and VPCs, with the required AWS Identity and Access Management (IAM) permissions and policies configured in Account A.
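The heart of such a setup is the access point policy granting a cross-account role access. A minimal sketch in Python, with hypothetical ARNs supplied by the caller (the post's actual account layout and actions may differ):

```python
import json

def access_point_policy(ap_arn: str, role_arn: str) -> dict:
    """Minimal access point policy granting one cross-account role read access.
    Both ARNs are caller-supplied; actions shown are illustrative."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "CrossAccountRead",
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # object ARNs under an access point use the /object/ prefix
            "Resource": [ap_arn, f"{ap_arn}/object/*"],
        }],
    }

def apply_policy(account_id: str, ap_name: str, policy: dict, region="us-east-1"):
    """Attach the policy to the access point; requires AWS credentials."""
    import boto3  # lazy import so the policy builder runs without boto3 installed
    s3control = boto3.client("s3control", region_name=region)
    s3control.put_access_point_policy(
        AccountId=account_id, Name=ap_name, Policy=json.dumps(policy)
    )
```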


Promote pipelines in a multi-environment setup using Amazon SageMaker Model Registry, HashiCorp Terraform, GitHub, and Jenkins CI/CD

AWS Machine Learning Blog

Central model registry – Amazon SageMaker Model Registry is set up in a separate AWS account to track model versions generated across the dev and prod environments. The walkthrough assumes HashiCorp Terraform (version 1.5.5) and an IAM principal with administrative privileges; after the KMS key is provisioned, it should be visible on the AWS KMS console.
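Sharing a central registry across accounts typically hinges on a resource policy attached to the model package group. A hedged sketch with boto3, using illustrative account IDs and a minimal action set (not necessarily the post's exact policy):

```python
import json

def registry_share_policy(group_arn: str, account_ids: list[str]) -> dict:
    """Resource policy letting other accounts read a central registry's
    model package group; actions and principals are illustrative."""
    # model package ARNs swap 'model-package-group' for 'model-package'
    package_arn = group_arn.replace(":model-package-group/", ":model-package/") + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "ShareModelPackageGroup",
            "Effect": "Allow",
            "Principal": {"AWS": [f"arn:aws:iam::{a}:root" for a in account_ids]},
            "Action": [
                "sagemaker:DescribeModelPackage",
                "sagemaker:DescribeModelPackageGroup",
                "sagemaker:ListModelPackages",
            ],
            "Resource": [group_arn, package_arn],
        }],
    }

def attach_policy(group_name: str, policy: dict, region="us-east-1"):
    """Attach the policy in the registry account; requires AWS credentials."""
    import boto3  # lazy import so the policy builder runs without boto3 installed
    sm = boto3.client("sagemaker", region_name=region)
    sm.put_model_package_group_policy(
        ModelPackageGroupName=group_name,
        ResourcePolicy=json.dumps(policy),
    )
```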
