AWS, Data Lakes and Data Modeling - Data Science Current

Architect a mature generative AI foundation on AWS

Flipboard

MAY 30, 2025

Scaling and load balancing The gateway can handle load balancing across different servers, model instances, or AWS Regions so that applications remain responsive. The AWS Solutions Library offers solution guidance to set up a multi-provider generative AI gateway. Model versions should be managed centrally in a model registry.

AWS

AWS AI AI Database

How Rocket Companies modernized their data science solution on AWS

AWS Machine Learning Blog

FEBRUARY 21, 2025

The Hadoop environment was hosted on Amazon Elastic Compute Cloud (Amazon EC2) servers, managed in-house by Rockets technology team, while the data science experience infrastructure was hosted on premises. Communication between the two systems was established through Kerberized Apache Livy (HTTPS) connections over AWS PrivateLink.

Data Science

Data Science AWS Hadoop Data Scientist

Unstructured data management and governance using AWS AI/ML and analytics services

Flipboard

OCTOBER 25, 2023

Unstructured data is information that doesn’t conform to a predefined schema or isn’t organized according to a preset data model. Text, images, audio, and videos are common examples of unstructured data. Additionally, we show how to use AWS AI/ML services for analyzing unstructured data.

AWS

AWS ML ML Analytics

Webinars

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

MORE WEBINARS

Integrate foundation models into your code with Amazon Bedrock

AWS Machine Learning Blog

NOVEMBER 6, 2024

Prerequisites Before you dive into the integration process, make sure you have the following prerequisites in place: AWS account – You’ll need an AWS account to access and use Amazon Bedrock. You can interact with Amazon Bedrock using AWS SDKs available in Python, Java, Node.js, and more.

AWS

AWS Python Machine Learning Machine Learning

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lakes

Data Lakes Data Modeling Data Models Data Warehouse

How Carrier predicts HVAC faults using AWS Glue and Amazon SageMaker

AWS Machine Learning Blog

SEPTEMBER 5, 2023

The solution framework is scalable as more equipment is installed and can be reused for a variety of downstream modeling tasks. In this post, we show how the Carrier and AWS teams applied ML to predict faults across large fleets of equipment using a single model. The effective precision of the trained model is 91.6%.

AWS

AWS ML ML Machine Learning

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 20, 2023

Customers of every size and industry are innovating on AWS by infusing machine learning (ML) into their products and services. Recent developments in generative AI models have further sped up the need of ML adoption across industries. The architecture maps the different capabilities of the ML platform to AWS accounts.

ML

ML ML AWS Data Lakes

Data Warehouse vs. Data Lake

Precisely

MARCH 9, 2023

Data warehouse vs. data lake, each has their own unique advantages and disadvantages; it’s helpful to understand their similarities and differences. In this article, we’ll focus on a data lake vs. data warehouse. It is often used as a foundation for enterprise data lakes.

Data Warehouse

Data Warehouse Data Lakes Hadoop Big Data

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

AWS Machine Learning Blog

AUGUST 21, 2023

You can streamline the process of feature engineering and data preparation with SageMaker Data Wrangler and finish each stage of the data preparation workflow (including data selection, purification, exploration, visualization, and processing at scale) within a single visual interface.

AWS

AWS Data Lakes Clustering Data Preparation

Essential data engineering tools for 2023: Empowering for management and analysis

Data Science Dojo

JULY 6, 2023

Amazon Redshift: Amazon Redshift is a cloud-based data warehousing service provided by Amazon Web Services (AWS). Amazon Redshift allows data engineers to analyze large datasets quickly using massively parallel processing (MPP) architecture. Airflow An open-source platform for building and scheduling data pipelines.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

How Light & Wonder built a predictive maintenance solution for gaming machines on AWS

AWS Machine Learning Blog

JUNE 22, 2023

Working with AWS, Light & Wonder recently developed an industry-first secure solution, Light & Wonder Connect (LnW Connect), to stream telemetry and machine health data from roughly half a million electronic gaming machines distributed across its casino customer base globally when LnW Connect reaches its full potential.

AWS

AWS ML ML Machine Learning

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

AWS Machine Learning Blog

MAY 31, 2024

With the Amazon Bedrock serverless experience, you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using the Amazon Web Services (AWS) tools without having to manage infrastructure. However, this is beyond the scope of this post.

AWS

AWS Machine Learning Machine Learning Database

Beyond data: Cloud analytics mastery for business brilliance

Dataconomy

SEPTEMBER 4, 2023

Key features of cloud analytics solutions include: Data models , Processing applications, and Analytics models. Data models help visualize and organize data, processing applications handle large datasets efficiently, and analytics models aid in understanding complex data sets, laying the foundation for business intelligence.

Analytics

Analytics Analytics Big Data Analytics Big Data Analytics

Top 5 Data Warehouses to Supercharge Your Big Data Strategy

Women in Big Data

NOVEMBER 27, 2024

By maintaining historical data from disparate locations, a data warehouse creates a foundation for trend analysis and strategic decision-making. How to Choose a Data Warehouse for Your Big Data Choosing a data warehouse for big data storage necessitates a thorough assessment of your unique requirements.

Data Warehouse

Data Warehouse Big Data Big Data Azure

MLOps and DevOps: Why Data Makes It Different

O'Reilly Media

OCTOBER 19, 2021

We need robust versioning for data, models, code, and preferably even the internal state of applications—think Git on steroids to answer inevitable questions: What changed? ML use cases rarely dictate the master data management solution, so the ML stack needs to integrate with existing data warehouses.

ML

ML ML Data Scientist AWS

How does Tableau power Salesforce Genie Customer Data Cloud?

Tableau

DECEMBER 7, 2022

Built-in connectors bring in data from every single channel. That includes live data streams, streaming data from web and mobile, and APIs integrated with MuleSoft to bring in external data from legacy systems or proprietary data lakes. . Bring your own AI with AWS.

Tableau

Tableau Data Warehouse Data Pipeline Data Visualization

How does Tableau power Salesforce Genie Customer Data Cloud?

Tableau

DECEMBER 7, 2022

Built-in connectors bring in data from every single channel. That includes live data streams, streaming data from web and mobile, and APIs integrated with MuleSoft to bring in external data from legacy systems or proprietary data lakes. . Bring your own AI with AWS.

Tableau

Tableau Data Warehouse Data Pipeline Data Visualization

Discover the Most Important Fundamentals of Data Engineering

Pickl AI

NOVEMBER 4, 2024

Summary: The fundamentals of Data Engineering encompass essential practices like data modelling, warehousing, pipelines, and integration. Understanding these concepts enables professionals to build robust systems that facilitate effective data management and insightful analysis. What is Data Engineering?

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

How to Effectively Handle Unstructured Data Using AI

DagsHub

NOVEMBER 11, 2024

In this article, we’ll explore how AI can transform unstructured data into actionable intelligence, empowering you to make informed decisions, enhance customer experiences, and stay ahead of the competition. What is Unstructured Data? These processes are essential in AI-based big data analytics and decision-making.

AI

AI AI Data Lakes Database

How to Manage Unstructured Data in AI and Machine Learning Projects

DagsHub

OCTOBER 23, 2024

To combine the collected data, you can integrate different data producers into a data lake as a repository. A central repository for unstructured data is beneficial for tasks like analytics and data virtualization. Data Cleaning The next step is to clean the data after ingesting it into the data lake.

Machine Learning

Machine Learning Machine Learning Data Lakes AI

Azure Data Engineer Jobs

Pickl AI

APRIL 6, 2023

Understand the fundamentals of data engineering: To become an Azure Data Engineer, you must first understand the concepts and principles of data engineering. Knowledge of data modeling, warehousing, integration, pipelines, and transformation is required. Data Warehousing concepts and knowledge should be strong.

Azure

Azure Data Engineering Data Engineering Data Engineer

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

For example, if you use AWS, you may prefer Amazon SageMaker as an MLOps platform that integrates with other AWS services. SageMaker Studio offers built-in algorithms, automated model tuning, and seamless integration with AWS services, making it a powerful platform for developing and deploying machine learning solutions at scale.

Machine Learning

Machine Learning Machine Learning ML ML

What is Salesforce Data Cloud for Tableau?

Tableau

DECEMBER 7, 2022

Built-in connectors bring in data from every single channel. That includes live data streams, streaming data from web and mobile, and APIs integrated with MuleSoft to bring in external data from legacy systems or proprietary data lakes. Unify The power of the customer graph keeps going.

Tableau

Tableau Data Warehouse Data Pipeline Data Visualization

Comparing Tools For Data Processing Pipelines

The MLOps Blog

MARCH 15, 2023

If you will ask data professionals about what is the most challenging part of their day to day work, you will likely discover their concerns around managing different aspects of data before they get to graduate to the data modeling stage.

Data Pipeline

Data Pipeline ETL SQL Data Quality

How to Build an End-To-End ML Pipeline

The MLOps Blog

MAY 9, 2023

3 Quickly build and deploy an end-to-end ML pipeline with Kubeflow Pipelines on AWS. They include: 1 Data (or input) pipeline. 2 Model (or training) pipeline. Model/training pipeline This pipeline trains one or more models on the training data with preset hyperparameters. What is a machine learning pipeline?

ML

ML ML Machine Learning Machine Learning

Dive deep into vector data stores using Amazon Bedrock Knowledge Bases

AWS Machine Learning Blog

OCTOBER 11, 2024

With the Amazon Bedrock serverless experience, you can experiment with and evaluate top foundation models (FMs) for your use cases, privately customize them with your data using techniques such as fine-tuning and RAG, and build agents that run tasks using enterprise systems and data sources.

Database

Database AWS Clustering AI

Data Science Current

Architect a mature generative AI foundation on AWS

How Rocket Companies modernized their data science solution on AWS

Webinars

Trending Sources

Unstructured data management and governance using AWS AI/ML and analytics services

Webinars

Integrate foundation models into your code with Amazon Bedrock

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

How Carrier predicts HVAC faults using AWS Glue and Amazon SageMaker

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

Data Warehouse vs. Data Lake

Apply fine-grained data access controls with AWS Lake Formation in Amazon SageMaker Data Wrangler

Essential data engineering tools for 2023: Empowering for management and analysis

How Light & Wonder built a predictive maintenance solution for gaming machines on AWS

Implementing Knowledge Bases for Amazon Bedrock in support of GDPR (right to be forgotten) requests

Beyond data: Cloud analytics mastery for business brilliance

Top 5 Data Warehouses to Supercharge Your Big Data Strategy

MLOps and DevOps: Why Data Makes It Different

How does Tableau power Salesforce Genie Customer Data Cloud?

How does Tableau power Salesforce Genie Customer Data Cloud?

Discover the Most Important Fundamentals of Data Engineering

How to Effectively Handle Unstructured Data Using AI

How to Manage Unstructured Data in AI and Machine Learning Projects

Azure Data Engineer Jobs

MLOps Landscape in 2023: Top Tools and Platforms

What is Salesforce Data Cloud for Tableau?

Comparing Tools For Data Processing Pipelines

How to Build an End-To-End ML Pipeline

Dive deep into vector data stores using Amazon Bedrock Knowledge Bases

Stay Connected