
Reduce ML training costs with Amazon SageMaker HyperPod

AWS Machine Learning Blog

Users can build resilient clusters for machine learning (ML) workloads and develop or fine-tune state-of-the-art frontier models, as organizations such as Luma Labs and Perplexity AI have demonstrated. Frontier model builders can further enhance model performance using the ML tools built into SageMaker HyperPod.
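For orientation, here is a minimal sketch of creating a HyperPod cluster with boto3. The cluster name, IAM role ARN, instance type, and lifecycle-script location are placeholder assumptions, not values from the article.

```python
import boto3

# Placeholder names and ARNs: substitute values from your own account.
sagemaker = boto3.client("sagemaker")

response = sagemaker.create_cluster(
    ClusterName="frontier-training-cluster",
    InstanceGroups=[
        {
            "InstanceGroupName": "worker-group",
            "InstanceType": "ml.g5.8xlarge",
            "InstanceCount": 4,
            "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
            "LifeCycleConfig": {
                # Scripts that configure each node (e.g. Slurm setup) on creation.
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
                "OnCreate": "on_create.sh",
            },
        }
    ],
)
print(response["ClusterArn"])
```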


Real value, real time: Production AI with Amazon SageMaker and Tecton

AWS Machine Learning Blog

Businesses are under pressure to show return on investment (ROI) from AI use cases, whether predictive machine learning (ML) or generative AI. Only 54% of ML prototypes make it to production, and only 5% of generative AI use cases do. With SageMaker, you can build, train, and deploy ML models.
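As a rough illustration of that build-train-deploy loop, the following sketch uses the SageMaker Python SDK with the built-in XGBoost image; the bucket, role, and hyperparameters are placeholder assumptions, not the article's setup.

```python
import sagemaker
from sagemaker.estimator import Estimator

# Placeholder role and S3 locations: replace with resources from your account.
session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve(
        "xgboost", session.boto_region_name, version="1.7-1"
    ),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/model-artifacts/",
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=100)

# Train on data already staged in S3, then stand up a real-time endpoint.
estimator.fit({"train": "s3://my-bucket/train/"})
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```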


Rad AI reduces real-time inference latency by 50% using Amazon SageMaker

AWS Machine Learning Blog

Challenges in deploying advanced ML models in healthcare: As an AI-first company, Rad AI integrates machine learning (ML) models across functions, from product development to customer success and from novel research to internal applications. Rad AI's ML organization tackles this challenge on two fronts.


Apple Workshop on Privacy-Preserving Machine Learning 2024

Machine Learning Research at Apple

We develop system architectures that enable learning at scale by leveraging advances in machine learning (ML), such as private federated learning (PFL), combined with…
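The teaser does not include Apple's system details; the toy sketch below only illustrates federated averaging (FedAvg), the aggregation idea that private federated learning builds on, using a linear model in NumPy. Real PFL additionally applies secure aggregation and differential-privacy noise to the updates.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local gradient steps for a linear model (squared loss)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg(global_w, client_data):
    """Average client models, weighted by local dataset size; raw data never leaves clients."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Synthetic data split across three clients.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

w = np.zeros(2)
for _ in range(20):
    w = fedavg(w, clients)
print(w)  # converges toward [2, -1] from shared updates only
```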


Going beyond AI assistants: Examples from Amazon.com reinventing industries with generative AI

Flipboard

The quality assurance process includes automated testing that combines ML-based, algorithm-based, and LLM-based evaluations. In addition, the process employs traditional ML procedures such as named entity recognition (NER) and estimation of final confidence with regression models. The team made extensive use of fine-tuned small language models (SLMs).
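The excerpt does not describe Amazon's internal pipeline, so the sketch below only illustrates the two ideas it names: an NER-based check on generated text and a small supervised model that estimates final confidence. The NER model choice and the feature set are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from transformers import pipeline

# Generic pretrained NER model from the Hugging Face Hub, used only for illustration.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

def ner_check(generated_text, required_entities):
    """Flag generated outputs that omit entities the source document mentions."""
    found = {e["word"].lower() for e in ner(generated_text)}
    return all(ent.lower() in found for ent in required_entities)

# Final confidence estimated from pipeline signals (e.g. LLM self-score,
# NER agreement, length ratio), labeled here with made-up human-review outcomes.
X = np.array([[0.9, 1, 0.8], [0.4, 0, 0.3], [0.7, 1, 0.6], [0.2, 0, 0.5]])
y = np.array([1, 0, 1, 0])  # 1 = passed human QA
confidence_model = LogisticRegression().fit(X, y)
print(confidence_model.predict_proba([[0.8, 1, 0.7]])[0, 1])
```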


Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

Flipboard

Rather than maintaining constantly running endpoints, the system creates them on demand when document processing begins and automatically stops them upon completion. This endpoint-based architecture decouples model hosting from the rest of the processing pipeline, allowing independent scaling, versioning, and maintenance of each component.
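A minimal sketch of that on-demand pattern with boto3 might look like the following; the model, endpoint configuration, and request payload are assumed placeholders.

```python
import boto3

sm = boto3.client("sagemaker")

def start_processing_endpoint(endpoint_name, endpoint_config_name):
    """Create the endpoint only when a document batch arrives, then wait until it is live."""
    sm.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name)
    sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)

def stop_processing_endpoint(endpoint_name):
    """Tear the endpoint down once processing completes, so nothing sits idle."""
    sm.delete_endpoint(EndpointName=endpoint_name)

# Hypothetical endpoint and config names; the endpoint config must already exist.
start_processing_endpoint("ner-doc-endpoint", "ner-doc-endpoint-config")
try:
    runtime = boto3.client("sagemaker-runtime")
    result = runtime.invoke_endpoint(
        EndpointName="ner-doc-endpoint",
        ContentType="application/json",
        Body=b'{"document": "..."}',
    )
    print(result["Body"].read())
finally:
    stop_processing_endpoint("ner-doc-endpoint")
```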


Build a dynamic, role-based AI agent using Amazon Bedrock inline agents

AWS Machine Learning Blog

To understand how this dynamic, role-based functionality works under the hood, let's examine the system architecture. As shown in the architecture diagram, the system works as follows: the end user logs in and is identified as either a manager or an employee.
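A hedged sketch of that idea with the Bedrock InvokeInlineAgent API might look like the following; the role-to-instruction mapping, model ID, and empty action groups are illustrative placeholders rather than the post's actual configuration.

```python
import boto3

bedrock = boto3.client("bedrock-agent-runtime")

# Hypothetical per-role configuration assembled at request time after login.
ROLE_CONFIG = {
    "manager": {
        "instruction": "You help managers review team expense reports and approve requests.",
        "action_groups": [],  # approval tools would be attached here
    },
    "employee": {
        "instruction": "You help employees check vacation balances and submit expense reports.",
        "action_groups": [],
    },
}

def invoke_for_user(user_role, session_id, text):
    config = ROLE_CONFIG[user_role]
    response = bedrock.invoke_inline_agent(
        foundationModel="anthropic.claude-3-5-sonnet-20240620-v1:0",
        instruction=config["instruction"],
        actionGroups=config["action_groups"],
        sessionId=session_id,
        inputText=text,
    )
    # The agent's reply streams back as completion chunks.
    return "".join(
        event["chunk"]["bytes"].decode()
        for event in response["completion"]
        if "chunk" in event
    )

print(invoke_for_user("employee", "session-1", "How many vacation days do I have left?"))
```

Because the agent is defined inline per request rather than as a fixed resource, each caller's role can change the instructions and tools without redeploying anything.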
