Remove Apache Kafka Remove AWS Remove Database
article thumbnail

Apache Kafka use cases: Driving innovation across diverse industries

IBM Journey to AI blog

Apache Kafka is an open-source , distributed streaming platform that allows developers to build real-time, event-driven applications. With Apache Kafka, developers can build applications that continuously use streaming data records and deliver real-time experiences to users. How does Apache Kafka work?

article thumbnail

Real-time fraud detection using AWS serverless and machine learning services

AWS Machine Learning Blog

The same architecture applies if you use Amazon Managed Streaming for Apache Kafka (Amazon MSK) as a data streaming service. This approach allows you to react to the potentially fraudulent transactions in real time as you store each transaction in a database and inspect it before processing further.

professionals

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Bundesliga Match Fact Ball Recovery Time: Quantifying teams’ success in pressing opponents on AWS

AWS Machine Learning Blog

To ensure real-time updates of ball recovery times, we have implemented Amazon Managed Streaming for Apache Kafka (Amazon MSK) as a central solution for data streaming and messaging. A Lambda function retrieves all recovery times from the relevant Kafka topic and stores them in an Amazon Aurora Serverless database.

AWS 87
article thumbnail

Bundesliga Match Fact Keeper Efficiency: Comparing keepers’ performances objectively using machine learning on AWS

AWS Machine Learning Blog

Bundesliga and AWS have collaborated to perform an in-depth examination to study the quantification of achievements of Bundesliga’s keepers. The BMF logic itself (except for the ML model) runs on an AWS Fargate container. This Bundesliga Match Fact was developed among a team of Bundesliga and AWS experts.

article thumbnail

Anomaly detection in streaming time series data with online learning using Amazon Managed Service for Apache Flink

AWS Machine Learning Blog

In this post, we demonstrate how to build a robust real-time anomaly detection solution for streaming time series data using Amazon Managed Service for Apache Flink and other AWS managed services. It offers an AWS CloudFormation template for straightforward deployment in an AWS account.

AWS 102
article thumbnail

Big data engineering simplified: Exploring roles of distributed systems

Data Science Dojo

Its characteristics can be summarized as follows: Volume : Big Data involves datasets that are too large to be processed by traditional database management systems. databases), semi-structured data (e.g., Amazon S3: Amazon Simple Storage Service (S3) is a scalable object storage service provided by Amazon Web Services (AWS).

Big Data 195
article thumbnail

Navigating the Big Data Frontier: A Guide to Efficient Handling

Women in Big Data

Components of a Big Data Pipeline Data Sources (Collection): Data originates from various sources, such as databases, APIs, and log files. Examples include transactional databases, social media feeds, and IoT sensors. This phase ensures quality and consistency using frameworks like Apache Spark or AWS Glue.