Be sure to check out his talk, “Apache Kafka for Real-Time Machine Learning Without a Data Lake,” there! The combination of data streaming and machine learning (ML) enables you to build one scalable, reliable, and simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.
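As a rough illustration of that pattern, the sketch below consumes events from a Kafka topic and scores each one with a model as it arrives. The topic name, broker address, and the stand-in scoring function are assumptions for the example, not taken from the talk.

```python
# Minimal sketch: score each Kafka event with a model as it arrives.
# Assumes events are JSON objects of numeric features; topic and broker are placeholders.
import json
from kafka import KafkaConsumer  # pip install kafka-python

def score(features):
    # Stand-in for a real model (e.g., one loaded with joblib); returns a dummy score.
    return sum(features.values()) / max(len(features), 1)

consumer = KafkaConsumer(
    "ml-input-events",                      # hypothetical topic
    bootstrap_servers="localhost:9092",     # assumed local broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value                   # deserialized dict of features
    print(f"offset={message.offset} score={score(event):.3f}")
```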
Businesses are increasingly using machine learning (ML) to make near-real-time decisions, such as placing an ad, assigning a driver, recommending a product, or even dynamically pricing products and services. Apache Flink is a popular framework and engine for processing data streams.
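For a feel of what stream processing in Flink looks like, here is a minimal PyFlink sketch that counts events per key over tumbling processing-time windows. The table name, columns, and the built-in datagen source are illustrative assumptions, not part of the article.

```python
# Minimal PyFlink sketch: a windowed aggregation over a generated stream.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Built-in 'datagen' connector produces synthetic rows so the example is self-contained.
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id INT,
        ts AS PROCTIME()
    ) WITH (
        'connector' = 'datagen',
        'rows-per-second' = '5',
        'fields.user_id.min' = '1',
        'fields.user_id.max' = '10'
    )
""")

# Count clicks per user over 10-second processing-time windows.
t_env.execute_sql("""
    SELECT user_id, COUNT(*) AS clicks_in_window
    FROM TABLE(TUMBLE(TABLE clicks, DESCRIPTOR(ts), INTERVAL '10' SECOND))
    GROUP BY user_id, window_start, window_end
""").print()
```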
In this post, we demonstrate how to build a robust real-time anomaly detection solution for streaming time series data using Amazon Managed Service for Apache Flink and other AWS managed services. This solution employs machine learning (ML) for anomaly detection, and doesn’t require users to have prior AI expertise.
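The managed AWS solution handles the ML internally, but a rolling z-score detector is a useful mental model of anomaly detection on a univariate stream. Everything below, including the window size, threshold, and synthetic data, is an illustrative assumption rather than the article's method.

```python
# Illustrative baseline: flag points that deviate strongly from a rolling window.
from collections import deque
import math

class RollingZScoreDetector:
    def __init__(self, window=100, threshold=3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value):
        """Return True if `value` is anomalous relative to the recent window."""
        anomalous = False
        if len(self.window) >= 10:                     # warm-up before scoring
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) / std > self.threshold:
                anomalous = True
        self.window.append(value)
        return anomalous

detector = RollingZScoreDetector()
stream = [10.1, 9.8, 10.3, 10.0, 9.9] * 10 + [42.0]   # synthetic spike at the end
flags = [detector.observe(v) for v in stream]
print("anomaly at index:", flags.index(True) if True in flags else None)
```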
More than ever, advanced analytics, ML, and AI are providing the foundation for innovation, efficiency, and profitability. One very popular platform is Apache Kafka, a powerful open-source tool used by thousands of companies. But in all likelihood, Kafka doesn’t natively connect with the applications that contain your data.
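Kafka Connect is one common way to bridge that gap. As a hedged sketch, the snippet below registers a JDBC source connector through the Connect REST API so database rows flow into Kafka topics; the hostnames, credentials, connector name, and exact config keys depend on your Connect and connector versions and are placeholders here.

```python
# Hedged sketch: register a JDBC source connector via the Kafka Connect REST API.
import requests

connector = {
    "name": "orders-jdbc-source",                      # hypothetical connector name
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://db:5432/shop",
        "connection.user": "connect",
        "connection.password": "secret",
        "mode": "incrementing",
        "incrementing.column.name": "id",
        "table.whitelist": "orders",
        "topic.prefix": "shop-",                       # rows land on topic 'shop-orders'
        "tasks.max": "1",
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector, timeout=10)
resp.raise_for_status()
print(resp.json()["name"], "registered")
```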
Aggregates as predictive insights: Aggregates, which consolidate data from various sources across your business environment, can serve as valuable predictors for machine learning (ML) algorithms. Event processing helps continuously update and refine our understanding of ongoing business scenarios.
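A toy version of that idea: fold each incoming event into per-key aggregates that can later feed an ML model as features. The event fields and keys below are made up for the example.

```python
# Sketch: maintain per-customer aggregates from an event stream for later use as features.
from collections import defaultdict

aggregates = defaultdict(lambda: {"order_count": 0, "total_spend": 0.0})

def update_features(event):
    """Fold one purchase event into the running aggregates for its customer."""
    agg = aggregates[event["customer_id"]]
    agg["order_count"] += 1
    agg["total_spend"] += event["amount"]
    agg["avg_order_value"] = agg["total_spend"] / agg["order_count"]
    return agg

events = [
    {"customer_id": "c1", "amount": 30.0},
    {"customer_id": "c1", "amount": 50.0},
    {"customer_id": "c2", "amount": 12.5},
]
for e in events:
    update_features(e)

print(aggregates["c1"])   # {'order_count': 2, 'total_spend': 80.0, 'avg_order_value': 40.0}
```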
How it’s implemented: In our quest to accurately determine shot speed during live matches, we’ve implemented a cutting-edge solution using Amazon Managed Streaming for Apache Kafka (Amazon MSK). We’ve implemented an AWS Lambda function with the specific task of retrieving the calculated shot speed from the relevant Kafka topic.
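For context, a Lambda function triggered by an Amazon MSK topic receives records base64-encoded and grouped per topic-partition. The handler below is a hedged sketch of that shape; the topic name and payload fields are illustrative, not the article's actual schema.

```python
# Hedged sketch of a Lambda handler on an Amazon MSK (Kafka) trigger.
import base64
import json

def handler(event, context):
    speeds = []
    for records in event.get("records", {}).values():   # keyed by "topic-partition"
        for record in records:
            payload = json.loads(base64.b64decode(record["value"]))
            speeds.append(payload.get("shot_speed_kmh"))  # hypothetical field name
    # A real implementation would persist these (e.g., to a database) instead of logging.
    print(f"received {len(speeds)} shot-speed readings: {speeds}")
    return {"processed": len(speeds)}
```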
The result is a machine learning (ML)-powered insight that allows fans to easily evaluate and compare the goalkeepers’ proficiencies. An ML model is trained through Amazon SageMaker, using data from four seasons of the first and second Bundesliga, encompassing all shots that landed on target (either resulting in a goal or being saved).
Managing unstructured data is essential for the success of machine learning (ML) projects. This article will discuss managing unstructured data for AI and ML projects. You will learn why unstructured data management is necessary for AI and ML projects, and how to properly manage unstructured data.
To ensure real-time updates of ball recovery times, we have implemented Amazon Managed Streaming for ApacheKafka (Amazon MSK) as a central solution for data streaming and messaging. A Lambda function retrieves all recovery times from the relevant Kafka topic and stores them in an Amazon Aurora Serverless database.
The focus of this investigation revolves around understanding their industry distribution, age demographics, developer types, and their adoption of various programming languages, databases, platforms, web frameworks, miscellaneous technologies, technical tools, new collaboration tools, and AI-powered search tools.
There are a number of tools that can help with streaming data collection and processing. Some popular ones include Apache Kafka: an open-source, distributed event streaming platform that can handle millions of events per second. It can be used to collect, store, and process streaming data in real time. Happy Learning!
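As a small companion to that description, the sketch below publishes collected events to a Kafka topic with the kafka-python client. The topic name, broker address, and event schema are assumptions.

```python
# Minimal sketch of publishing collected events to Kafka.
import json
import time
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",                 # assumed local broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for i in range(5):
    event = {"sensor_id": "s-1", "reading": 20.0 + i, "ts": time.time()}
    producer.send("sensor-readings", value=event)       # hypothetical topic

producer.flush()   # make sure buffered events reach the broker before exiting
```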
Database Extraction: Retrieval from structured databases using query languages like SQL. Common options include: Relational Databases: Structured storage supporting ACID transactions, suitable for structured data. NoSQL Databases: Flexible, scalable solutions for unstructured or semi-structured data.
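As a self-contained illustration of that extraction step, the snippet below uses sqlite3 as a stand-in for any relational source; the table, columns, and query are made up for the example.

```python
# Self-contained sketch of database extraction with SQL (sqlite3 as a stand-in source).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT, spend REAL)")
conn.executemany(
    "INSERT INTO customers (country, spend) VALUES (?, ?)",
    [("DE", 120.0), ("US", 75.5), ("DE", 40.0)],
)

# The extraction step: a parameterized query pulling only the rows a downstream job needs.
rows = conn.execute(
    "SELECT country, SUM(spend) FROM customers GROUP BY country HAVING SUM(spend) > ?",
    (50,),
).fetchall()
print(rows)   # e.g. [('DE', 160.0), ('US', 75.5)]
```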
It is used to extract data from various sources, transform the data to fit a specific data model or schema, and then load the transformed data into a target system such as a data warehouse or a database. The events can be published to a message broker such as Apache Kafka or Google Cloud Pub/Sub.
A typical data pipeline involves the following steps or processes through which the data passes before being consumed by a downstream process, such as an ML model training process. If a typical ML project involves standard pre-processing steps, why not make them reusable? Relational database connectors are available.
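One common way to make pre-processing reusable is to wrap it in a scikit-learn Pipeline, so the same steps run identically at training time and in later pipeline stages. The column names and toy dataset below are assumptions for the sketch.

```python
# Sketch: reusable pre-processing bundled with the model in a scikit-learn Pipeline.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "spend"]),                       # scale numeric columns
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["country"]),      # encode categoricals
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])

df = pd.DataFrame({
    "age": [25, 40, 31, 58],
    "spend": [120.0, 75.5, 40.0, 300.0],
    "country": ["DE", "US", "DE", "FR"],
    "churned": [0, 1, 0, 1],
})
model.fit(df[["age", "spend", "country"]], df["churned"])
print(model.predict(df[["age", "spend", "country"]]))
```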
It often involves specialized databases designed to handle this kind of atomic, temporal data. Technologies like Apache Kafka, often used in modern CDPs, use log-based approaches to stream customer events between systems in real-time. It’s precise but can impact database performance.
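To make the log-based idea concrete, here is a sketch of interpreting a change data capture (CDC) event. The envelope follows the common Debezium-style before/after/op convention; the field names and payload are illustrative and depend on the actual connector configuration.

```python
# Sketch: apply one log-based CDC change event to an in-memory view of a table.
import json

change_event = json.loads("""
{
  "op": "u",
  "before": {"customer_id": 42, "email": "old@example.com"},
  "after":  {"customer_id": 42, "email": "new@example.com"},
  "ts_ms": 1700000000000
}
""")

def apply_change(state, event):
    """Apply one change event to an in-memory view keyed by customer_id."""
    if event["op"] in ("c", "u", "r"):          # create, update, snapshot read
        row = event["after"]
        state[row["customer_id"]] = row
    elif event["op"] == "d":                    # delete
        state.pop(event["before"]["customer_id"], None)
    return state

view = apply_change({}, change_event)
print(view)   # {42: {'customer_id': 42, 'email': 'new@example.com'}}
```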
There comes a time when every ML practitioner realizes that training a model in a Jupyter Notebook is just one small part of the entire project. At that point, Data Scientists or ML Engineers become curious and start looking for end-to-end pipeline implementations. What are ML pipeline architecture design patterns?
New Big Data Concepts vs. Cloud-Delivered Databases? So, what has the emergence of cloud databases done to change big data? “Spark, TensorFlow, Apache Kafka, et cetera, are all found in cloud databases,” points out Jones. To improve ML and ethics, data literacy training is critical.
The rise of advanced technologies such as Artificial Intelligence (AI), Machine Learning (ML), and Big Data analytics is reshaping industries and creating new opportunities for Data Scientists. With data streaming platforms such as Apache Kafka, organisations can now analyse vast amounts of data as it is generated. Here are five key trends to watch.
Best Big Data Tools: Popular tools such as Apache Hadoop, Apache Spark, Apache Kafka, and Apache Storm enable businesses to store, process, and analyse data efficiently. Machine Learning Integration: Built-in ML capabilities streamline model development and deployment.
However, it lacked essential services required for machine learning (ML) applications, such as frontend and backend infrastructure, DNS, load balancers, scaling, blob storage, and managed databases. At that time, the application was deployed as a single monolithic container, which included Kafka and a database.