Apache Kafka, Blog and Data Lakes - Data Science Current

Apache Kafka

Blog

Data Lakes

Streaming Machine Learning Without a Data Lake

ODSC - Open Data Science

MAY 31, 2023

Be sure to check out his talk, “ Apache Kafka for Real-Time Machine Learning Without a Data Lake ,” there! The combination of data streaming and machine learning (ML) enables you to build one scalable, reliable, but also simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.

Data Lakes

Data Lakes Machine Learning Machine Learning Apache Kafka

Anomaly detection in streaming time series data with online learning using Amazon Managed Service for Apache Flink

AWS Machine Learning Blog

SEPTEMBER 11, 2024

It initially sources input time series data from Amazon Managed Streaming for Apache Kafka (Amazon MSK) using this live stream for model training. Post-training, the model continues to process incoming data points from the stream. It evaluates these points against the historical trends of the corresponding time series.

AWS

AWS ML ML Apache Kafka

Join 17,000+

professionals

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

Did Big Data Deliver Business Transformation & Improved CX?

Alation

AUGUST 4, 2022

And where data was available, the ability to access and interpret it proved problematic. Big data can grow too big fast. Left unchecked, data lakes became data swamps. Some data lake implementations required expensive ‘cleansing pumps’ to make them navigable again. Subscribe to Alation's Blog.

Big Data

Big Data Big Data Apache Kafka Data Lakes

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

What is Data Ingestion? Understanding the Basics

Pickl AI

JULY 25, 2024

From extracting information from databases and spreadsheets to ingesting streaming data from IoT devices and social media platforms, It’s the foundation upon which data-driven initiatives are built. Batch Processing In this method, data is collected over a period and then processed in groups or batches.

Apache Kafka

Apache Kafka Data Lakes Data Warehouse Data Quality

How Netflix Applies Big Data Across Business Verticals: Insights and Strategies

Pickl AI

SEPTEMBER 18, 2024

Introduction Netflix has transformed the entertainment landscape, not just through its vast library of content but also by leveraging Big Data across various business verticals. As a pioneer in the streaming industry, Netflix utilises advanced data analytics to enhance user experience, optimise operations, and drive strategic decisions.

Big Data

Big Data Big Data Apache Kafka Big Data Analytics

Introduction to Apache NiFi and Its Architecture

Pickl AI

JULY 30, 2024

With its user-friendly interface and robust architecture, NiFi simplifies the complexities of data integration, making it an essential component for modern data-driven enterprises. This blog delves into the fundamentals of Apache NiFi, its architecture, and how it can leverage for effective data flow management.

ETL

ETL Data Lakes Big Data Big Data

Big Data Syllabus: A Comprehensive Overview

Pickl AI

AUGUST 9, 2024

A well-structured syllabus for Big Data encompasses various aspects, including foundational concepts, technologies, data processing techniques, and real-world applications. This blog aims to provide a comprehensive overview of a typical Big Data syllabus, covering essential topics that aspiring data professionals should master.

Big Data

Big Data Big Data Big Data Analytics Big Data Analytics

Build Data Pipelines: Comprehensive Step-by-Step Guide

Pickl AI

JULY 8, 2024

Summary: This blog explains how to build efficient data pipelines, detailing each step from data collection to final delivery. Introduction Data pipelines play a pivotal role in modern data architecture by seamlessly transporting and transforming raw data into valuable insights.

Data Pipeline

Data Pipeline Data Quality Database Apache Kafka

Comparing Tools For Data Processing Pipelines

The MLOps Blog

MARCH 15, 2023

Data Processing : You need to save the processed data through computations such as aggregation, filtering and sorting. Data Storage : To store this processed data to retrieve it over time – be it a data warehouse or a data lake.

Data Pipeline

Data Pipeline ETL SQL Data Quality

Bundesliga Match Fact Keeper Efficiency: Comparing keepers’ performances objectively using machine learning on AWS

AWS Machine Learning Blog

MARCH 30, 2023

For every xSaves prediction, it produces a message with the prediction as a payload, which then gets distributed by a central message broker running on Amazon Managed Streaming for Apache Kafka (Amazon MSK). The information also gets stored in a data lake for future auditing and model improvements.

Machine Learning

Machine Learning Machine Learning AWS ML

Streaming Machine Learning Without a Data Lake

Anomaly detection in streaming time series data with online learning using Amazon Managed Service for Apache Flink

Webinars

Trending Sources

Did Big Data Deliver Business Transformation & Improved CX?

Webinars

What is Data Ingestion? Understanding the Basics

How Netflix Applies Big Data Across Business Verticals: Insights and Strategies

Introduction to Apache NiFi and Its Architecture

Big Data Syllabus: A Comprehensive Overview

Build Data Pipelines: Comprehensive Step-by-Step Guide

Comparing Tools For Data Processing Pipelines

Bundesliga Match Fact Keeper Efficiency: Comparing keepers’ performances objectively using machine learning on AWS

Stay Connected