Analytics and Apache Kafka - Data Science Current

Apache Kafka Architecture and Use Cases Explained

Analytics Vidhya

JULY 22, 2022

That’s why you need to know about Apache Kafka, a publish-subscribe messaging system you can use to build distributed applications. It is scalable and fault-tolerant, making […]. The post Apache Kafka Architecture and Use Cases Explained appeared first on Analytics Vidhya.

Apache Kafka

Apache Kafka Big Data Big Data Data Science

Handling Streaming Data with Apache Kafka – A First Look

Analytics Vidhya

JUNE 21, 2022

Streaming data is generated continuously by multiple data sources, such as sensors, server logs, and stock prices. These records are usually small and in the order […]. The post Handling Streaming Data with Apache Kafka – A First Look appeared first on Analytics Vidhya.

Apache Kafka

Apache Kafka Data Science Analytics Analytics

Apache Kafka Use Cases and Installation Guide

Analytics Vidhya

OCTOBER 3, 2022

As applications cover more aspects of our daily lives, it is increasingly difficult to provide users with a quick response. Caching is used to solve […]. The post Apache Kafka Use Cases and Installation Guide appeared first on Analytics Vidhya.

Apache Kafka

Apache Kafka Data Science Analytics Analytics

Exploring Partitions and Consumer Groups in Apache Kafka

Analytics Vidhya

AUGUST 2, 2022

Introduction: Earlier, I introduced the basic concepts of Apache Kafka in my blog on Analytics Vidhya (link is available under references). That article introduced the concepts involved in Apache Kafka and built on them by using the Python API of Kafka to write some […].

Apache Kafka

Apache Kafka Data Science Python Analytics

Introduction to Apache Kafka: Fundamentals and Working

Analytics Vidhya

DECEMBER 30, 2022

Introduction: Have you ever wondered how Instagram recommends similar reels while you scroll through your feed, or how Amazon shows ad recommendations for products similar to those you were browsing? The post Introduction to Apache Kafka: Fundamentals and Working appeared first on Analytics Vidhya.

Apache Kafka

Apache Kafka Data Science Analytics Analytics

Build a Scalable Data Pipeline with Apache Kafka

Analytics Vidhya

MARCH 10, 2023

Introduction: Apache Kafka is a distributed framework for handling many real-time data streams. It was created at LinkedIn and open-sourced in 2011.

Apache Kafka

Apache Kafka Data Pipeline Analytics Analytics

A Detailed Guide of Interview Questions on Apache Kafka

Analytics Vidhya

APRIL 28, 2023

Introduction: Apache Kafka is an open-source publish-subscribe messaging system initially developed at LinkedIn and open-sourced in early 2011. Written in Scala and Java, it is a widely used data processing tool that offers low latency, high throughput, and a unified platform for handling data in real time.

Apache Kafka

Apache Kafka Analytics Analytics Hadoop

Apache Kafka: A Metaphorical Introduction to Event Streaming for Data Scientists and Data Engineers

Analytics Vidhya

NOVEMBER 2, 2020

Overview: Learn about viewing data as streams of immutable events, in contrast to mutable containers. Understand how Apache Kafka captures real-time data through event […]. The post Apache Kafka: A Metaphorical Introduction to Event Streaming for Data Scientists and Data Engineers appeared first on Analytics Vidhya.

Apache Kafka

Apache Kafka Data Scientist Data Engineering Data Engineering

Creating a Data Science Pipeline for Real-Time Analytics Using Apache Kafka and Spark

KDnuggets

APRIL 1, 2025

This article explains how to create a system that processes data in real time using Apache Kafka and Spark.

Apache Kafka

Apache Kafka Data Science Analytics Analytics

Build a Simple Realtime Data Pipeline

Analytics Vidhya

SEPTEMBER 22, 2022

“Only knowledge that is used sticks in your mind.” - Dale Carnegie. Apache Kafka is a software framework for storing, reading, and analyzing streaming data. Internet of Things (IoT) devices can generate a large […]. The post Build a Simple Realtime Data Pipeline appeared first on Analytics Vidhya.

Data Pipeline

Data Pipeline Apache Kafka Internet of Things Data Science

Top 15 Big Data Softwares to Know About in 2023

Analytics Vidhya

JULY 12, 2023

Best Big Data software: Apache Hadoop, Apache Spark, Apache Kafka, Apache Storm, Apache Cassandra, Apache Hive, Zoho, and more.

Apache Kafka

Apache Kafka Big Data Big Data Apache Hadoop

Maximizing your event-driven architecture investments: Unleashing the power of Apache Kafka with IBM Event Automation

IBM Journey to AI blog

FEBRUARY 12, 2024

At the forefront of this event-driven revolution is Apache Kafka, the widely recognized and dominant open-source technology for event streaming. While most enterprises have already recognized how Apache Kafka provides a strong foundation for EDA, they often fall behind in unlocking its true potential.

Apache Kafka

Apache Kafka EDA SQL Database

Stream ingest data from Kafka to Amazon Bedrock Knowledge Bases using custom connectors

AWS Machine Learning Blog

APRIL 18, 2025

Solution overview: Build a generative AI stock price analyzer with RAG. For this post, we implement a RAG architecture with Amazon Bedrock Knowledge Bases using a custom connector and topics built with Amazon Managed Streaming for Apache Kafka (Amazon MSK) for a user who may be interested in understanding stock price trends.

Apache Kafka

Apache Kafka AWS Clustering Database

Apache Kafka and Apache Flink: An open-source match made in heaven

IBM Journey to AI blog

NOVEMBER 3, 2023

Apache Kafka and Apache Flink working together: Anyone who is familiar with the stream processing ecosystem is familiar with Apache Kafka, the de facto enterprise standard for open-source event streaming. With Apache Kafka, you get a raw stream of events from everything that is happening within your business.

Apache Kafka

Apache Kafka Data Warehouse Data Pipeline Big Data

Apache Kafka use cases: Driving innovation across diverse industries

IBM Journey to AI blog

SEPTEMBER 4, 2024

Apache Kafka is an open-source, distributed streaming platform that allows developers to build real-time, event-driven applications. With Apache Kafka, developers can build applications that continuously use streaming data records and deliver real-time experiences to users. How does Apache Kafka work?

Apache Kafka

Apache Kafka Internet of Things Data Pipeline Clustering

Hybrid Vs. Multi-Cloud: 5 Key Comparisons in Kafka Architectures

Smart Data Collective

AUGUST 17, 2022

You can safely use an Apache Kafka cluster for seamless data movement from the on-premises hardware solution to the data lake using various cloud services such as Amazon S3. 5 Key Comparisons in Different Apache Kafka Architectures.

Apache Kafka

Apache Kafka ETL Data Lakes AWS

22 Widely Used Data Science and Machine Learning Tools in 2020

Analytics Vidhya

JUNE 27, 2020

Overview: There are a plethora of data science tools out there. Which one should you pick up? Here’s a list of over 20. The post 22 Widely Used Data Science and Machine Learning Tools in 2020 appeared first on Analytics Vidhya.

Data Science

Data Science Machine Learning Machine Learning Analytics

Streaming Machine Learning Without a Data Lake

ODSC - Open Data Science

MAY 31, 2023

Be sure to check out his talk, “Apache Kafka for Real-Time Machine Learning Without a Data Lake,” there! The combination of data streaming and machine learning (ML) enables you to build one scalable, reliable, and simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.

Data Lakes

Data Lakes Machine Learning Machine Learning Apache Kafka

Complex Event Processing (CEP)

Dataconomy

MARCH 11, 2025

Complex Event Processing (CEP) is at the forefront of modern analytics, enabling organizations to extract valuable insights from vast streams of real-time data. Real-time data management: The importance of real-time data in today’s analytics landscape cannot be overstated.

Apache Kafka

Apache Kafka Machine Learning Machine Learning Data Mining

Building the future of construction analytics: CONXAI’s AI inference on Amazon EKS

AWS Machine Learning Blog

FEBRUARY 7, 2025

Users can add and manage new cameras, view footage, perform analytical searches, and enforce GDPR compliance with automatic person anonymization. It is backed by Amazon Managed Streaming for Apache Kafka (Amazon MSK) (8). The next important step is to use these model results with proper analytics and data science.

Analytics

Analytics Analytics AWS Clustering

9 Must-Have Skills to Become a Data Engineer!

Analytics Vidhya

DECEMBER 4, 2020

Overview: Know the top 9 skills required to be a data engineer. Find suitable resources to learn about these tools. By no […]. The post 9 Must-Have Skills to Become a Data Engineer! appeared first on Analytics Vidhya.

Data Engineering

Data Engineering Data Engineering Data Engineering Data Engineer

Data sips and bites: An evening of data insights

Dataconomy

JULY 29, 2024

Talks and insights Mikhail Epikhin: Navigating the processor landscape for Apache Kafka Mikhail Epikhin began the session by sharing his team’s research on optimizing Managed Service for Apache Kafka. His presentation focused on the performance and efficiency of different instance types and processor architectures.

Apache Kafka

Apache Kafka Data Pipeline Data Warehouse ETL

Building a Pizza Delivery Service with a Real-Time Analytics Stack

ODSC - Open Data Science

JUNE 1, 2023

Be sure to check out his talk, “Building a Real-time Analytics Application for a Pizza Delivery Service,” there! Gartner defines Real-Time Analytics as follows: Real-time analytics is the discipline that applies logic and mathematics to data to provide insights for making better decisions quickly.

Analytics

Analytics Analytics Apache Kafka Data Science

Major Differences: Kafka vs RabbitMQ

Pickl AI

MARCH 13, 2025

Two of the most popular message brokers are RabbitMQ and Apache Kafka. In this blog, we will explore RabbitMQ vs Kafka, their key differences, and when to use each. Kafka excels in real-time data streaming and scalability. RabbitMQ uses a push-based model, while Kafka follows a pull-based model.

Apache Kafka

Apache Kafka Big Data Big Data Data Pipeline
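
The push-versus-pull distinction in the snippet above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `PullLog` and `PushQueue` are hypothetical classes written for this illustration and are not part of the Kafka or RabbitMQ client APIs. In the pull style the consumer tracks its own offset and decides when to read; in the push style the broker invokes subscribers as soon as a message arrives.

```python
from typing import Callable, List


class PullLog:
    """Toy append-only log (hypothetical): consumers request records by offset."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, record: str) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def poll(self, offset: int, max_records: int = 10) -> List[str]:
        # The consumer chooses when to read and how many records to take.
        return self._records[offset:offset + max_records]


class PushQueue:
    """Toy queue (hypothetical): the broker delivers each message on arrival."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> None:
        # The broker drives delivery; subscribers do not ask for data.
        for callback in self._subscribers:
            callback(message)


# Pull style: the consumer advances its own offset between polls.
log = PullLog()
for event in ["a", "b", "c"]:
    log.append(event)
offset = 0
batch = log.poll(offset, max_records=2)
offset += len(batch)
rest = log.poll(offset)

# Push style: the subscriber receives the message without polling.
received: List[str] = []
queue = PushQueue()
queue.subscribe(received.append)
queue.publish("x")
```

Because the pull-style consumer owns its offset, it can re-read old records or slow down without the broker's involvement, which is one reason a pull model is often associated with replayable streams.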

Top Big Data Tools Every Data Professional Should Know

Pickl AI

FEBRUARY 23, 2025

Ultimately, leveraging Big Data analytics provides a competitive advantage and drives innovation across various industries. Competitive Advantage Organisations that leverage Big Data Analytics can stay ahead of the competition by anticipating market trends and consumer preferences.

Big Data

Big Data Big Data Apache Hadoop Apache Kafka

Best Data Engineering Tools Every Engineer Should Know

Pickl AI

MARCH 19, 2025

Python, SQL, and Apache Spark are essential for data engineering workflows. Real-time data processing with Apache Kafka enables faster decision-making. Apache Spark Apache Spark is a powerful data processing framework that efficiently handles Big Data. The global Big Data and data engineering market, valued at $75.55

Data Engineering

Data Engineering Data Engineering Data Engineering Data Engineer

Enhanced diagnostics flow with LLM and Amazon Bedrock agent integration

Flipboard

JUNE 3, 2025

Solution overview The Noodoe AI-enhanced diagnostics flow is built on a multi-step process that combines data collection, AI-powered analytics, and seamless translation for global accessibility, as illustrated in the following figure. Read on to discover how AI is transforming EV charging management.

AWS

AWS Apache Kafka Database AI

How to Unlock Real-Time Analytics with Snowflake?

phData

MAY 3, 2024

Leveraging real-time analytics to make informed decisions is the gold standard for virtually every business that collects data. If you have the Snowflake Data Cloud (or are considering migrating to Snowflake), you’re a blog away from taking a step closer to real-time analytics. Why Pursue Real-Time Analytics for Your Organization?

Apache Kafka

Apache Kafka Analytics Analytics ETL

Event-driven architecture (EDA) enables a business to become more aware of everything that’s happening, as it’s happening

IBM Journey to AI blog

JANUARY 8, 2024

They often use Apache Kafka as an open technology and the de facto standard for accessing events from various core systems and applications. IBM provides an Event Streams capability built on Apache Kafka that makes events manageable across an entire enterprise.

EDA

EDA Apache Kafka Clustering Data Governance

What Are AI Credits and How Can Data Scientists Use Them?

ODSC - Open Data Science

APRIL 23, 2025

Confluent: Confluent provides a robust data streaming platform built around Apache Kafka. […] empower teams to explore use cases in real-time analytics, personalized recommendations, and search. With AI credits, teams can streamline the annotation process using intelligent suggestions and quality control mechanisms.

Data Scientist

Data Scientist Azure Apache Kafka ML

Big Data – Lambda or Kappa Architecture?

Data Science Blog

JUNE 27, 2023

Big Data Analytics stands apart from conventional data processing in its fundamental nature. It receives batch views from the batch layer and near-real-time views from the speed layer, utilizing this data to facilitate standard reporting and ad hoc analytics.

Big Data

Big Data Big Data Apache Kafka Database
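
The serving-layer behaviour described in the snippet above (combining batch views with near-real-time views) can be sketched as a simple merge of two precomputed views. This is a minimal illustration, assuming both views are page-view counts keyed by page name; the function name `serve_query` and the sample data are hypothetical and not taken from the article.

```python
from collections import Counter
from typing import Dict


def serve_query(batch_view: Dict[str, int], speed_view: Dict[str, int]) -> Dict[str, int]:
    """Merge a batch view with a speed-layer view by summing counts per key.

    Assumption: the batch view covers all data up to the last batch run, and
    the speed view covers only events that arrived after that run.
    """
    merged = Counter(batch_view)
    merged.update(speed_view)  # Counter.update adds counts rather than replacing them
    return dict(merged)


batch_view = {"home": 100, "cart": 40}      # recomputed periodically from the master dataset
speed_view = {"home": 3, "checkout": 1}     # incrementally updated from recent events
result = serve_query(batch_view, speed_view)
```

Summing works here because counts are additive; other aggregates, such as distinct users, would need a different merge rule.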

Streaming Data Pipelines: What Are They and How to Build One

Precisely

DECEMBER 28, 2023

More than ever, advanced analytics, ML, and AI are providing the foundation for innovation, efficiency, and profitability. It also allows for applications, analytics, and reporting to process information as it happens. One very popular platform is Apache Kafka, a powerful open-source tool used by thousands of companies.

Data Pipeline

Data Pipeline Apache Kafka Big Data Big Data

Apache Flink for all: Making Flink consumable across all areas of your business

IBM Journey to AI blog

AUGUST 29, 2024

Apache Flink takes raw events and processes them, making them more relevant in the broader business context. The unique advantages of Apache Flink: Apache Flink augments event streaming technologies like Apache Kafka to enable businesses to respond to events more effectively in real time.

Apache Kafka

Apache Kafka Hadoop ETL Data Pipeline

Big data engineering simplified: Exploring roles of distributed systems

Data Science Dojo

JULY 24, 2023

Apache Flink for stream processing: Wrapping up In conclusion, stream processing with distributed systems like Apache Kafka, Apache Flink, and Apache Spark Streaming empowers organizations to harness real-time data insights, enabling timely decision-making and enhanced user experiences.

Big Data

Big Data Big Data Data Engineering Data Engineering

Navigating the Big Data Frontier: A Guide to Efficient Handling

Women in Big Data

OCTOBER 9, 2024

After this, the data is analyzed, business logic is applied, and it is processed for further analytical tasks like visualization or machine learning. Data Ingestion: Data is collected and funneled into the pipeline using batch or real-time methods, leveraging tools like Apache Kafka, AWS Kinesis, or custom ETL scripts.

Big Data

Big Data Big Data Apache Kafka Data Pipeline

The Rise of Streaming Data and Its Cost Efficiency – How Did We Get Here?

insideBIGDATA

JUNE 25, 2024

In this contributed article, Sijie Guo, Founder and CEO of Streamnative, believes that with remote work entrenched in the post-pandemic enterprise, organizations are restructuring their technology stack and software strategy for a new, distributed workforce.

Apache Kafka

Apache Kafka Big Data Big Data Analytics

Use streaming ingestion with Amazon SageMaker Feature Store and Amazon MSK to make ML-backed decisions in near-real time

AWS Machine Learning Blog

APRIL 19, 2023

Streaming ingestion – An Amazon Kinesis Data Analytics for Apache Flink application backed by Apache Kafka topics in Amazon Managed Streaming for Apache Kafka (Amazon MSK) calculates aggregated features from a transaction stream, and an AWS Lambda function updates the online feature store.

ML

ML ML Apache Kafka SQL
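
To make "aggregated features from a transaction stream" concrete, here is a small sliding-window sketch in plain Python. It does not use Flink, MSK, or the SageMaker Feature Store API; the class `WindowedFeatures`, the 10-minute window, and the feature names `txn_count` and `avg_amount` are all assumptions chosen for illustration, and timestamps are assumed to arrive in order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class WindowedFeatures:
    """Per-card sliding-window aggregates (hypothetical, in-memory only)."""

    def __init__(self, window_seconds: int = 600) -> None:
        self.window_seconds = window_seconds
        # card_id -> queue of (timestamp, amount), oldest first
        self._events: Dict[str, Deque[Tuple[int, float]]] = defaultdict(deque)

    def update(self, card_id: str, timestamp: int, amount: float) -> Dict[str, float]:
        events = self._events[card_id]
        events.append((timestamp, amount))
        # Evict transactions that have fallen out of the window.
        while events and events[0][0] <= timestamp - self.window_seconds:
            events.popleft()
        total = sum(a for _, a in events)
        return {"txn_count": len(events), "avg_amount": total / len(events)}


features = WindowedFeatures(window_seconds=600)
first = features.update("card-1", timestamp=0, amount=10.0)
second = features.update("card-1", timestamp=300, amount=30.0)
third = features.update("card-1", timestamp=700, amount=50.0)  # the t=0 event has expired
```

In a deployment like the one the snippet describes, the dictionary returned by each update would be what gets written to the online feature store; here it is simply returned to the caller.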

Real-time fraud detection using AWS serverless and machine learning services

AWS Machine Learning Blog

MARCH 10, 2023

The same architecture applies if you use Amazon Managed Streaming for Apache Kafka (Amazon MSK) as a data streaming service. You can use this metadata in your data analytics solutions, machine learning model training tasks, or visualizations and dashboards that consume transaction data. An example use case is claims processing.

Machine Learning

Machine Learning Machine Learning AWS Apache Kafka

Architecting Real-Time Analytics for Speed and Scale

Dataversity

JUNE 30, 2023

If Netflix takes too long to load or the nearest Lyft is too far, users are quick to switch to alternative options. The demand for instant results is not limited […]. The post Architecting Real-Time Analytics for Speed and Scale appeared first on DATAVERSITY.

Analytics

Analytics Analytics Apache Kafka Database

Real-time artificial intelligence and event processing

IBM Journey to AI blog

NOVEMBER 29, 2023

Stream analytics can be used to help improve the speed and accuracy of models’ predictions. IBM Event Automation is a fully composable solution, built on open technologies, with capabilities for: Event streaming: Collect and distribute raw streams of real-time business events with enterprise-grade Apache Kafka.

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Apache Kafka AI

11 Open-Source Data Engineering Tools Every Pro Should Use

ODSC - Open Data Science

FEBRUARY 6, 2024

Apache Kafka: For data engineers dealing with real-time data, Apache Kafka is a game-changer. Data Orchestration and Workflow Management, Apache Airflow: Apache Airflow is renowned for its ability to build and schedule complex data pipelines.

Data Engineering

Data Engineering Data Engineering Data Engineering Data Engineer

Top 15 Data Analytics Projects in 2023 for beginners to Experienced

Pickl AI

JULY 20, 2023

Top 15 Data Analytics Projects in 2023 for Beginners to Experienced Levels: Data Analytics Projects allow aspirants in the field to display their proficiency to employers and acquire job roles. However, you might be looking for a guide to help you understand the different types of Data Analytics projects you may undertake.

Analytics

Analytics Analytics Big Data Big Data

How Netflix Applies Big Data Across Business Verticals: Insights and Strategies

Pickl AI

SEPTEMBER 18, 2024

As a pioneer in the streaming industry, Netflix utilises advanced data analytics to enhance user experience, optimise operations, and drive strategic decisions. Data in Motion Technologies like Apache Kafka facilitate real-time processing of events and data, allowing Netflix to respond swiftly to user interactions and operational needs.

Big Data

Big Data Big Data Apache Kafka Big Data Analytics
