AI and Apache Kafka - Data Science Current

VAST Data Adds Blocks to Unified Storage Platform

insideBIGDATA

FEBRUARY 19, 2025

VAST also added the VAST Event Broker, an Apache Kafka-compatible event streaming service for real-time data ingestion and […]

Apache Kafka, AI

Supernovas, Black Holes and Streaming Data

databricks

AUGUST 12, 2024

The blog explores data streams from NASA satellites using Apache Kafka and Databricks. It demonstrates ingestion and transformation with Delta Live Tables in SQL and AI/BI-powered analysis of supernova events.

Apache Kafka, SQL, AI

Maximizing your event-driven architecture investments: Unleashing the power of Apache Kafka with IBM Event Automation

IBM Journey to AI blog

FEBRUARY 12, 2024

At the forefront of this event-driven revolution is Apache Kafka, the widely recognized and dominant open-source technology for event streaming. While most enterprises have already recognized how Apache Kafka provides a strong foundation for EDA, they often fall behind in unlocking its true potential.

Apache Kafka, EDA, SQL, Database

Apache Kafka and Apache Flink: An open-source match made in heaven

IBM Journey to AI blog

NOVEMBER 3, 2023

Apache Kafka and Apache Flink working together: Anyone who is familiar with the stream processing ecosystem is familiar with Apache Kafka: the de-facto enterprise standard for open-source event streaming. With Apache Kafka, you get a raw stream of events from everything that is happening within your business.

Apache Kafka, Data Warehouse, Data Pipeline, Big Data

Apache Kafka use cases: Driving innovation across diverse industries

IBM Journey to AI blog

SEPTEMBER 4, 2024

Apache Kafka is an open-source, distributed streaming platform that allows developers to build real-time, event-driven applications. With Apache Kafka, developers can build applications that continuously use streaming data records and deliver real-time experiences to users. How does Apache Kafka work?

Apache Kafka, Internet of Things, Data Pipeline, Clustering

Real-Time Sentiment Analysis with Kafka and PySpark

Towards AI

FEBRUARY 29, 2024

Last updated on February 29, 2024 by Editorial Team. Author(s): Hira Akram. Originally published on Towards AI. Within this article, we will explore the significance of these pipelines and utilise robust tools such as Apache Kafka and Spark to manage vast streams of data efficiently.

Apache Kafka, SQL, Clustering, Data Pipeline

Streaming Machine Learning Without a Data Lake

ODSC - Open Data Science

MAY 31, 2023

Be sure to check out his talk, “Apache Kafka for Real-Time Machine Learning Without a Data Lake,” there! The combination of data streaming and machine learning (ML) enables you to build one scalable, reliable, but also simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.

Data Lakes, Machine Learning, Apache Kafka

What Are AI Credits and How Can Data Scientists Use Them?

ODSC - Open Data Science

APRIL 23, 2025

In today's fast-moving machine learning and AI landscape, access to top-tier tools and infrastructure is a game-changer for any data science team. That's why AI credits, vouchers that grant free or discounted access to cloud services and machine learning platforms, are increasingly valuable. AI Credit Partners: Who's Offering What?

Data Scientist, Azure, Apache Kafka, ML

The winning combination for real-time insights: Messaging and event-driven architecture

IBM Journey to AI blog

APRIL 2, 2024

However, IBM MQ and Apache Kafka can sometimes be viewed as competitors, taking each other on in terms of speed, availability, cost and skills. MQ and Apache Kafka: Teammates. Simply put, they are different technologies with different strengths, albeit often perceived to be quite similar. Interested in learning more?

Apache Kafka, Clustering, SQL, AI

Level up your Kafka applications with schemas

IBM Journey to AI blog

NOVEMBER 21, 2023

Apache Kafka is a well-known open-source event store and stream processing platform and has grown to become the de facto standard for data streaming. Apache Kafka transfers data without validating the information in the messages.
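Because the broker passes messages through without inspecting them, any validation has to happen in the producing or consuming application. The following is a minimal sketch of that idea using only the Python standard library. The `ORDER_SCHEMA` fields and the `validate` and `serialize_if_valid` helpers are hypothetical names invented for illustration; they are not part of Kafka or of any schema registry product.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
ORDER_SCHEMA = {"order_id": str, "amount": float, "currency": str}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return problems


def serialize_if_valid(record: dict, schema: dict) -> bytes:
    """Encode a record as JSON bytes, refusing anything that breaks the schema."""
    problems = validate(record, schema)
    if problems:
        raise ValueError("; ".join(problems))
    # These bytes are what a producer would hand to its Kafka client.
    return json.dumps(record, sort_keys=True).encode("utf-8")


good = {"order_id": "A-1", "amount": 12.5, "currency": "EUR"}
bad = {"order_id": "A-2", "amount": "12.5"}

payload = serialize_if_valid(good, ORDER_SCHEMA)
bad_problems = validate(bad, ORDER_SCHEMA)
```

Checking on the producer side keeps malformed records out of the topic altogether, so consumers do not each need their own defensive parsing.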

Apache Kafka, Clustering, Data Quality, Data Governance

Accelerate your speed of business with IBM Event Automation

IBM Journey to AI blog

MAY 9, 2023

IBM Event Automation provides an intuitive and integrated experience for distributing, discovering and processing business events across the organization: Event distribution: Collect raw streams of real-time business events with enterprise-grade Apache Kafka.

Apache Kafka, Business Intelligence

Event-driven architecture (EDA) enables a business to become more aware of everything that’s happening, as it’s happening

IBM Journey to AI blog

JANUARY 8, 2024

They often use Apache Kafka as an open technology and the de facto standard for accessing events from various core systems and applications. IBM provides an Event Streams capability built on Apache Kafka that makes events manageable across an entire enterprise.

EDA, Apache Kafka, Clustering, Data Governance

Real-time artificial intelligence and event processing

IBM Journey to AI blog

NOVEMBER 29, 2023

By leveraging AI for real-time event processing, businesses can connect the dots between disparate events to detect and respond to new trends, threats and opportunities. AI and event processing: a two-way street. An event-driven architecture is essential for accelerating the speed of business.

Artificial Intelligence, Apache Kafka, AI

Apache Flink for all: Making Flink consumable across all areas of your business

IBM Journey to AI blog

AUGUST 29, 2024

The unique advantages of Apache Flink: Apache Flink augments event streaming technologies like Apache Kafka to enable businesses to respond to events more effectively in real time. Integration: Integrates seamlessly with other data systems and platforms, including Apache Kafka, Spark, Hadoop and various databases.

Apache Kafka, Hadoop, ETL, Data Pipeline

Streaming Data Pipelines: What Are They and How to Build One

Precisely

DECEMBER 28, 2023

More than ever, advanced analytics, ML, and AI are providing the foundation for innovation, efficiency, and profitability. One very popular platform is Apache Kafka, a powerful open-source tool used by thousands of companies. But in all likelihood, Kafka doesn’t natively connect with the applications that contain your data.

Data Pipeline, Apache Kafka, Big Data

Five scalability pitfalls to avoid with your Kafka application

IBM Journey to AI blog

NOVEMBER 9, 2023

Apache Kafka is a high-performance, highly scalable event streaming platform. To unlock Kafka’s full potential, you need to carefully consider the design of your application. It’s all too easy to write Kafka applications that perform poorly or eventually hit a scalability brick wall.

Apache Kafka, Algorithm, Clustering

Anomaly detection in streaming time series data with online learning using Amazon Managed Service for Apache Flink

AWS Machine Learning Blog

SEPTEMBER 11, 2024

This solution employs machine learning (ML) for anomaly detection, and doesn’t require users to have prior AI expertise. It initially sources input time series data from Amazon Managed Streaming for Apache Kafka (Amazon MSK) using this live stream for model training. Product Manager for the Amazon SageMaker service.

AWS, ML, Apache Kafka

11 Open-Source Data Engineering Tools Every Pro Should Use

ODSC - Open Data Science

FEBRUARY 6, 2024

Apache Kafka: For data engineers dealing with real-time data, Apache Kafka is a game-changer. Data Orchestration and Workflow Management, Apache Airflow: Apache Airflow is renowned for its ability to build and schedule complex data pipelines.

Data Engineer, Data Engineering

Machine Learning with MATLAB and Amazon SageMaker

Flipboard

NOVEMBER 21, 2023

For this particular use case, you can use streaming ingestion with Amazon SageMaker Feature Store and Amazon Managed Streaming for Apache Kafka, MSK, to make machine learning-backed decisions in near real-time. Shun Mao is a Senior AI/ML Partner Solutions Architect in the Emerging Technologies team at Amazon Web Services.

Machine Learning, AWS, Decision Trees

Use streaming ingestion with Amazon SageMaker Feature Store and Amazon MSK to make ML-backed decisions in near-real time

AWS Machine Learning Blog

APRIL 19, 2023

Streaming ingestion – An Amazon Kinesis Data Analytics for Apache Flink application backed by Apache Kafka topics in Amazon Managed Streaming for Apache Kafka (Amazon MSK) calculates aggregated features from a transaction stream, and an AWS Lambda function updates the online feature store.
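To make "aggregated features from a transaction stream" concrete, here is a minimal sketch of a per-key sliding-window aggregation in plain Python. It is not the Flink application the post describes: the `WindowedFeatures` class, the 10-minute window, the `card_id` key, and the feature names are all assumptions chosen for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 600  # assumed 10-minute window, chosen for illustration


class WindowedFeatures:
    """Keep per-card transactions from the last `window_seconds` seconds."""

    def __init__(self, window_seconds: int) -> None:
        self.window_seconds = window_seconds
        # card_id -> deque of (timestamp, amount), oldest first
        self._events = defaultdict(deque)

    def update(self, card_id: str, timestamp: int, amount: float) -> dict:
        """Add one transaction and return the refreshed window features."""
        events = self._events[card_id]
        events.append((timestamp, amount))
        # Evict transactions that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while events and events[0][0] <= cutoff:
            events.popleft()
        total = sum(amount for _, amount in events)
        return {"count": len(events), "total": total, "avg": total / len(events)}


features = WindowedFeatures(WINDOW_SECONDS)
f1 = features.update("card-1", 0, 10.0)
f2 = features.update("card-1", 300, 30.0)
f3 = features.update("card-1", 700, 20.0)  # the event at t=0 is evicted here
```

In a deployment like the one described, each returned dictionary would be the record written to the online feature store; here it is simply returned to the caller.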

ML, Apache Kafka, SQL

Getting started with Kafka client metrics

IBM Journey to AI blog

MARCH 14, 2024

Apache Kafka stands as a widely recognized open source event store and stream processing platform. One key advantage of opting for managed Kafka services is the delegation of responsibility for broker and operational metrics, allowing users to focus solely on metrics specific to applications.

Apache Kafka, Data Pipeline

Building a Pizza Delivery Service with a Real-Time Analytics Stack

ODSC - Open Data Science

JUNE 1, 2023

We’re going to assume that the pizza service already captures orders in Apache Kafka and is also keeping a record of its customers and the products that they sell in MySQL. Apache Pinot is a real-time OLAP database built at LinkedIn to deliver scalable real-time analytics with low latency.

Analytics, Apache Kafka, Data Science

All of the Free Virtual Sessions Coming to ODSC Europe 2023

ODSC - Open Data Science

JUNE 7, 2023

If you are unable to join us at the Tobacco Dock this June, we’ll be hosting lots of engaging training sessions, workshops, and talks on our virtual platform. Check out our confirmed sessions below. Get your ODSC Europe Virtual passes today and get ready to advance your career with new skills and connections.

Apache Kafka, Machine Learning, Data Science

Pictures and Highlights from ODSC Europe 2023

ODSC - Open Data Science

JULY 22, 2023

The week was filled with engaging sessions on top topics in data science, innovation in AI, and smiling faces that we haven’t seen in a while. We’re a few weeks removed from ODSC Europe 2023 and we couldn’t have left on a better note. Keynotes Our main keynote sessions were held on the virtual side of the conference.

Apache Kafka, Machine Learning, Data Science

Transitioning off Amazon Lookout for Metrics

AWS Machine Learning Blog

OCTOBER 9, 2024

Customers can use the CloudFormation template to bring up an application stack that receives time-series data from an Amazon Managed Streaming for Apache Kafka (Amazon MSK) streaming source and performs near-real-time anomaly detection in the streaming data. About the Author Nirmal Kumar is Sr.

AWS, ML, Data Quality

How Netflix Applies Big Data Across Business Verticals: Insights and Strategies

Pickl AI

SEPTEMBER 18, 2024

Data in Motion: Technologies like Apache Kafka facilitate real-time processing of events and data, allowing Netflix to respond swiftly to user interactions and operational needs. Data at Rest: This includes storage solutions such as S3 Data Warehouse and Cassandra. What Technologies Does Netflix Use for Its Big Data Infrastructure?

Big Data, Apache Kafka, Big Data Analytics

Bundesliga Match Facts Shot Speed – Who fires the hardest shots in the Bundesliga?

AWS Machine Learning Blog

NOVEMBER 3, 2023

How it’s implemented: In our quest to accurately determine shot speed during live matches, we’ve implemented a cutting-edge solution using Amazon Managed Streaming for Apache Kafka (Amazon MSK). He is passionate about enabling customers on their data and artificial intelligence (AI) journey to the cloud.

AWS, Apache Kafka, Data Scientist, Data Science

Bundesliga Match Fact Ball Recovery Time: Quantifying teams’ success in pressing opponents on AWS

AWS Machine Learning Blog

MARCH 30, 2023

To ensure real-time updates of ball recovery times, we have implemented Amazon Managed Streaming for Apache Kafka (Amazon MSK) as a central solution for data streaming and messaging. He follows his passion for a broad range of sports, music, and AI in his spare time.

AWS, Machine Learning, Apache Kafka

Why Software Engineers Should Be Embracing AI: A Guide to Staying Ahead

ODSC - Open Data Science

OCTOBER 9, 2024

The rapid evolution of AI is transforming nearly every industry/domain, and software engineering is no exception. Well, the thing is that AI technologies are doing a few things. If you’re not leveraging AI yet, it’s time to start. At West, you’ll learn even more about AI’s role in reshaping software engineering.

Apache Kafka, AI, Machine Learning

Watch the Top ODSC Europe 2023 Virtual Sessions Here

ODSC - Open Data Science

JULY 14, 2023

AI and Bias: How to Detect It and How to Prevent It. Sandra Wachter, PhD | Professor, Technology and Regulation | Oxford Internet Institute, University of Oxford. In recognition of the extensive biases and inequality that are present in training data, there has been much work done to test for bias in machine learning and AI systems.

Machine Learning, Apache Kafka, Data Science

Predicting the Future of Data Science

Pickl AI

DECEMBER 4, 2024

Summary: The future of Data Science is shaped by emerging trends such as advanced AI and Machine Learning, augmented analytics, and automated processes. Key Takeaways: AI and Machine Learning will advance significantly, enhancing predictive capabilities across industries. Here are five key trends to watch.

Data Science, Data Scientist, Machine Learning

Unlock the knowledge in your Slack workspace with Slack connector for Amazon Q Business

AWS Machine Learning Blog

OCTOBER 9, 2024

Amazon Q Business is a fully managed, generative AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. With RAG, generative AI enhances its responses by incorporating relevant information retrieved from a curated dataset.

AWS, Apache Kafka, Data Scientist, Database Administration

Why your event-driven architecture needs advanced event governance

IBM Journey to AI blog

AUGUST 22, 2024

In recognizing the benefits of event-driven architectures, many companies have turned to Apache Kafka for their event streaming needs. Apache Kafka enables scalable, fault-tolerant and real-time processing of streams of data—but how do you manage and properly utilize the sheer amount of data your business ingests every second?

EDA, Apache Kafka, Clustering

How to Manage Unstructured Data in AI and Machine Learning Projects

DagsHub

OCTOBER 23, 2024

This article will discuss managing unstructured data for AI and ML projects. You will learn the following: Why unstructured data management is necessary for AI and ML projects. How to leverage Generative AI to manage unstructured data. Benefits of applying proper unstructured data management processes to your AI/ML project.

Machine Learning, Data Lakes, AI

How Thomson Reuters delivers personalized content subscription plans at scale using Amazon Personalize

AWS Machine Learning Blog

JANUARY 6, 2023

TR has been an early adopter of ML with Amazon SageMaker, and their maturity in the AI/ML domain meant that they had collated a significant dataset of relevant data within a data warehouse, which the team could train a personalization model with. Applied AI Specialist Architect at AWS. Vamshi Krishna Enabothala is a Sr.

AWS, Data Warehouse, ML

Did Big Data Deliver Business Transformation & Improved CX?

Alation

AUGUST 4, 2022

“Spark, Tensorflow, Apache Kafka, et cetera, are all out found in cloud databases,” points out Jones. We also need to “learn about both better AI/ML/analysis tools and understanding the implicit and explicit biases that exist within them.” “[You can] see that it works before going all-in.”

Big Data, Apache Kafka, Data Lakes

What is Data Ingestion? Understanding the Basics

Pickl AI

JULY 25, 2024

Apache Kafka: An open-source platform designed for real-time data streaming. Popular options include Apache Kafka for real-time streaming, Apache Spark for batch and stream processing, Talend for ETL, and cloud-based solutions like AWS Glue, Azure Data Factory, and Google Cloud Dataflow.

Apache Kafka, Data Lakes, Data Warehouse, Data Quality

IBM continues to support OpenSource AsyncAPI in breaking the boundaries of event driven architectures

IBM Journey to AI blog

JULY 12, 2024

With its intuitive UI, it makes it easy to produce a valid AsyncAPI document for any Kafka cluster or system that adheres to the Apache Kafka protocol. One of the key benefits of event endpoint management is that it allows you to describe events in a standardized way according to the AsyncAPI specification.

Apache Kafka, Clustering

Unveiling Developers’ Technologies and Tools Usage in Large and Small and Medium-sized Enterprises…

Mlearning.ai

AUGUST 4, 2023

The focus of this investigation revolves around understanding their industry distribution, age demographics, developer types, and their adoption of various programming languages, databases, platforms, web frameworks, miscellaneous technologies, technical tools, new collaboration tools, and AI-powered search tools. This is followed by Bing AI.

Database, Apache Kafka, SQL, AI

Exploring Database Management Systems in Social Media Giants

Pickl AI

OCTOBER 21, 2024

In response, Twitter has implemented various solutions, including Apache Kafka, a distributed streaming platform that helps manage the data flow from user interactions. Using Kafka, Twitter can effectively handle high-throughput data streams, enabling users to receive timely notifications and updates.

Database, Apache Kafka, Machine Learning

Discover the Most Important Fundamentals of Data Engineering

Pickl AI

NOVEMBER 4, 2024

Among these tools, Apache Hadoop, Apache Spark, and Apache Kafka stand out for their unique capabilities and widespread usage. Apache Hadoop: Hadoop is a powerful framework that enables distributed storage and processing of large data sets across clusters of computers.

Data Engineer, Data Engineering

A Simple Guide to Real-Time Data Ingestion

Pickl AI

JULY 24, 2023

Utilising data streaming platforms such as Apache Kafka, Apache Flink, or Apache Spark Streaming, data is gathered from many sources and processed in real-time or close to real-time.

Internet of Things, Apache Kafka, ETL, Azure

A Comprehensive Guide to the main components of Big Data

Pickl AI

DECEMBER 2, 2024

Real-time processing allows organisations to make timely decisions based on current data rather than relying on historical information. Technologies enabling real-time analytics include: Stream Processing Frameworks: Tools like Apache Kafka facilitate the continuous ingestion and processing of streaming data.

Big Data, Data Lakes, Apache Hadoop
