2011 and Clustering - Data Science Current

Build a Scalable Data Pipeline with Apache Kafka

Analytics Vidhya

MARCH 10, 2023

Introduction: Apache Kafka is a distributed framework for handling many real-time data streams. It was created at LinkedIn and released to the public in 2011.

Apache Kafka, Data Pipeline, Analytics

A Detailed Guide of Interview Questions on Apache Kafka

Analytics Vidhya

APRIL 28, 2023

Introduction: Apache Kafka is an open-source publish-subscribe messaging system, initially developed at LinkedIn and open-sourced in early 2011. It is a widely used data processing tool, written largely in Scala and Java, that offers low latency, high throughput, and a unified platform for handling data in real time.

Apache Kafka, Analytics, Hadoop

The Simple Magic of Consistent Hashing (2011)

Hacker News

SEPTEMBER 22, 2024

Here you have a number of nodes in a cluster of databases, or in a cluster of web caches. How do you figure out where the data for a particular key goes in that cluster? The simplicity of consistent hashing is pretty mind-blowing.
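
The question the teaser poses, which node owns a given key, can be sketched in a few lines. This is a minimal illustration rather than the linked article's code: the `HashRing` class, the `replicas` parameter, the node names, and the choice of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points placed per physical node
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `replicas` points for this node on the ring."""
        for i in range(self.replicas):
            point = ring_hash(f"{node}#{i}")
            if point in self._owner:  # skip the (very unlikely) collision
                continue
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        """Remove every point that belongs to this node."""
        for i in range(self.replicas):
            point = ring_hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.pop(bisect.bisect_left(self._points, point))

    def get_node(self, key: str) -> str:
        """Return the node owning the first point clockwise from the key."""
        if not self._points:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._points, ring_hash(key))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.get_node("user:42"))
```

The property that makes this useful is that adding a node only moves keys onto the new node, and removing a node only moves the keys it owned; everything else stays where it was.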

Clustering, Database


OCP Summit 2024: The open future of networking hardware for AI

Hacker News

OCTOBER 15, 2024

At the Open Compute Project (OCP) Summit 2024, we’re sharing details about our next-generation network fabric for our AI training clusters. DSF: a scheduled fabric that is disaggregated and open. Network performance and availability play an important role in extracting the best performance out of our AI training clusters.

Clustering, AI, ML

Understanding Reinforcement Learning and Multi-Agent Systems: A Beginner’s Guide to MARL (Part 1)

Towards AI

MARCH 11, 2025

Author(s): Arthur Kakande. Originally published on Towards AI. When we learn from labeled data, we call it supervised learning. When we learn by grouping similar items, we call it clustering. When we learn by observing rewards or gains, we call it reinforcement learning.

Algorithm, Supervised Learning, Clustering, AI

Unraveling the Web: Navigating Databases in Web Technology

Towards AI

APRIL 22, 2024

Unlike relational databases, NoSQL databases do not conform to a predefined schema, operate in distributed clusters, and store unstructured or semi-structured data; they are therefore also referred to as non-relational databases. NewSQL databases: NewSQL is a modern form of relational database system that sits between SQL and NoSQL.

Database, SQL, Clustering, Big Data

What Is Retrieval-Augmented Generation?

Hacker News

NOVEMBER 15, 2023

IBM’s Watson became a TV celebrity in 2011 when it handily beat two human champions on the Jeopardy! […] When complete, the work, which ran on a cluster of NVIDIA GPUs, showed how to make generative AI models more authoritative and trustworthy. Today, LLMs are taking question-answering systems to a whole new level.

Database, AI, Natural Language Processing

Understanding earthquakes: what map visualizations teach us

Cambridge Intelligence

NOVEMBER 8, 2023

[…] Tōhoku earthquake in the Pacific Ocean, which caused the 2011 tsunami. […] in 2010-2011, we use the time bar sliders to select what we want. Earthquake data as a geospatial visualization: a map view reveals what we’d expect, that the largest clusters of earthquakes occur where tectonic plates meet. We can filter by time, too.

Data Visualization, Clustering, Database, Data Models

From text to dream job: Building an NLP-based job recommender at Talent.com with Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 23, 2023

Founded in 2011, Talent.com is one of the world’s largest sources of employment. The company combines paid job listings from its clients with public job listings into a single searchable platform. […] In the SMDDP architecture (as shown in the following figure), distributed training is also scaled using a cluster of many nodes.

AWS, Deep Learning, Machine Learning

Open source data visualization options: we compare 5 tools

Cambridge Intelligence

FEBRUARY 20, 2025

[…] D3 makes sense for media organizations such as The New York Times […] where a single graphic may be seen by a million readers (d3js.org). History: First created by Stanford alumni and released in 2011. Popular open source data visualization options: D3 network graph tools (D3.js)

Data Visualization, Algorithm, Data Analyst, Clustering

Question answering using Retrieval Augmented Generation with foundation models in Amazon SageMaker JumpStart

AWS Machine Learning Blog

MAY 2, 2023

There are a few limitations of using off-the-shelf pre-trained LLMs: They’re usually trained offline, making the model agnostic to the latest information (for example, a chatbot trained from 2011–2018 has no information about COVID-19). They’re mostly trained on general domain corpora, making them less effective on domain-specific tasks.

Algorithm, Machine Learning, Natural Language Processing

How spaCy Works

Explosion

FEBRUARY 18, 2015

The tutorial also recommends the use of Brown cluster features and case normalization features, as these make the model more robust and domain independent. […] The following tweaks: I use Brown cluster features (these help a lot); I redesigned the feature set. […] I think this is still the best approach, so it’s what I implemented in spaCy.

Algorithm, Python, Clustering

Major Differences: Kafka vs RabbitMQ

Pickl AI

MARCH 13, 2025

Since its launch in 2011, Kafka has become a leader in event-driven architectures, powering large-scale distributed systems across industries. RabbitMQ runs on multiple nodes in a cluster, ensuring high availability and system reliability. It also offers plug-ins to expand its features, making it adaptable for different business needs.

Apache Kafka, Big Data, Data Pipeline
