Or think about a real-time facial recognition system that must match a face in a crowd to a database of thousands. This is where Approximate Nearest Neighbor (ANN) search algorithms come into play. Imagine a database with billions of samples (e.g., product specifications, movie metadata, documents, etc.).
You can then run searches for the top K documents in an index that are most similar to a given query vector, which could be a question, keyword, or content (such as an image, audio clip, or text) that has been encoded by the same ML model. To learn more, refer to the documentation.
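A minimal sketch of such a top-K vector search using the opensearch-py client; the endpoint, credentials, index name docs-index, and the embedding field name are illustrative assumptions, not taken from the documentation referenced above.

```python
# Minimal top-K vector search sketch (assumed index/field names and endpoint).
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],  # placeholder endpoint
    http_auth=("user", "password"),  # placeholder credentials
    use_ssl=True,
)

def top_k_documents(query_vector, k=5):
    """Return the k documents whose stored embeddings are closest to the query vector."""
    body = {
        "size": k,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": k}}},
    }
    response = client.search(index="docs-index", body=body)
    return [hit["_source"] for hit in response["hits"]["hits"]]
```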
This centralized system consolidates a wide range of data sources, including detailed reports, FAQs, and technical documents. The system integrates structured data, such as tables containing product properties and specifications, with unstructured text documents that provide in-depth product descriptions and usage guidelines.
It works by analyzing the visual content to find similar images in its database. Store embeddings: Ingest the generated embeddings into an OpenSearch Serverless vector index, which serves as the vector database for the solution. Display results: Display the top K similar results to the user. Images are base64-encoded before they are embedded, for example with b64encode(resized_image).decode('utf-8').
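Building on the b64encode fragment above, here is a minimal sketch of the embed-and-store step, assuming boto3 access to the Titan Multimodal Embeddings model; the model ID, index name, and field names are illustrative and not taken from the original post.

```python
# Sketch: base64-encode a resized image, embed it, then index the embedding.
import base64
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed_image(resized_image_bytes):
    """Base64-encode a resized image and request its multimodal embedding."""
    payload = {"inputImage": base64.b64encode(resized_image_bytes).decode("utf-8")}
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",  # assumed Titan Multimodal Embeddings model ID
        body=json.dumps(payload),
    )
    return json.loads(response["body"].read())["embedding"]

# The embedding would then be ingested into the vector index, e.g.:
# aoss_client.index(index="image-embeddings",
#                   body={"embedding": embed_image(img_bytes), "s3_uri": uri})
```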
The Retrieval-Augmented Generation (RAG) framework augments prompts with external data from multiple sources, such as document repositories, databases, or APIs, to make foundation models effective for domain-specific tasks. Set up the database access and network access.
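As a rough sketch of that augmentation pattern, the function below prepends retrieved context to the prompt before calling a model; retrieve and generate are hypothetical callables standing in for a document store and a foundation model client, not part of any specific service.

```python
# Sketch of RAG-style prompt augmentation with placeholder retrieve/generate callables.
def answer_with_rag(question, retrieve, generate, k=3):
    """Augment the prompt with retrieved context before calling the model."""
    passages = retrieve(question, k=k)  # external data: document repos, databases, APIs
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt)
```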
Such data often lacks the specialized knowledge contained in internal documents available in modern businesses, which is typically needed to get accurate answers in domains such as pharmaceutical research, financial investigation, and customer support. For example, imagine that you are planning next year’s strategy for an investment company.
One of the most critical applications for LLMs today is Retrieval Augmented Generation (RAG), which enables AI models to ground responses in enterprise knowledge bases such as PDFs, internal documents, and structured data. Vector database: FloTorch selected Amazon OpenSearch Service as a vector database for its high-performance metrics.
This post shows you how to set up RAG using DeepSeek-R1 on Amazon SageMaker with an OpenSearch Service vector database as the knowledge base. You will create a connector to SageMaker with Amazon Titan Text Embeddings V2 to create embeddings for a set of documents with population statistics. Examine the code in run_rag.py.
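A minimal sketch of the embedding step, assuming Bedrock runtime access to Amazon Titan Text Embeddings V2; this is an illustration of generating embeddings for the document set, not the actual contents of run_rag.py.

```python
# Sketch: embed a passage or question with Titan Text Embeddings V2 via Bedrock.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed_text(text):
    """Return the embedding vector for a piece of text."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # assumed Titan Text Embeddings V2 model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]
```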
OpenSearch Service allows you to store vectors and other data types in an index, and offers rich functionality for searching documents using vectors and measuring their semantic relatedness, which we use in this post. Using the k-nearest neighbors (k-NN) algorithm, you define how many images to return in your results.
These included document translations, inquiries about IDIADA's internal services, file uploads, and other specialized requests. This approach allows for tailored responses and processes for different types of user needs, whether it's a simple question, a document translation, or a complex inquiry about IDIADA's services.
We stored the embeddings in a vector database and then used the Large Language-and-Vision Assistant (LLaVA 1.5-7b) model to generate text responses to user questions based on the most similar slide retrieved from the vector database. The OSI pipeline ingests the data as documents into an OpenSearch Serverless index.
Approximate Nearest Neighbor with Locality Sensitive Hashing (LSH): What Is Locality Sensitive Hashing (LSH)? Another example is in the field of text document similarity. Imagine you have a vast library of documents and want to identify near-duplicate documents or find documents similar to a query document.
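A minimal random-hyperplane LSH sketch of this idea; the embedding dimension, the number of hyperplanes, and the assumption that documents are already represented as vectors are illustrative choices, not the article's implementation.

```python
# Sketch: random-hyperplane LSH. Vectors that land in the same bucket are
# candidate near-duplicates and are compared exactly; everything else is skipped.
import numpy as np

rng = np.random.default_rng(0)
n_planes, dim = 16, 384              # illustrative sizes
planes = rng.normal(size=(n_planes, dim))  # random hyperplanes

def lsh_bucket(vector):
    """Hash a document vector to a bucket id from the signs of its projections."""
    bits = (planes @ vector) > 0
    return int("".join("1" if b else "0" for b in bits), 2)
```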
We use OpenSearch Serverless as a vector database for storing embeddings generated by the Titan Multimodal Embeddings model. In the user interaction phase, a question from the user is converted into embeddings and a similarity search is run on the vector database to find a slide that could potentially contain answers to the user's question.
Classification algorithms include logistic regression, k-nearest neighbors, and support vector machines (SVMs), among others. Association algorithms allow data scientists to identify associations between data objects inside large databases, facilitating data visualization and dimensionality reduction.
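As a small side-by-side illustration of those classifiers, the snippet below assumes scikit-learn, default hyperparameters, and a toy dataset; it is not tied to any particular pipeline from the articles.

```python
# Sketch: compare logistic regression, k-NN, and an SVM on a toy dataset.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
for model in (LogisticRegression(max_iter=1000), KNeighborsClassifier(), SVC()):
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
    print(type(model).__name__, round(scores.mean(), 3))
```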
You can integrate existing data from AWS data lakes, Amazon Simple Storage Service (Amazon S3) buckets, or Amazon Relational Database Service (Amazon RDS) instances with services such as Amazon Bedrock and Amazon Q. The Asynchronous Request Handler function stores results in a DynamoDB database along with the generated requestId.
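As a rough illustration of that persistence step, the sketch below writes a result item keyed on the generated requestId with boto3; the table name and attribute names are assumptions, not taken from the actual handler.

```python
# Sketch: store an asynchronous result in DynamoDB under its requestId.
import boto3

table = boto3.resource("dynamodb").Table("async-results")  # assumed table name

def store_result(request_id, result):
    """Persist the generated result so it can be fetched later by requestId."""
    table.put_item(Item={"requestId": request_id, "result": result})
```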
You store the embeddings of the video frame as a k-nearest neighbors (k-NN) vector in your OpenSearch Service index with a reference to the video clip and the frame in the S3 bucket itself (Step 3). You split the video files into frames and save them in an S3 bucket (Step 1).
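A minimal sketch of what such an index mapping could look like for Step 3; the index name, field names, and the 1,024-dimension embedding are illustrative rather than the post's actual configuration.

```python
# Sketch: OpenSearch index mapping holding frame embeddings plus S3 references.
frame_index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "frame_embedding": {"type": "knn_vector", "dimension": 1024},  # assumed dimension
            "video_s3_uri": {"type": "keyword"},  # reference to the video clip in S3
            "frame_s3_uri": {"type": "keyword"},  # reference to the frame image in S3
        }
    },
}
# client.indices.create(index="video-frames", body=frame_index_body)
```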
OpenSearch Service offers k-NN search, which can enhance search in use cases such as product recommendations, fraud detection, image and video search, and specific semantic scenarios like document and query similarity. Solution overview.
Key Components of Data Science: Data Science consists of several key components that work together to extract meaningful insights from data. Data Collection: This involves gathering relevant data from various sources, such as databases, APIs, and web scraping. Data Cleaning: Raw data often contains errors, inconsistencies, and missing values.
The second post outlines how to work with multiple data formats such as structured data (tables, databases) and images. Broadly speaking, a retriever is a module that takes a query as input and outputs documents relevant to that query from one or more knowledge sources.
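A minimal sketch of that retriever interface, with embed and search callables standing in for whatever embedding model and vector store back the knowledge sources; this illustrates the pattern rather than the posts' implementation.

```python
# Sketch: a retriever takes a query and returns relevant documents.
from typing import Callable, List

class Retriever:
    def __init__(self, embed: Callable[[str], List[float]], search):
        self.embed = embed    # text -> embedding vector
        self.search = search  # (embedding, k) -> documents from the knowledge source

    def retrieve(self, query: str, k: int = 5) -> List[str]:
        """Return the k documents most relevant to the query."""
        return self.search(self.embed(query), k)
```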
We stored the embeddings in a vector database and then used the Large Language-and-Vision Assistant (LLaVA 1.5-7b) model to generate text responses to user questions based on the most similar slide retrieved from the vector database. SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images, pp. 13636-13645.
Databases to be migrated can have a wide range of data representations and contents, including types such as XML, CLOB, and BLOB. For the sake of argument, let's ignore the fact that the use of such data types in databases is justified only in a few specific cases; this problem often arises when migrating complex systems.