Algorithm, Document and ML - Data Science Current

Intelligent Document Processing with Azure Form Recognizer

Analytics Vidhya

MARCH 29, 2023

Introduction: Intelligent document processing (IDP) is a technology that uses artificial intelligence (AI) and machine learning (ML) to automatically extract information from unstructured documents such as invoices, receipts, and forms.

Azure, Artificial Intelligence, Machine Learning

Your guide to generative AI and ML at AWS re:Invent 2024

AWS Machine Learning Blog

NOVEMBER 19, 2024

This year, generative AI and machine learning (ML) will again be in focus, with exciting keynote announcements and a variety of sessions showcasing insights from AWS experts, customer stories, and hands-on experiences with AWS services. Visit the session catalog to learn about all our generative AI and ML sessions.

AWS, ML, AI

Automating complex document processing: How Onity Group built an intelligent solution using Amazon Bedrock

AWS Machine Learning Blog

MAY 20, 2025

In the mortgage servicing industry, efficient document processing can mean the difference between business growth and missed opportunities. Onity processes millions of pages across hundreds of document types annually, including legal documents such as deeds of trust where critical information is often contained within dense text.

AWS, ML, AI

Build an AI-powered document processing platform with open source NER model and LLM on Amazon SageMaker

Flipboard

APRIL 23, 2025

Traditional keyword-based search mechanisms are often insufficient for locating relevant documents efficiently, requiring extensive manual review to extract meaningful insights. This solution improves the findability and accessibility of archival records by automating metadata enrichment, document classification, and summarization.

AWS, ML, AI

Intelligent document processing

Dataconomy

APRIL 30, 2025

Intelligent document processing (IDP) is transforming the way businesses manage their documentation and data management processes. By harnessing the power of emerging technologies, organizations can automate the extraction and handling of data from various document types, significantly enhancing operational workflows.

Natural Language Processing, Machine Learning, ML

How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod

AWS Machine Learning Blog

MAY 15, 2025

The banking industry has long struggled with the inefficiencies associated with repetitive processes such as information extraction, document review, and auditing. To further enhance the capabilities of specialized information extraction solutions, advanced ML infrastructure is essential.

AWS, ML, Machine Learning

Are Model Explanations Useful in Practice? Rethinking How to Support Human-ML Interactions.

ML @ CMU

MARCH 31, 2023

Our work further motivates novel directions for developing and evaluating tools to support human-ML interactions. Model explanations have been touted as crucial information to facilitate human-ML interactions in many real-world applications where end users make decisions informed by ML predictions.

ML, Algorithm, Machine Learning

AI/ML model validation

Dataconomy

APRIL 2, 2025

AI/ML model validation plays a crucial role in the development and deployment of machine learning and artificial intelligence systems. What is AI/ML model validation? AI/ML model validation is a systematic process that ensures the reliability and accuracy of machine learning and artificial intelligence models.

ML, Machine Learning

Precise Software Solutions implements ML as a service on AWS to save time and money for federal agency

Flipboard

JANUARY 6, 2025

The platform helped the agency digitize and process forms, pictures, and other documents. The federal government agency Precise worked with needed to automate manual processes for document intake and image processing. The demand for modernization is growing, and Precise can help government agencies adopt AI/ML technologies.

AWS, ML, Machine Learning

OpenSearch Vector Engine is now disk-optimized for low cost, accurate vector search

Flipboard

JANUARY 24, 2025

Overview of vector search and the OpenSearch Vector Engine: Vector search is a technique that improves search quality by enabling similarity matching on content that has been encoded by machine learning (ML) models into vectors (numerical encodings). These benchmarks aren't designed for evaluating ML models.

K-nearest Neighbors, ML, Algorithm

Elevating ML to new heights with distributed learning

Dataconomy

MAY 22, 2023

Machine learning is a branch of artificial intelligence that focuses on developing algorithms and models that can learn from data and make predictions or decisions without being explicitly programmed. There are various types of machine learning algorithms, including supervised learning, unsupervised learning, and reinforcement learning.

ML, Machine Learning

Media Production with AI: 7 Fields of Creativity in the Industry

Data Science Dojo

SEPTEMBER 25, 2024

By leveraging AI-powered algorithms, media producers can improve production processes and enhance creativity. Some key benefits of integrating the production process with AI are as follows: Personalization: AI algorithms can analyze user data to offer personalized recommendations for movies, TV shows, and music.

AI, Algorithm, Artificial Intelligence

Enhancing Search Relevancy with Cohere Rerank 3.5 and Amazon OpenSearch Service

Flipboard

DECEMBER 18, 2024

improves search results for best matching 25 (BM25), a keyword-based algorithm that performs lexical search, in addition to semantic search. Lexical search relies on exact keyword matching between the query and documents. For a natural language query searching for super hero toys, it retrieves documents containing those exact terms.

K-nearest Neighbors, AWS, ML

Exploring All Types of Machine Learning Algorithms

Pickl AI

JANUARY 21, 2025

Summary: Machine Learning algorithms enable systems to learn from data and improve over time. These algorithms are integral to applications like recommendations and spam detection, shaping our interactions with technology daily. These intelligent predictions are powered by various Machine Learning algorithms.

Machine Learning, Algorithm, Decision Trees

Master Vector Embeddings with Weaviate – A Comprehensive Series for You!

Data Science Dojo

JANUARY 22, 2025

Here's how embeddings power these advanced systems: Semantic Understanding: LLMs use embeddings to represent words, sentences, and entire documents in a way that captures their semantic meaning. The process enables the models to find the most relevant sections of a document or dataset, improving the accuracy and relevance of their outputs.

Database, ML, AI

A comprehensive comparison of RPA and ML

Dataconomy

MARCH 27, 2023

However, while RPA and ML share some similarities, they differ in functionality, purpose, and the level of human intervention required. In this article, we will explore the similarities and differences between RPA and ML and examine their potential use cases in various industries. What is machine learning (ML)?

ML, Machine Learning

Reproducible AI

Dataconomy

APRIL 14, 2025

Reproducible AI refers to the capability to duplicate machine learning (ML) processes accurately, ensuring consistent outcomes as initially intended. Consistency across ML pipelines: Maintaining consistency in data across ML workflows is essential. Strategies to control or document random seeds can mitigate these effects.

AI, ML

Cost-effective document classification using the Amazon Titan Multimodal Embeddings Model

AWS Machine Learning Blog

APRIL 11, 2024

Organizations across industries want to categorize and extract insights from high volumes of documents of different formats. Manually processing these documents to classify and extract information remains expensive, error prone, and difficult to scale. Categorizing documents is an important first step in IDP systems.

Database, AWS, Algorithm, ML

Techniques for automatic summarization of documents using language models

Flipboard

DECEMBER 6, 2023

The model then uses a clustering algorithm to group the sentences into clusters. Implementation includes the following steps: The first step is to break down the large document, such as a book, into smaller sections, or chunks. It works by first embedding the sentences in the text using BERT.

AWS, Clustering, Artificial Intelligence

John Snow Labs Medical LLMs are now available in Amazon SageMaker JumpStart

AWS Machine Learning Blog

NOVEMBER 25, 2024

You can try out the models with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms, models, and ML solutions so you can quickly get started with ML. To learn more, refer to the API documentation. You can change these configurations by specifying non-default values in JumpStartModel.

AWS, ML, Machine Learning

Establishing an AI/ML center of excellence

AWS Machine Learning Blog

MAY 9, 2024

The rapid advancements in artificial intelligence and machine learning (AI/ML) have made these technologies a transformative force across industries. An effective approach that addresses a wide range of observed issues is the establishment of an AI/ML center of excellence (CoE). What is an AI/ML CoE?

ML, AI

Exploring alternatives and seamlessly migrating data from Amazon Lookout for Vision

AWS Machine Learning Blog

OCTOBER 10, 2024

Amazon Lookout for Vision, the AWS service designed to create customized artificial intelligence and machine learning (AI/ML) computer vision models for automated quality inspection, will be discontinued on October 31, 2025.

AWS, Machine Learning, ML

Process formulas and charts with Anthropic’s Claude on Amazon Bedrock

AWS Machine Learning Blog

MARCH 21, 2025

Research papers and engineering documents often contain a wealth of information in the form of mathematical formulas, charts, and graphs. Navigating these unstructured documents to find relevant information can be a tedious and time-consuming task, especially when dealing with large volumes of data.

AWS, AI, Data Scientist

Navigating tomorrow: Role of AI and ML in information technology

Dataconomy

FEBRUARY 6, 2024

With the ability to analyze a vast amount of data in real-time, identify patterns, and detect anomalies, AI/ML-powered tools are enhancing the operational efficiency of businesses in the IT sector. Why does AI/ML deserve to be the future of the modern world? Let’s understand the crucial role of AI/ML in the tech industry.

ML, Machine Learning

Syngenta develops a generative AI assistant to support sales representatives using Amazon Bedrock Agents

Flipboard

DECEMBER 3, 2024

As a global leader in agriculture, Syngenta has led the charge in using data science and machine learning (ML) to elevate customer experiences with an unwavering commitment to innovation. Efficient metadata storage with Amazon DynamoDB – To support quick and efficient data retrieval, document metadata is stored in Amazon DynamoDB.

AWS, AI, Machine Learning

Fine-tune multimodal models for vision and text use cases on Amazon SageMaker JumpStart

AWS Machine Learning Blog

NOVEMBER 15, 2024

This significant improvement showcases how the fine-tuning process can equip these powerful multimodal AI systems with specialized skills for excelling at understanding and answering natural language questions about complex, document-based visual information. For a detailed walkthrough on fine-tuning the Meta Llama 3.2

ML, Python, AWS

Package and deploy classical ML and LLMs easily with Amazon SageMaker, part 1: PySDK Improvements

Flipboard

NOVEMBER 30, 2023

Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and effortlessly build, train, and deploy machine learning (ML) models at any scale. Deploy traditional models to SageMaker endpoints: In the following examples, we showcase how to use ModelBuilder to deploy traditional ML models.

ML, AWS, Python

AI/ML-driven actionable insights and themes for Amazon third-party sellers using AWS

Flipboard

MARCH 7, 2023

This post presents a solution that uses a workflow and AWS AI and machine learning (ML) services to provide actionable insights based on those transcripts. We use multiple AWS AI/ML services, such as Contact Lens for Amazon Connect and Amazon SageMaker, and utilize a combined architecture.

ML, AWS, AI

Enhance Your LLM Agents with BM25: Lightweight Retrieval That Works

Towards AI

APRIL 28, 2025

Prerequisites Before diving in, you should have: Basic AI/ML understanding: concepts like language models, embeddings, and model inference. Models like Sentence Transformers map words, sentences, or documents into high-dimensional vectors. It scores documents based on: 1. Author(s): Syed Affan Originally published on Towards AI.

Python, Database, AI

Machine learning in software testing

Dataconomy

MAY 9, 2025

As traditional testing methods evolve, integrating advanced technologies like machine learning (ML) offers a new frontier for improving testing processes. Machine learning, in the context of software testing, refers to the application of algorithms that enable systems to learn from data and improve their performance over time.

Machine Learning, ML

LLM Agents Underscore One Truth: Data Is The Real Differentiator.

Towards AI

NOVEMBER 8, 2024

We don’t have better algorithms; we just have more data. In ML engineering, data quality isn’t just critical; it’s foundational. Yet, this perspective often gets sidelined and there was never a consensus in the ML community about it. Because of how ML practitioners were initially trained.

ML, Data Quality, Algorithm

Run small language models cost-efficiently with AWS Graviton and Amazon SageMaker AI

Flipboard

JUNE 5, 2025

Amazon SageMaker AI provides a fully managed service for deploying these machine learning (ML) models with multiple inference options, allowing organizations to optimize for cost, latency, and throughput. invocations is the endpoint that receives client inference POST requests. The format of the request and the response is up to the algorithm.

AWS, AI, ML

Centralize model governance with SageMaker Model Registry Resource Access Manager sharing

AWS Machine Learning Blog

NOVEMBER 14, 2024

We recently announced the general availability of cross-account sharing of Amazon SageMaker Model Registry using AWS Resource Access Manager (AWS RAM), making it easier to securely share and discover machine learning (ML) models across your AWS accounts.

AWS, ML, Machine Learning

Boosting RAG-based intelligent document assistants using entity extraction, SQL querying, and agents with Amazon Bedrock

AWS Machine Learning Blog

DECEMBER 6, 2023

Such data often lacks the specialized knowledge contained in internal documents available in modern businesses, which is typically needed to get accurate answers in domains such as pharmaceutical research, financial investigation, and customer support. For example, imagine that you are planning next year’s strategy of an investment company.

SQL, AWS, Analytics

Accelerating UMAP: Processing 10 Million Records in Under a Minute With No Code Changes

ODSC - Open Data Science

JUNE 6, 2025

cuML brings GPU acceleration to UMAP and HDBSCAN, in addition to scikit-learn algorithms. It dramatically improves algorithm performance for data-intensive tasks involving tens to hundreds of millions of records. To test drive cuML, try this notebook in Colab, and make sure to select a GPU runtime before getting started.

Clustering, Machine Learning, Algorithm

Achieve rapid time-to-value business outcomes with faster ML model training using Amazon SageMaker Canvas

AWS Machine Learning Blog

MARCH 3, 2023

Machine learning (ML) can help companies make better business decisions through advanced analytics. Companies across industries apply ML to use cases such as predicting customer churn, demand forecasting, credit scoring, predicting late shipments, and improving manufacturing quality.

ML, Machine Learning

Build cost-effective RAG applications with Binary Embeddings in Amazon Titan Text Embeddings V2, Amazon OpenSearch Serverless, and Amazon Bedrock Knowledge Bases

AWS Machine Learning Blog

NOVEMBER 18, 2024

Amazon Titan Text Embeddings models generate meaningful semantic representations of documents, paragraphs, and sentences. It supports exact and approximate nearest-neighbor algorithms and multiple storage and matching engines. He is focused on OpenSearch Serverless and has years of experience in networking, security and AI/ML.

K-nearest Neighbors, AWS, ML

Solve forecasting challenges for the retail and CPG industry using Amazon SageMaker Canvas

AWS Machine Learning Blog

JANUARY 21, 2025

In this post, we show you how Amazon Web Services (AWS) helps in solving forecasting challenges by customizing machine learning (ML) models for forecasting. This visual, point-and-click interface democratizes ML so users can take advantage of the power of AI for various business applications. One of these methods is quantiles.

ML, Algorithm, AWS

Principal Financial Group uses QnABot on AWS and Amazon Q Business to enhance workforce productivity with generative AI

AWS Machine Learning Blog

NOVEMBER 15, 2024

Principal wanted to use existing internal FAQs, documentation, and unstructured data and build an intelligent chatbot that could provide quick access to the right information for different roles. As Principal grew, its internal support knowledge base considerably expanded.

AWS, AI, Machine Learning

The innovators behind intelligent machines: A look at ML engineers

Dataconomy

MAY 2, 2023

They design, develop, and deploy the machine learning algorithms that power everything from self-driving cars to personalized recommendations. They also develop algorithms that are utilized to sort through relevant data, and scale predictive models to best suit the amount of data pertinent to the business. They build the future.

ML, Machine Learning

Google Research, 2022 & beyond: Algorithms for efficient deep learning

Google Research AI blog

FEBRUARY 7, 2023

The explosion in deep learning a decade ago was catapulted in part by the convergence of new algorithms and architectures, a marked increase in data, and access to greater compute. Below, we highlight a panoply of works that demonstrate Google Research’s efforts in developing new algorithms to address the above challenges.

Deep Learning, Algorithm, ML

Improve Amazon Nova migration performance with data-aware prompt optimization

AWS Machine Learning Blog

APRIL 29, 2025

The following example shows how prompt optimization converts a typical prompt for a summarization task on Anthropic's Claude Haiku into a well-structured prompt for an Amazon Nova model, with sections that begin with special markdown tags such as ## Task, ### Summarization Instructions, and ### Document to Summarize.

AWS, ML, AI

Master Data Annotation in LLMs: A Key to Smarter and Powerful AI!

Data Science Dojo

FEBRUARY 6, 2025

These models are trained using vast datasets and powered by sophisticated algorithms. Data annotation is the process of labeling data to make it understandable and usable for machine learning (ML) models. Legal documents, medical records, or scientific papers need experts who understand the terminology.

AI, ML

Gemma 3 27B model now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

AWS Machine Learning Blog

MAY 28, 2025

The second approach is using SageMaker JumpStart, a machine learning (ML) hub, with foundation models (FMs), built-in algorithms, and pre-built ML solutions. This resource includes integration examples, API documentation, and programming samples. Filter for Gemma as the provider and choose Gemma 3 27B Instruct.

AWS, ML, AI
