AI, Data Lakes and Database - Data Science Current

Streaming Machine Learning Without a Data Lake

ODSC - Open Data Science

MAY 31, 2023

Be sure to check out his talk, “Apache Kafka for Real-Time Machine Learning Without a Data Lake,” there! The combination of data streaming and machine learning (ML) enables you to build one scalable, reliable, but also simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.

Data Lakes

Data Lakes Machine Learning Machine Learning Apache Kafka

Data Integrity for AI: What’s Old is New Again

Precisely

JANUARY 9, 2025

Artificial Intelligence (AI) is all the rage, and rightly so. By now most of us have experienced how Gen AI and the LLMs (large language models) that fuel it are primed to transform the way we create, research, collaborate, engage, and much more. Can AI’s responses be trusted? A data lake! Can it do it without bias?

Data Warehouse

Data Warehouse Hadoop Data Lakes Data Governance

Dremio Revolutionizes Lakehouse Analytics with Breakthrough Autonomous Performance Enhancements

insideBIGDATA

AUGUST 28, 2024

Dremio, the unified lakehouse platform for self-service analytics and AI, announced a breakthrough in data lake analytics performance capabilities, extending its leadership in self-optimizing, autonomous Iceberg data management.

Analytics

Analytics Analytics Data Lakes AI

Best Practices for Data Lake Security

ODSC - Open Data Science

JUNE 22, 2023

While databases were the traditional way to store large amounts of data, a new storage method has developed that can store even more significant and varied amounts of data. These are called data lakes. What Are Data Lakes? However, even digital information has to be stored somewhere.

Data Lakes

Data Lakes Data Warehouse Database Data Science

Data Version Control for Data Lakes: Handling the Changes in Large Scale

ODSC - Open Data Science

SEPTEMBER 27, 2023

In the ever-evolving world of big data, managing vast amounts of information efficiently has become a critical challenge for businesses across the globe. As data lakes gain prominence as a preferred solution for storing and processing enormous datasets, the need for effective data version control mechanisms becomes increasingly evident.

Data Lakes

Data Lakes Data Warehouse Database Big Data

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Flipboard

DECEMBER 11, 2024

Many of these applications are complex to build because they require collaboration across teams and the integration of data, tools, and services. Data engineers use data warehouses, data lakes, and analytics tools to load, transform, clean, and aggregate data.

SQL

SQL AWS Data Lakes AI

Tackling AI’s data challenges with IBM databases on AWS

IBM Journey to AI blog

MARCH 14, 2024

Businesses face significant hurdles when preparing data for artificial intelligence (AI) applications. The existence of data silos and duplication, alongside apprehensions regarding data quality, presents a multifaceted environment for organizations to manage.

AWS

AWS Database ETL AI

AWS re:Invent 2024 Highlights: Top takeaways from Swami Sivasubramanian to help customers manage generative AI at scale

AWS Machine Learning Blog

DECEMBER 16, 2024

We spoke with Dr. Swami Sivasubramanian, Vice President of Data and AI, shortly after AWS re:Invent 2024 to hear his impressions and to get insights on how the latest AWS innovations help meet the real-world needs of customers as they build and scale transformative generative AI applications. Canva uses AWS to power 1.2

AWS

AWS AI AI Data Warehouse

Sneak peek at Microsoft Fabric price and its promising features

Dataconomy

JUNE 1, 2023

Unified data storage: Fabric’s centralized data lake, Microsoft OneLake, eliminates data silos and provides a unified storage system, simplifying data access and retrieval. OneLake is designed to store a single copy of data in a unified location, leveraging the open-source Apache Parquet format.

Power BI

Power BI Data Lakes Azure Data Silos

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning Blog

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Using our AI assistant built on Amazon Q, team members are saving hours of time each week. This time adds up individually, but also collectively at the team and organizational level.

AWS

AWS Database AI AI

Generative AI operating models in enterprise organizations with Amazon Bedrock

AWS Machine Learning Blog

JANUARY 29, 2025

Generative AI can revolutionize organizations by enabling the creation of innovative applications that offer enhanced customer and employee experiences. In this post, we evaluate different generative AI operating model architectures that could be adopted.

AWS

AWS AI AI Database

Data Lakes Vs. Data Warehouse: Its significance and relevance in the data world

Pickl AI

NOVEMBER 15, 2023

Discover the nuanced dissimilarities between Data Lakes and Data Warehouses. Data management in the digital age has become a crucial aspect of businesses, and two prominent concepts in this realm are Data Lakes and Data Warehouses. It acts as a repository for storing all the data.

Data Lakes

Data Lakes Data Warehouse Database ETL

How Twilio generated SQL using Looker Modeling Language data with Amazon Bedrock

AWS Machine Learning Blog

AUGUST 8, 2024

As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications. The following diagram illustrates the solution architecture.

SQL

SQL Data Lakes Data Analyst AWS

Generate financial industry-specific insights using generative AI and in-context fine-tuning

AWS Machine Learning Blog

NOVEMBER 12, 2024

The following question requires complex industry knowledge-based analysis of data from multiple columns in the ETF database. He is focused on Big Data, Data Lakes, Streaming and batch Analytics services and generative AI technologies. Varun Mehta is a Sr. Solutions Architect at AWS.

SQL

SQL AWS AI AI

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lakes

Data Lakes Data Models Data Modeling Data Warehouse

Unstructured data management and governance using AWS AI/ML and analytics services

Flipboard

OCTOBER 25, 2023

By some estimates, unstructured data can make up to 80–90% of all new enterprise data and is growing many times faster than structured data. After decades of digitizing everything in your enterprise, you may have an enormous amount of data, but with dormant value. These services write the output to a data lake.

AWS

AWS ML ML Analytics

Generating value from enterprise data: Best practices for Text2SQL and generative AI

AWS Machine Learning Blog

JANUARY 4, 2024

Generative AI has opened up a lot of potential in the field of AI. One such area that is evolving is using natural language processing (NLP) to unlock new opportunities for accessing data through intuitive SQL queries. The primary goal is to automatically generate SQL queries from natural language text.

SQL

SQL Database AI AI

Exploring Open-Source Innovations: 13 Companies Offering Cutting-Edge Solutions

ODSC - Open Data Science

MARCH 21, 2025

In today’s fast-paced, data-driven world, open-source solutions are transforming industries by providing flexible, scalable, and community-driven innovations. Whether you’re a data scientist, engineer, or AI researcher, tapping into open-source technologies can accelerate your work while fostering collaboration.

Data Scientist

Data Scientist Data Visualization Data Science Data Lakes

Drowning in Data? A Data Lake May Be Your Lifesaver

ODSC - Open Data Science

SEPTEMBER 29, 2023

Data management problems can also lead to data silos: disparate collections of databases that don’t communicate with each other, leading to flawed analysis based on incomplete or incorrect datasets. The data lake can then refine, enrich, index, and analyze that data. and various countries in Europe.

Data Lakes

Data Lakes Clustering Big Data Big Data

Retrieval-Augmented Generation with LangChain, Amazon SageMaker JumpStart, and MongoDB Atlas semantic search

Flipboard

NOVEMBER 17, 2023

Generative AI models have the potential to revolutionize enterprise operations, but businesses must carefully consider how to harness their power while overcoming challenges such as safeguarding data and ensuring the quality of AI-generated content. Set up the database access and network access.

K-nearest Neighbors

K-nearest Neighbors AWS Clustering Database

Data Engineering for IoT Applications: Unleashing the Power of the Internet of Things

Data Science Connect

JULY 28, 2023

Data Collection and Integration Data engineers are responsible for designing robust data collection systems that gather information from various IoT devices and sensors. This data is then integrated into centralized databases for further processing and analysis.

Internet of Things

Internet of Things Data Engineer Data Engineering Data Engineering

Simplifying Time Series Analysis for Data Scientists

ODSC - Open Data Science

SEPTEMBER 12, 2023

Be sure to check out his talk, “What is a Time-series Database and Why do I Need One?” at ODSC West 2023. Most data scientists are familiar with the concept of time series data and work with it often. The time series database (TSDB), however, is still an underutilized tool in the data science community.

Data Scientist

Data Scientist Database Data Lakes Data Science

Vitech uses Amazon Bedrock to revolutionize information access with AI-powered chatbot

AWS Machine Learning Blog

MAY 30, 2024

Amazon Bedrock is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so one can choose from a wide range of FMs to find the model that is best suited for their use case. Prompt engineering Prompt engineering is crucial for the knowledge retrieval system.

AI

AI AI AWS Database

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Flipboard

DECEMBER 18, 2023

Amazon Redshift powers data-driven decisions for tens of thousands of customers every day with a fully managed, AI-powered cloud data warehouse, delivering the best price-performance for your analytics workloads. Learn more about the AWS zero-ETL future with newly launched AWS databases integrations with Amazon Redshift.

AWS

AWS Data Warehouse ETL SQL

10 Things AWS Can Do for Your SaaS Company

Smart Data Collective

FEBRUARY 20, 2022

Data storage databases. Your SaaS company can store and protect any amount of data using Amazon Simple Storage Service (S3), which is ideal for data lakes, cloud-native applications, and mobile apps. Artificial intelligence (AI). Well, let’s find out. Messages and notification. Cost-effective.

AWS

AWS Cloud Computing Data Lakes Database

Why companies need to accelerate data warehousing solution modernization

IBM Journey to AI blog

APRIL 24, 2023

Data warehouse is the base architecture for artificial intelligence and machine learning (AI/ML) solutions as well. Benefits of new data warehousing technology Everything is data, regardless of whether it’s structured, semi-structured, or unstructured.

Data Warehouse

Data Warehouse Data Lakes Database Big Data

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

Flipboard

NOVEMBER 24, 2023

Data scientists and ML engineers require capable tooling and sufficient compute for their work. To pave the way for the growth of AI, BMW Group needed to make a leap regarding scalability and elasticity while reducing operational overhead, software licensing, and hardware management.

ML

ML ML AWS AI

Open Data Lakes, Safeguarding Images From AI, Free Data Viz Tools, and 50% Off ODSC East

ODSC - Open Data Science

FEBRUARY 15, 2024

The Future of the Single Source of Truth is an Open Data Lake Organizations that strive for high-performance data systems are increasingly turning towards the ELT (Extract, Load, Transform) model using an open data lake. To DIY you need to: host an API, build a UI, and run or rent a database.

Data Lakes

Data Lakes Data Visualization Machine Learning Machine Learning

Improving air quality with generative AI

AWS Machine Learning Blog

JUNE 18, 2024

This post presents a solution that uses generative artificial intelligence (AI) to standardize air quality data from low-cost sensors in Africa, specifically addressing the air quality data integration problem of low-cost sensors. This allows for data to be aggregated for further manufacturer-agnostic analysis.

AWS

AWS AI AI Python

FMOps/LLMOps: Operationalize generative AI and differences with MLOps

AWS Machine Learning Blog

SEPTEMBER 1, 2023

Nowadays, the majority of our customers are excited about large language models (LLMs) and thinking about how generative AI could transform their business. In this post, we discuss how to operationalize generative AI applications using MLOps principles leading to foundation model operations (FMOps).

AI

AI AI ML ML

eSentire delivers private and secure generative AI interactions to customers with Amazon SageMaker

AWS Machine Learning Blog

JUNE 21, 2024

To accomplish this, eSentire built AI Investigator, a natural language query tool for their customers to access security platform data by using AWS generative artificial intelligence (AI) capabilities. This helps customers quickly and seamlessly explore their security data and accelerate internal investigations.

AWS

AWS AI AI Natural Language Processing

Best 8 Data Version Control Tools for Machine Learning 2024

DagsHub

DECEMBER 11, 2023

Adding new data to the storage requires pulling the existing data, then calculating the new hash before pushing back the whole data. DVC lacks crucial relational database features, making it an unsuitable choice for those familiar with relational databases. So, Dolt’s integration with Git makes it easier to learn.

Machine Learning

Machine Learning Machine Learning Data Lakes Database

Reinventing the data experience: Use generative AI and modern data architecture to unlock insights

AWS Machine Learning Blog

JUNE 13, 2023

Organizations can maximize the value of their modern data architecture with generative AI solutions while innovating continuously. The natural language capabilities allow non-technical users to query data through conversational English rather than complex SQL. The following diagram illustrates the architecture.

Database

Database SQL AWS AI

How Q4 Inc. used Amazon Bedrock, RAG, and SQLDatabaseChain to address numerical and structured dataset challenges building their Q&A chatbot

Flipboard

DECEMBER 6, 2023

These datasets are often a mix of numerical and text data, at times structured, unstructured, or semi-structured. Q4 Inc. needed to address some of these challenges in one of their many AI use cases built on AWS. The generated query is then run against the database to fetch the relevant context.

SQL

SQL Database AWS Machine Learning

Imperva optimizes SQL generation from natural language using Amazon Bedrock

AWS Machine Learning Blog

JUNE 20, 2024

Our goal was to improve the user experience of an existing application used to explore the counters and insights data. The data is stored in a data lake and retrieved by SQL using Amazon Athena. The problem Making data accessible to users through applications has always been a challenge.

SQL

SQL Database AWS Machine Learning

Big data

Dataconomy

FEBRUARY 25, 2025

This characteristic reflects the growing sources and types of data collected over time. Variety Variety delineates the different data types involved, encompassing structured data like databases, unstructured data such as text and multimedia content, and semi-structured data found in logs and sensor data.

Big Data

Big Data Big Data Data Lakes Machine Learning

5 Fast-Growing Data Management Trends in 2023

ODSC - Open Data Science

MAY 16, 2023

Comprehensive data privacy laws in at least four states are going into effect this year, and more than a dozen states have similar legislation in the works. Database management may become increasingly complex as organizations must account for more of these laws.

Database

Database Data Science Data Lakes Data Observability

Unlock the power of data governance and no-code machine learning with Amazon SageMaker Canvas and Amazon DataZone

AWS Machine Learning Blog

AUGUST 21, 2024

Amazon DataZone is a data management service that makes it quick and convenient to catalog, discover, share, and govern data stored in AWS, on-premises, and third-party sources. The sample dataset Upload the dataset to Amazon S3 and crawl the data to create an AWS Glue database and tables.

Machine Learning

Machine Learning Machine Learning Data Governance ML

Your guide to generative AI and ML at AWS re:Invent 2023

AWS Machine Learning Blog

NOVEMBER 22, 2023

Now all you need is some guidance on generative AI and machine learning (ML) sessions to attend at this twelfth edition of re:Invent. And although generative AI has appeared in previous events, this year we’re taking it to the next level. And although our track focuses on generative AI, many other tracks have related sessions.

AWS

AWS ML ML AI

How the Masters uses watsonx to manage its AI lifecycle

IBM Journey to AI blog

APRIL 9, 2024

Through a partnership spanning more than 25 years, IBM has helped the Augusta National Golf Club capture, analyze, distribute and use data to bring fans closer to the action, culminating in the AI-powered Masters digital experience and mobile app.

AI

AI AI Machine Learning Machine Learning

AWS empowers sales teams using generative AI solution built on Amazon Bedrock

AWS Machine Learning Blog

AUGUST 26, 2024

At AWS, we are transforming our seller and customer journeys by using generative artificial intelligence (AI) across the sales lifecycle. Prospecting, opportunity progression, and customer engagement present exciting opportunities to utilize generative AI, using historical data, to drive efficiency and effectiveness.

AWS

AWS AI AI K-nearest Neighbors

How OLAP and AI can enable better business

IBM Journey to AI blog

DECEMBER 7, 2023

Online analytical processing (OLAP) database systems and artificial intelligence (AI) complement each other and can help enhance data analysis and decision-making when used in tandem. As AI techniques continue to evolve, innovative applications in the OLAP domain are anticipated.

Data Preparation

Data Preparation Database Data Analysis Data Analysis

What is Data Pipeline? A Detailed Explanation

Smart Data Collective

OCTOBER 17, 2022

A point of data entry in a given pipeline. Examples of an origin include storage systems like data lakes, data warehouses and data sources that include IoT devices, transaction processing applications, APIs or social media. The final point to which the data has to be eventually transferred is a destination.

Data Pipeline

Data Pipeline Data Warehouse ETL Exploratory Data Analysis

11 Open Source Data Exploration Tools You Need to Know in 2023

ODSC - Open Data Science

FEBRUARY 24, 2023

There are many well-known libraries and platforms for data analysis such as Pandas and Tableau, in addition to analytical databases like ClickHouse, MariaDB, Apache Druid, Apache Pinot, Google BigQuery, Amazon Redshift, etc. With Great Expectations, data teams can express what they “expect” from their data using simple assertions.

Exploratory Data Analysis

Exploratory Data Analysis Data Visualization Data Analysis Data Analysis
