AI and Data Lakes - Data Science Current

Streaming Machine Learning Without a Data Lake

ODSC - Open Data Science

MAY 31, 2023

Be sure to check out his talk, “Apache Kafka for Real-Time Machine Learning Without a Data Lake,” there! The combination of data streaming and machine learning (ML) enables you to build one scalable, reliable, but also simple infrastructure for all machine learning tasks using the Apache Kafka ecosystem.

Data Lakes

Data Lakes Machine Learning Machine Learning Apache Kafka

Governing the ML lifecycle at scale, Part 3: Setting up data governance at scale

Flipboard

NOVEMBER 22, 2024

For example, in the bank marketing use case, the management account would be responsible for setting up the organizational structure for the bank’s data and analytics teams, provisioning separate accounts for data governance, data lakes, and data science teams, and maintaining compliance with relevant financial regulations.

Data Governance

Data Governance ML ML Data Lakes

Dremio Revolutionizes Lakehouse Analytics with Breakthrough Autonomous Performance Enhancements

insideBIGDATA

AUGUST 28, 2024

Dremio, the unified lakehouse platform for self-service analytics and AI, announced a breakthrough in data lake analytics performance capabilities, extending its leadership in self-optimizing, autonomous Iceberg data management.

Analytics

Analytics Analytics Data Lakes AI

Data Integrity for AI: What’s Old is New Again

Precisely

JANUARY 9, 2025

Artificial Intelligence (AI) is all the rage, and rightly so. By now most of us have experienced how Gen AI and the LLMs (large language models) that fuel it are primed to transform the way we create, research, collaborate, engage, and much more. Can AI’s responses be trusted? A data lake! Can it do it without bias?

Data Warehouse

Data Warehouse Hadoop Data Lakes Data Governance

Best Practices for Data Lake Security

ODSC - Open Data Science

JUNE 22, 2023

While databases were the traditional way to store large amounts of data, a new storage method has developed that can store even more significant and varied amounts of data. These are called data lakes. What Are Data Lakes? In many cases, this could mean using multiple security programs and platforms.

Data Lakes

Data Lakes Data Warehouse Database Data Science

KDnuggets News, January 18: 7 Best Platforms to Practice SQL • Explainable AI: 10 Python Libraries for Demystifying Your Model’s Decisions

KDnuggets

JANUARY 18, 2023

7 Best Platforms to Practice SQL • Explainable AI: 10 Python Libraries for Demystifying Your Model's Decisions • ChatGPT: Everything You Need to Know • Data Lakes and SQL: A Match Made in Data Heaven • Google Data Analytics Certification Review for 2023

SQL

SQL Data Lakes Python AI

Choosing a Data Lake Format: What to Actually Look For

ODSC - Open Data Science

AUGUST 15, 2023

Recently we’ve seen lots of posts about a variety of different file formats for data lakes. There’s Delta Lake, Hudi, Iceberg, and QBeast, to name a few. It can be tough to keep track of all these data lake formats — let alone figure out why (or if!) And I’m curious to see if you’ll agree.

Data Lakes

Data Lakes ETL Data Science Algorithm

Data Version Control for Data Lakes: Handling the Changes in Large Scale

ODSC - Open Data Science

SEPTEMBER 27, 2023

In the ever-evolving world of big data, managing vast amounts of information efficiently has become a critical challenge for businesses across the globe. As data lakes gain prominence as a preferred solution for storing and processing enormous datasets, the need for effective data version control mechanisms becomes increasingly evident.

Data Lakes

Data Lakes Data Warehouse Database Big Data

MinIO announces support for NVIDIA AI ecosystem with AIStor updates

Dataconomy

MARCH 17, 2025

MinIO, a provider of high-performance object storage for AI, announced several upcoming enhancements to its AIStor product at NVIDIA GTC. These updates are designed to deepen MinIO’s support for the NVIDIA AI ecosystem and improve the efficiency and utilization of AI infrastructure. It will increase CPU efficiency.

Data Lakes

Data Lakes AI AI ML

Interview – Business Intelligence and Process Mining Without Vendor Lock-in!

Data Science Blog

FEBRUARY 7, 2023

A lot is also happening in process mining right now: machine learning is making its way into process mining, processes can be analyzed at ever finer granularity, unstructured data can be included in the analysis with the help of AI, and so on. What is currently becoming a trend is building a data lakehouse.

Business Intelligence

Business Intelligence Business Intelligence Data Warehouse Data Lakes

How to Ensure Your New Cloud Data Lake Is Secure

Dataversity

MARCH 24, 2021

Enterprises migrating on-prem data environments to the cloud in pursuit of more robust, flexible, and integrated analytics and AI/ML capabilities are fueling a surge in cloud data lake implementations. The post How to Ensure Your New Cloud Data Lake Is Secure appeared first on DATAVERSITY.

Data Lakes

Data Lakes Cloud Data ML ML

A Bridge Between Data Lakes and Data Warehouses

Dataversity

JANUARY 28, 2021

It has been ten years since Pentaho Chief Technology Officer James Dixon coined the term “data lake.” While data warehouse (DWH) systems have had longer existence and recognition, the data industry has embraced the more […]. The post A Bridge Between Data Lakes and Data Warehouses appeared first on DATAVERSITY.

Data Lakes

Data Lakes Data Warehouse Data Quality Data Governance

Sneak peek at Microsoft Fabric price and its promising features

Dataconomy

JUNE 1, 2023

Unified data storage: Fabric’s centralized data lake, Microsoft OneLake, eliminates data silos and provides a unified storage system, simplifying data access and retrieval. OneLake is designed to store a single copy of data in a unified location, leveraging the open-source Apache Parquet format.

Power BI

Power BI Data Lakes Azure Data Silos

Data Swamp, Data Lake, Data Lakehouse: What to Know

Alation

OCTOBER 21, 2021

Data Swamp vs Data Lake. When you imagine a lake, it’s likely an idyllic image of a tree-ringed body of reflective water amid singing birds and dabbling ducks. I’ll take the lake, thank you very much. Many organizations have built a data lake to solve their data storage, access, and utilization challenges.

Data Lakes

Data Lakes Data Governance Data Warehouse Business Intelligence

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Flipboard

DECEMBER 11, 2024

Many of these applications are complex to build because they require collaboration across teams and the integration of data, tools, and services. Data engineers use data warehouses, data lakes, and analytics tools to load, transform, clean, and aggregate data.

SQL

SQL AWS Data Lakes AI

8 Data Lake Vendors to Make Your Data Life Easier in 2023

ODSC - Open Data Science

JUNE 7, 2023

To make your data management processes easier, here’s a primer on data lakes, and our picks for a few data lake vendors worth considering. What is a data lake? First, a data lake is a centralized repository that allows users or an organization to store and analyze large volumes of data.

Data Lakes

Data Lakes Azure Data Warehouse Hadoop

What Are the Best Data Modeling Methodologies & Processes for My Data Lake?

phData

SEPTEMBER 19, 2023

With the amount of data companies are using growing to unprecedented levels, organizations are grappling with the challenge of efficiently managing and deriving insights from these vast volumes of structured and unstructured data. What is a Data Lake? Consistency of data throughout the data lake.

Data Lakes

Data Lakes Data Models Data Modeling Data Warehouse

Data Lakes Vs. Data Warehouse: Its significance and relevance in the data world

Pickl AI

NOVEMBER 15, 2023

Discover the nuanced dissimilarities between Data Lakes and Data Warehouses. Data management in the digital age has become a crucial aspect of businesses, and two prominent concepts in this realm are Data Lakes and Data Warehouses. It acts as a repository for storing all the data.

Data Lakes

Data Lakes Data Warehouse Database ETL

Data Science & Analytics Industry Main Developments in 2021 and Key Trends for 2022

KDnuggets

DECEMBER 14, 2021

We have solicited insights from experts at industry-leading companies, asking: "What were the main AI, Data Science, Machine Learning Developments in 2021 and what key trends do you expect in 2022?" Read their opinions here.

Data Science

Data Science Machine Learning Machine Learning Analytics

Maximize the ROI of Your Enterprise Data Lake

Dataversity

OCTOBER 14, 2022

The data being talked about is useful for businesses to draw insights, formulate strategies, and understand trends and customer behavior, among others. […]. The post Maximize the ROI of Your Enterprise Data Lake appeared first on DATAVERSITY.

Data Lakes

Data Lakes Analytics Analytics Artificial Intelligence

Drowning in Data? A Data Lake May Be Your Lifesaver

ODSC - Open Data Science

SEPTEMBER 29, 2023

Data management problems can also lead to data silos; disparate collections of databases that don’t communicate with each other, leading to flawed analysis based on incomplete or incorrect datasets. One way to address this is to implement a data lake: a large and complex database of diverse datasets all stored in their original format.

Data Lakes

Data Lakes Clustering Big Data Big Data

Real-Time ML with Spark and SBERT, AI Coding Assistants, Data Lake Vendors, and ODSC East…

ODSC - Open Data Science

JUNE 1, 2023

Real-Time ML with Spark and SBERT, AI Coding Assistants, Data Lake Vendors, and ODSC East Highlights Getting Up to Speed on Real-Time Machine Learning with Spark and SBERT Learn more about real-time machine learning by using this approach that uses Apache Spark and SBERT. Is an AI Coding Assistant Right For You?

Data Lakes

Data Lakes ML ML Citizen Data Scientist

Generative AI operating models in enterprise organizations with Amazon Bedrock

AWS Machine Learning Blog

JANUARY 29, 2025

Generative AI can revolutionize organizations by enabling the creation of innovative applications that offer enhanced customer and employee experiences. In this post, we evaluate different generative AI operating model architectures that could be adopted.

AWS

AWS AI AI Database

How Twilio generated SQL using Looker Modeling Language data with Amazon Bedrock

AWS Machine Learning Blog

AUGUST 8, 2024

As one of the largest AWS customers, Twilio engages with data, artificial intelligence (AI), and machine learning (ML) services to run their daily workloads. Data is the foundational layer for all generative AI and ML applications. The following diagram illustrates the solution architecture.

SQL

SQL Data Lakes Data Analyst AWS

Learn AI Together — Towards AI Community Newsletter #18

Towards AI

MARCH 28, 2024

Author(s): Towards AI Editorial Team Originally published on Towards AI. Good morning, AI enthusiasts! This week, I’m super excited to announce that we are finally releasing our book, ‘Building AI for Production; Enhancing LLM Abilities and Reliability with Fine-Tuning and RAG,’ where we gathered all our learnings.

Data Lakes

Data Lakes AI AI Azure

Generate financial industry-specific insights using generative AI and in-context fine-tuning

AWS Machine Learning Blog

NOVEMBER 12, 2024

He is focused on Big Data, Data Lakes, Streaming and batch Analytics services and generative AI technologies. He works with strategic customers who are using AI/ML to solve complex business problems. Varun Mehta is a Sr. Solutions Architect at AWS. Outside of work, he loves to spend time with his wife and kids

SQL

SQL AWS AI AI

How Databricks and Tableau customers are fueling innovation with data lakehouse architecture

Tableau

JUNE 8, 2021

Domain experts, for example, feel they are still overly reliant on core IT to access the data assets they need to make effective business decisions. In all of these conversations there is a sense of inertia: Data warehouses and data lakes feel cumbersome and data pipelines just aren't agile enough.

Tableau

Tableau Data Lakes Data Warehouse SQL

Visualization for Clustering Methods, Gen AI & the Law, and Examples of Domain-Specific LLMs

ODSC - Open Data Science

AUGUST 31, 2023

Visualization for Clustering Methods Clustering methods are a big part of data science, and here’s a primer on how you can visualize them. Lemley on Generative AI and the Law Here’s what Mark A. Lemley, law Professor at Stanford, thinks about legal issues that arise from generative AI, the memorization problem, and more.

Clustering

Clustering Data Lakes Data Science Artificial Intelligence

Integrate foundation models into your code with Amazon Bedrock

AWS Machine Learning Blog

NOVEMBER 6, 2024

The rise of large language models (LLMs) and foundation models (FMs) has revolutionized the field of natural language processing (NLP) and artificial intelligence (AI). These powerful models, trained on vast amounts of data, can generate human-like text, answer questions, and even engage in creative writing tasks.

AWS

AWS Python Machine Learning Machine Learning

Unstructured data management and governance using AWS AI/ML and analytics services

Flipboard

OCTOBER 25, 2023

By some estimates, unstructured data can make up to 80–90% of all new enterprise data and is growing many times faster than structured data. After decades of digitizing everything in your enterprise, you may have an enormous amount of data, but with dormant value. These services write the output to a data lake.

AWS

AWS ML ML Analytics

4 ways generative AI addresses manufacturing challenges

IBM Journey to AI blog

APRIL 15, 2024

An inaccurate AI prediction in a marketing campaign is a minor nuisance, but an inaccurate AI prediction on a manufacturing shopfloor can be fatal. Or we create a data lake, which quickly degenerates to a data swamp. Summarization Summarization remains the top use case for generative AI (gen AI) technology.

AI

AI AI Data Lakes Analytics

Building a Business with a Real-Time Analytics Stack, Streaming ML Without a Data Lake, and…

ODSC - Open Data Science

MAY 24, 2023

Building a Business with a Real-Time Analytics Stack, Streaming ML Without a Data Lake, and Google’s PaLM 2 Building a Pizza Delivery Service with a Real-Time Analytics Stack The best businesses react quickly and with informed decisions. Here’s a use case of how you can use a real-time analytics stack to build a pizza delivery service.

Data Lakes

Data Lakes ML ML Analytics

Achieve your AI goals with an open data lakehouse approach

IBM Journey to AI blog

OCTOBER 4, 2023

Artificial intelligence (AI) is now at the forefront of how enterprises work with data to help reinvent operations, improve customer experiences, and maintain a competitive advantage. It’s no longer a nice-to-have, but an integral part of a successful data strategy. Why does AI need an open data lakehouse architecture?

Data Lakes

Data Lakes Data Warehouse AI AI

Data Engineering for IoT Applications: Unleashing the Power of the Internet of Things

Data Science Connect

JULY 28, 2023

These platforms provide data engineers with the flexibility to develop and deploy IoT applications efficiently. Data Lakes for Centralized Storage Data lakes serve as centralized repositories for storing raw and processed IoT data.

Internet of Things

Internet of Things Data Engineering Data Engineer Data Engineering

How AWS sales uses Amazon Q Business for customer engagement

AWS Machine Learning Blog

DECEMBER 11, 2024

Earlier this year, we published the first in a series of posts about how AWS is transforming our seller and customer journeys using generative AI. Using our AI assistant built on Amazon Q, team members are saving hours of time each week. This time adds up individually, but also collectively at the team and organizational level.

AWS

AWS Database AI AI

Introducing the technology behind watsonx.ai, IBM’s AI and data platform for enterprise

IBM Journey to AI blog

MAY 9, 2023

We stand on the frontier of an AI revolution. Over the past decade, deep learning arose from a seismic collision of data availability and sheer compute power, enabling a host of impressive AI capabilities. It sounds like a joke, but it’s not, as anyone who has tried to solve business problems with AI may know.

AI

AI AI Data Quality Data Lakes

Infor launches AI-powered revenue management solution for hospitality sector

Dataconomy

FEBRUARY 14, 2025

AI-driven revenue optimization The new system enables hoteliers to manage pricing dynamically, making data-driven adjustments across rooms, event spaces, and F&B outlets. The AI-powered automation provides forecasting, strategic planning assistance, and customizable rate management to improve overall profitability.

Data Lakes

Data Lakes AI AI Deep Learning

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 20, 2023

Recent developments in generative AI models have further sped up the need of ML adoption across industries. However, implementing security, data privacy, and governance controls are still key challenges faced by customers when implementing ML workloads at scale.

ML

ML ML AWS Data Lakes

Exploring Open-Source Innovations: 13 Companies Offering Cutting-Edge Solutions

ODSC - Open Data Science

MARCH 21, 2025

In today’s fast-paced data-driven world, open-source solutions are transforming industries by providing flexible, scalable, and community-driven innovations. Whether you’re a data scientist, engineer, or AI researcher, tapping into open-source technologies can accelerate your work while fostering collaboration.

Data Scientist

Data Scientist Data Visualization Data Science Data Lakes

eSentire delivers private and secure generative AI interactions to customers with Amazon SageMaker

AWS Machine Learning Blog

JUNE 21, 2024

To accomplish this, eSentire built AI Investigator, a natural language query tool for their customers to access security platform data by using AWS generative artificial intelligence (AI) capabilities. This helps customers quickly and seamlessly explore their security data and accelerate internal investigations.

AWS

AWS AI AI Natural Language Processing

Is AI creative: Answering the unanswerable

Dataconomy

FEBRUARY 20, 2024

Every tech evangelist and their grandma was gushing about generative AI, hailing it as the dawn of a new creative epoch. The “wow” factor has waned, replaced by a nagging question: Is AI creative? Is AI creative? As AI matures, its ability to process information, adapt, and learn will exponentially increase.

AI

AI AI Algorithm Data Lakes

Improving air quality with generative AI

AWS Machine Learning Blog

JUNE 18, 2024

This post presents a solution that uses generative artificial intelligence (AI) to standardize air quality data from low-cost sensors in Africa, specifically addressing the air quality data integration problem of low-cost sensors. Qiong (Jo) Zhang, PhD, is a Senior Partner Solutions Architect at AWS, specializing in AI/ML.

AWS

AWS Python AI AI

Accelerating AI/ML development at BMW Group with Amazon SageMaker Studio

Flipboard

NOVEMBER 24, 2023

Data scientists and ML engineers require capable tooling and sufficient compute for their work. To pave the way for the growth of AI, BMW Group needed to make a leap regarding scalability and elasticity while reducing operational overhead, software licensing, and hardware management.

ML

ML ML AWS AI

MAS AI/ML Modernization Accelerator: Air Compressor Use Case

IBM Data Science in Practice

JANUARY 9, 2024

One groundbreaking technology that has emerged as a game-changer is asset performance management (APM) artificial intelligence (AI). However, embarking on the journey of implementing artificial intelligence (AI) in your asset performance management strategy can be both exciting and daunting.

ML

ML ML AI AI
