Artificial Intelligence, Data Pipeline and Data Warehouse

Securing the data pipeline, from blockchain to AI

Dataconomy

OCTOBER 8, 2024

Generative artificial intelligence is the talk of the town in the technology world today. These challenges are primarily due to how data is collected, stored, moved and analyzed. With most AI models, their training data will come from hundreds of different sources, any one of which could present problems.

Data Pipeline

Data Pipeline AI AI Data Warehouse

Building Robust Data Pipelines: 9 Fundamentals and Best Practices to Follow

Alation

MAY 16, 2023

But with the sheer amount of data continually increasing, how can a business make sense of it? Robust data pipelines. What is a Data Pipeline? A data pipeline is a series of processing steps that move data from its source to its destination. The answer?

Data Pipeline

Data Pipeline Data Governance Data Lakes Data Warehouse

Supercharge your data strategy: Integrate and innovate today leveraging data integration

IBM Journey to AI blog

OCTOBER 22, 2024

Data is the differentiator as business leaders look to utilize their competitive edge as they implement generative AI (gen AI). Leaders feel the pressure to infuse their processes with artificial intelligence (AI) and are looking for ways to harness the insights in their data platforms to fuel this movement.

Data Silos

Data Silos Data Pipeline DataOps Business Intelligence

Webinars

How to Achieve High-Accuracy Results When Using LLMs

Maximizing Profit and Productivity: The New Era of AI-Powered Accounting

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

Building Robust Data Pipelines: 9 Fundamentals and Best Practices to Follow

Alation

MAY 16, 2023

But with the sheer amount of data continually increasing, how can a business make sense of it? Robust data pipelines. What is a Data Pipeline? A data pipeline is a series of processing steps that move data from its source to its destination. The answer?

Data Pipeline

Data Pipeline Data Governance Data Lakes Data Warehouse

How to use foundation models and trusted governance to manage AI workflow risk

IBM Journey to AI blog

OCTOBER 16, 2023

Artificial intelligence (AI) adoption is still in its early stages. The Stanford Institute for Human-Centered Artificial Intelligence’s Center for Research on Foundation Models (CRFM) recently outlined the many risks of foundation models, as well as opportunities. Trustworthiness is critical.

AI

AI AI Data Warehouse ML

A Primer to Scaling Pandas

ODSC - Open Data Science

AUGUST 23, 2023

Run pandas at scale on your data warehouse Most enterprise data teams store their data in a database or data warehouse, such as Snowflake, BigQuery, or DuckDB. Ponder solves this problem by translating your pandas code to SQL that can be understood by your data warehouse.

Data Warehouse

Data Warehouse Data Science Database SQL

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is known to have benefits in handling data due to its robustness, speed, and scalability. A typical modern data stack consists of the following: A data warehouse.

Data Warehouse

Data Warehouse ETL Tableau Cloud Data

What Does a Data Engineering Job Involve in 2024?

ODSC - Open Data Science

JANUARY 30, 2024

So let’s do a quick overview of the job of data engineer, and maybe you might find a new interest. Building and maintaining data pipelines Data integration is the process of combining data from multiple sources into a single, consistent view. Think of data engineers as the architects of the data ecosystem.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Data science vs data analytics: Unpacking the differences

IBM Journey to AI blog

SEPTEMBER 19, 2023

Overview: Data science vs data analytics Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models and develop artificial intelligence (AI) applications.

Data Science

Data Science Analytics Analytics Data Scientist

Maximising Efficiency with ETL Data: Future Trends and Best Practices

Pickl AI

OCTOBER 17, 2024

Introduction ETL plays a crucial role in Data Management. This process enables organisations to gather data from various sources, transform it into a usable format, and load it into data warehouses or databases for analysis. Loading The transformed data is loaded into the target destination, such as a data warehouse.

ETL

ETL Data Warehouse Data Quality Data Governance

11 Open Source Data Exploration Tools You Need to Know in 2023

ODSC - Open Data Science

FEBRUARY 24, 2023

Great Expectations provides support for different data backends such as flat file formats, SQL databases, Pandas dataframes and Sparks, and comes with built-in notification and data documentation functionality. You can watch it on demand here.

Exploratory Data Analysis

Exploratory Data Analysis Data Visualization Data Analysis Data Analysis

The Data Dilemma: Exploring the Key Differences Between Data Science and Data Engineering

Pickl AI

JULY 25, 2023

Data engineers are essential professionals responsible for designing, constructing, and maintaining an organization’s data infrastructure. They create data pipelines, ETL processes, and databases to facilitate smooth data flow and storage. Read more to know. Cloud Platforms: AWS, Azure, Google Cloud, etc.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

It is supported by querying, governance, and open data formats to access and share data across the hybrid cloud. Through workload optimization across multiple query engines and storage tiers, organizations can reduce data warehouse costs by up to 50 percent.

AI

AI AI Machine Learning Machine Learning

How data stores and governance impact your AI initiatives

IBM Journey to AI blog

OCTOBER 12, 2023

Securing AI models and their access to data While AI models need flexibility to access data across a hybrid infrastructure, they also need safeguarding from tampering (unintentional or otherwise) and, especially, protected access to data.

AI

AI AI Data Scientist Data Governance

ETL Process Explained: Essential Steps for Effective Data Management

Pickl AI

OCTOBER 17, 2024

It is a data integration process that involves extracting data from various sources, transforming it into a suitable format, and loading it into a target system, typically a data warehouse. ETL is the backbone of effective data management, ensuring organisations can leverage their data for informed decision-making.

ETL

ETL Data Warehouse SQL Data Quality

AI-Powered ETL Pipeline Orchestration: Multi-Agent Systems in the Era of Generative AI

ODSC - Open Data Science

FEBRUARY 19, 2025

The stepsinclude: Extraction : Data is collected from multiple sources (databases, APIs, flatfiles). Transformation : Data is cleaned, formatted, and enriched. Loading : The processed data is stored in data warehouses or datalakes. Data privacy & ethics: AI-driven ETL must adhere to governance frameworks.

ETL

ETL AI AI Data Warehouse

11 Open-Source Data Engineering Tools Every Pro Should Use

ODSC - Open Data Science

FEBRUARY 6, 2024

This open-source streaming platform enables the handling of high-throughput data feeds, ensuring that data pipelines are efficient, reliable, and capable of handling massive volumes of data in real-time. Its open-source nature means it’s continually evolving, thanks to contributions from its user community.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

What is Data Ingestion? Understanding the Basics

Pickl AI

JULY 25, 2024

In this blog, we’ll delve into the intricacies of data ingestion, exploring its challenges, best practices, and the tools that can help you harness the full potential of your data. Batch Processing In this method, data is collected over a period and then processed in groups or batches.

Apache Kafka

Apache Kafka Data Lakes Data Warehouse Data Quality

How Reveal’s Logikcull used Amazon Comprehend to detect and redact PII from legal documents at scale

AWS Machine Learning Blog

NOVEMBER 1, 2023

Jeff Newburn is a Senior Software Engineering Manager leading the Data Engineering team at Logikcull – A Reveal Technology. He oversees the company’s data initiatives, including data warehouses, visualizations, analytics, and machine learning.

AWS

AWS Machine Learning Machine Learning ML

The Data Integration Solution Checklist: Top 10 Considerations

Precisely

MAY 13, 2024

If you’re in the market for a data integration solution, there are many things to consider – including the flexibility of integration solutions, the availability of a strong network of service providers, and the vendor’s reputation for thought leadership in the integration space.

Data Governance

Data Governance Data Pipeline Cloud Data Data Quality

How Investment Banks and Asset Managers Should Be Leveraging Data in Snowflake

phData

APRIL 18, 2023

Faced with these challenges, asset servicers have acquired numerous technologies over time to meet their risk management, fund analytics, and settlement needs, leading to data fragmentation and inheriting complex data flows. Data movements lead to high costs of ETL and rising data management TCO.

Data Silos

Data Silos ETL Clustering Analytics

How to Use Fivetran to Ingest Salesforce Data into Snowflake

phData

SEPTEMBER 25, 2024

In this blog, we will provide a comprehensive overview of ETL considerations, introduce key tools such as Fivetran, Salesforce, and Snowflake AI Data Cloud , and demonstrate how to set up a pipeline and ingest data between Salesforce and Snowflake using Fivetran. It can onboard chunks of data from different systems into one.

ETL

ETL Database Data Warehouse Analytics

How to Shift from Data Science to Data Engineering

ODSC - Open Data Science

JANUARY 18, 2024

This individual is responsible for building and maintaining the infrastructure that stores and processes data; the kinds of data can be diverse, but most commonly it will be structured and unstructured data. They’ll also work with software engineers to ensure that the data infrastructure is scalable and reliable.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Data Analytics in the Age of AI, When to Use RAG, Examples of Data Visualization with D3 and Vega…

ODSC - Open Data Science

APRIL 4, 2024

Find out how to weave data reliability and quality checks into the execution of your data pipelines and more. Dagster+ Launch April 12th, NYC and San Francisco Join us for the unveiling of Dagster+, the next generation of Dagster Cloud.

Data Visualization

Data Visualization Analytics Analytics Big Data Analytics

Announcing the First Speakers for the 2024 Data Engineering Summit

ODSC - Open Data Science

FEBRUARY 15, 2024

These systems represent data as knowledge graphs and implement graph traversal algorithms to help find content in massive datasets. These systems are not only useful for a wide range of industries, they are fun for data engineers to work on.

Data Engineering

Data Engineering Data Engineering Data Engineer Data Engineering

Guide to Digital Transformation: Data-first Architecture

Dataversity

APRIL 30, 2021

The goal of digital transformation remains the same as ever – to become more data-driven. We have learned how to gain a competitive advantage by capturing business events in data. Events are data snap-shots of complex activity sourced from the web, customer systems, ERP transactions, social media, […].

Data Pipeline

Data Pipeline Data Warehouse Data Governance Data Quality

Choosing the Right ETL Platform: Benefits for Data Integration

Pickl AI

OCTOBER 15, 2024

ETL (Extract, Transform, Load) is a core process in data integration that involves extracting data from various sources, transforming it into a usable format, and loading it into a target system, such as a data warehouse. Apache Nifi Apache Nifi is an open-source ETL tool that automates data flow between systems.

ETL

ETL Azure AWS Data Governance

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snorkel AI

MAY 26, 2023

[link] Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022.

SQL

SQL ML ML Python

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snorkel AI

MAY 26, 2023

[link] Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022.

SQL

SQL ML ML Python

Your Complete Roadmap to Become an Azure Data Scientist

Pickl AI

SEPTEMBER 5, 2024

The platform’s integration with Azure services ensures a scalable and secure environment for Data Science projects. Azure Synapse Analytics Previously known as Azure SQL Data Warehouse , Azure Synapse Analytics offers a limitless analytics service that combines big data and data warehousing.

Azure

Azure Data Scientist Data Science Machine Learning

Future trends in ETL

Dataconomy

FEBRUARY 12, 2024

ELT advocates for loading raw data directly into storage systems, often cloud-based, before transforming it as necessary. This shift leverages the capabilities of modern data warehouses, enabling faster data ingestion and reducing the complexities associated with traditional transformation-heavy ETL processes.

ETL

ETL Data Governance Machine Learning Machine Learning

Shaping the future: OMRON’s data-driven journey with AWS

AWS Machine Learning Blog

APRIL 3, 2025

OMRONs data strategyrepresented on ODAPalso allowed the organization to unlock generative AI use cases focused on tangible business outcomes and enhanced productivity. When needed, the system can access an ODAP data warehouse to retrieve additional information.

AWS

AWS Data Governance Data Silos SQL

How Dataiku and Snowflake Strengthen the Modern Data Stack

phData

NOVEMBER 4, 2024

With all this packaged into a well-governed platform, Snowflake continues to set the standard for data warehousing and beyond. Snowflake supports data sharing and collaboration across organizations without the need for complex data pipelines. Dataiku and Snowflake: A Good Combo?

Machine Learning

Machine Learning Machine Learning Data Science ML

The Ultimate Modern Data Stack Migration Guide

phData

JULY 18, 2023

With the birth of cloud data warehouses, data applications, and generative AI , processing large volumes of data faster and cheaper is more approachable and desired than ever. First up, let’s dive into the foundation of every Modern Data Stack, a cloud-based data warehouse.

Data Warehouse

Data Warehouse Analytics Analytics Cloud Data

Building an efficient MLOps platform with OSS tools on Amazon ECS with AWS Fargate

AWS Machine Learning Blog

SEPTEMBER 18, 2024

The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Additionally, Feast promotes feature reuse, so the time spent on data preparation is reduced greatly.

AWS

AWS Machine Learning Machine Learning ML

Scale knowledge management use cases with generative AI

IBM Journey to AI blog

JULY 27, 2023

Artificial intelligence is disrupting many different areas of business. Powering a knowledge management system with a data lakehouse Organizations need a data lakehouse to target data challenges that come with deploying an AI-powered knowledge management system. A data lakehouse is a fit-for-purpose data store.

AI

AI AI Data Scientist Data Quality

Data democratization: How data architecture can drive business decisions and AI initiatives

IBM Journey to AI blog

AUGUST 4, 2023

It’s distributed both in the cloud and on-premises, allowing extensive use and movement across clouds, apps and networks, as well as stores of data at rest. An architecture designed for data democratization aims to be flexible, integrated, agile and secure to enable the use of data and artificial intelligence (AI) at scale.

Data Lakes

Data Lakes AI AI Data Governance

Securing the data pipeline, from blockchain to AI

Building Robust Data Pipelines: 9 Fundamentals and Best Practices to Follow

Webinars

Trending Sources

Supercharge your data strategy: Integrate and innovate today leveraging data integration

Webinars

Building Robust Data Pipelines: 9 Fundamentals and Best Practices to Follow

How to use foundation models and trusted governance to manage AI workflow risk

A Primer to Scaling Pandas

The Modern Data Stack Explained: What The Future Holds

What Does a Data Engineering Job Involve in 2024?

Data science vs data analytics: Unpacking the differences

Maximising Efficiency with ETL Data: Future Trends and Best Practices

11 Open Source Data Exploration Tools You Need to Know in 2023

The Data Dilemma: Exploring the Key Differences Between Data Science and Data Engineering

Exploring the AI and data capabilities of watsonx

How data stores and governance impact your AI initiatives

ETL Process Explained: Essential Steps for Effective Data Management

AI-Powered ETL Pipeline Orchestration: Multi-Agent Systems in the Era of Generative AI

11 Open-Source Data Engineering Tools Every Pro Should Use

What is Data Ingestion? Understanding the Basics

How Reveal’s Logikcull used Amazon Comprehend to detect and redact PII from legal documents at scale

The Data Integration Solution Checklist: Top 10 Considerations

How Investment Banks and Asset Managers Should Be Leveraging Data in Snowflake

How to Use Fivetran to Ingest Salesforce Data into Snowflake

How to Shift from Data Science to Data Engineering

Data Analytics in the Age of AI, When to Use RAG, Examples of Data Visualization with D3 and Vega…

Announcing the First Speakers for the 2024 Data Engineering Summit

Guide to Digital Transformation: Data-first Architecture

Choosing the Right ETL Platform: Benefits for Data Integration

Snowflake Snowpark: cloud SQL and Python ML pipelines

Snowflake Snowpark: cloud SQL and Python ML pipelines

Your Complete Roadmap to Become an Azure Data Scientist

Future trends in ETL

Shaping the future: OMRON’s data-driven journey with AWS

How Dataiku and Snowflake Strengthen the Modern Data Stack

The Ultimate Modern Data Stack Migration Guide

Building an efficient MLOps platform with OSS tools on Amazon ECS with AWS Fargate

Scale knowledge management use cases with generative AI

Data democratization: How data architecture can drive business decisions and AI initiatives

Stay Connected