MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

LakeFS: LakeFS is an open-source platform that provides data lake versioning and management capabilities. It sits between the data lake and cloud object storage, allowing you to version and control changes to data lakes at scale. Share features across the organization.

How to Build ETL Data Pipeline in ML

The MLOps Blog

Figure: ETL data pipeline architecture | Source: Author

Data Discovery: Data can be sourced from various types of systems, such as databases, file systems, APIs, or streaming sources. We also need data profiling, i.e., data discovery, to understand whether the data is appropriate for ETL.
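As a rough illustration of what data profiling can look like before an ETL run, here is a minimal sketch using only the Python standard library. The `profile` helper, the column names, and the sample rows are hypothetical and not taken from the article; real pipelines usually rely on a dedicated profiling tool.

```python
from collections import Counter

def profile(rows):
    """Return per-column null count, distinct count, and observed value types."""
    # Collect every column name that appears in any row.
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        # A missing key is treated the same as an explicit None.
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v is not None]
        report[col] = {
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
            "types": dict(Counter(type(v).__name__ for v in non_null)),
        }
    return report

# Hypothetical sample extracted from a source system.
sample_rows = [
    {"user_id": 1, "country": "DE", "amount": 10.5},
    {"user_id": 2, "country": None, "amount": 7.0},
    {"user_id": 3, "country": "DE", "amount": "n/a"},
]

report = profile(sample_rows)
for column, stats in report.items():
    print(column, stats)
```

A mixed-type column such as `amount` here, or a high null count, is the kind of signal that suggests the data needs cleaning before it is fit for ETL.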

Comparing Tools For Data Processing Pipelines

The MLOps Blog

Data Processing: You need to transform the data through computations such as aggregation, filtering, and sorting. Data Storage: You need to store the processed data so it can be retrieved over time, whether in a data warehouse or a data lake.
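The processing and storage stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `orders` records and the `revenue_by_country` table are hypothetical, and an in-memory SQLite database stands in for a real data warehouse or data lake.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw records.
orders = [
    {"country": "DE", "amount": 10.5, "status": "paid"},
    {"country": "FR", "amount": 20.0, "status": "paid"},
    {"country": "DE", "amount": 4.5, "status": "refunded"},
    {"country": "FR", "amount": 5.0, "status": "paid"},
    {"country": "US", "amount": 8.0, "status": "paid"},
]

# Filtering: keep only paid orders.
paid = [o for o in orders if o["status"] == "paid"]

# Aggregation: total amount per country.
totals = defaultdict(float)
for o in paid:
    totals[o["country"]] += o["amount"]

# Sorting: highest total first.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Storage: persist the processed result (in-memory SQLite as a stand-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue_by_country (country TEXT PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO revenue_by_country VALUES (?, ?)", ranked)
conn.commit()

# Retrieval over time: read the stored result back.
stored = conn.execute(
    "SELECT country, total FROM revenue_by_country ORDER BY total DESC"
).fetchall()
print(stored)
```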