Data is the foundation for capturing the maximum value from AI technology and solving business problems quickly. To unlock the potential of generative AI technologies, however, there’s a key prerequisite: your data needs to be appropriately prepared.
Implementing a data fabric architecture is the answer. What is a data fabric? IBM defines it as “an architecture that facilitates the end-to-end integration of various data pipelines and cloud environments through the use of intelligent and automated systems.” By automating that integration work, it leaves more time for data analysis.
Generative artificial intelligence (gen AI) is transforming the business world by creating new opportunities for innovation, productivity and efficiency. This guide offers a clear roadmap for businesses to begin their gen AI journey. Most teams should include at least four types of team members.
Conventional ML development cycles take weeks to many months and require scarce data science understanding and ML development skills. Business analysts’ ideas for using ML models often sit in prolonged backlogs because of data engineering and data science teams’ bandwidth and data preparation activities.
Data scientists and ML engineers require capable tooling and sufficient compute for their work. To pave the way for the growth of AI, BMW Group needed to make a leap regarding scalability and elasticity while reducing operational overhead, software licensing, and hardware management.
This post presents a solution that uses generative artificial intelligence (AI) to standardize air quality data from low-cost sensors in Africa, specifically addressing the air quality data integration problem of low-cost sensors. Qiong (Jo) Zhang, PhD, is a Senior Partner Solutions Architect at AWS, specializing in AI/ML.
In the following sections, we provide a detailed, step-by-step guide on implementing these new capabilities, covering everything from data preparation to job submission and output analysis. This use case serves to illustrate the broader potential of the feature for handling diverse data processing tasks.
AIOps refers to the application of artificial intelligence (AI) and machine learning (ML) techniques to enhance and automate various aspects of IT operations (ITOps). However, they differ fundamentally in their purpose and level of specialization in AI and ML environments.
watsonx.ai is our enterprise-ready next-generation studio for AI builders, bringing together traditional machine learning (ML) and new generative AI capabilities powered by foundation models. With watsonx.ai, businesses can effectively train, validate, tune and deploy AI models with confidence and at scale across their enterprise.
See also Thoughtworks’ guide to Evaluating MLOps Platforms. End-to-end MLOps platforms provide a unified ecosystem that streamlines the entire ML workflow, from data preparation and model development to deployment and monitoring. Unlike manual, homegrown, or open-source solutions, neptune.ai
Yunus focused on building a robust data pipeline, merging historical and current-season data to create a comprehensive dataset. Yunus secured third place by delivering a flexible, well-documented solution that bridged data science and Formula 1 strategy. Follow Ocean on Twitter or Telegram to stay up to date.
In today’s landscape, AI is becoming a major focus in developing and deploying machine learning models. It isn’t just about writing code or creating algorithms — it requires robust pipelines that handle data, model training, deployment, and maintenance. As datasets grow, scalable data ingestion and storage become critical.
Purina used artificial intelligence (AI) and machine learning (ML) to automate animal breed detection at scale. The solution focuses on the fundamental principles of developing an AI/ML application workflow of data preparation, model training, model evaluation, and model monitoring.
Unfortunately, even the data science industry — which should recognize tabular data’s true value — often underestimates its relevance in AI. Many mistakenly equate tabular data with business intelligence rather than AI, leading to a dismissive attitude toward its sophistication.
Continuous ML model retraining is one method to overcome this challenge by relearning from the most recent data. This requires not only well-designed features and ML architecture, but also data preparation and ML pipelines that can automate the retraining process.
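As a rough sketch of the retraining step itself (not the full pipeline), the function below refits a model on the most recent data with scikit-learn. The `label` column name and the `model.joblib` path are illustrative assumptions, and a real pipeline would gate promotion on a validation threshold rather than persisting unconditionally.

```python
# Minimal retraining sketch; data source, label column, and model path are hypothetical.
import joblib
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def retrain(recent_data: pd.DataFrame, model_path: str = "model.joblib") -> float:
    """Refit a candidate model on the most recent data and persist it."""
    X = recent_data.drop(columns=["label"])
    y = recent_data["label"]
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

    candidate = GradientBoostingClassifier().fit(X_train, y_train)
    auc = roc_auc_score(y_val, candidate.predict_proba(X_val)[:, 1])

    # A production pipeline would compare `auc` to the current model before promoting.
    joblib.dump(candidate, model_path)
    return auc
```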
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, ML, and application development. Above all, this solution offers you a native Spark way to implement an end-to-end data pipeline from Amazon Redshift to SageMaker.
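As a generic approximation of that pattern (not the article’s exact Glue code), the sketch below reads a Redshift table over JDBC with Spark and stages it in S3 as Parquet for SageMaker to consume. The endpoint, credentials, table, columns, and bucket are all placeholders.

```python
# Sketch: pull a Redshift table with Spark and stage it in S3 as Parquet
# for downstream SageMaker training. All connection details are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("redshift-to-s3").getOrCreate()

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:redshift://my-cluster:5439/dev")  # placeholder endpoint
    .option("dbtable", "public.training_events")           # placeholder table
    .option("user", "analyst")
    .option("password", "****")
    .load()
)

# Keep only the columns needed for training and drop incomplete rows.
features = df.select("customer_id", "feature_1", "feature_2", "label").dropna()
features.write.mode("overwrite").parquet("s3://my-bucket/ml/training-data/")  # placeholder bucket
```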
Effective data governance enhances quality and security throughout the data lifecycle. What is Data Engineering? Data Engineering is designing, constructing, and managing systems that enable data collection, storage, and analysis. These systems are crucial in ensuring data is readily available for analysis and reporting.
Ray AI Runtime (AIR) reduces friction of going from development to production. Amazon SageMaker Pipelines allows orchestrating the end-to-end ML lifecycle from data preparation and training to model deployment as automated workflows. We set up an end-to-end Ray-based ML workflow, orchestrated using SageMaker Pipelines.
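For context, a minimal SageMaker Pipelines skeleton with a processing step feeding a training step might look like the sketch below; the role ARN, training image URI, script name, and S3 paths are placeholders, and the Ray-specific pieces from the post are omitted.

```python
# Rough two-step SageMaker Pipeline skeleton (processing -> training).
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.processing import ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

processor = SKLearnProcessor(framework_version="1.2-1", role=role,
                             instance_type="ml.m5.xlarge", instance_count=1)
prep_step = ProcessingStep(
    name="PrepareData",
    processor=processor,
    code="preprocess.py",  # your data preparation script
    outputs=[ProcessingOutput(output_name="train", source="/opt/ml/processing/train")],
)

estimator = Estimator(
    image_uri="<training-image-uri>",  # placeholder
    role=role, instance_count=1, instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/models/",  # placeholder
)
train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={"train": TrainingInput(
        prep_step.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri)},
)

pipeline = Pipeline(name="example-ml-workflow", steps=[prep_step, train_step])
pipeline.upsert(role_arn=role)
pipeline.start()
```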
The primary goal of Data Engineering is to transform raw data into a structured and usable format that can be easily accessed, analyzed, and interpreted by Data Scientists, analysts, and other stakeholders. Future of Data Engineering: the Data Engineering market will expand from $18.2
Today’s data management and analytics products have infused artificial intelligence (AI) and machine learning (ML) algorithms into their core capabilities. These modern tools will auto-profile the data, detect joins and overlaps, and offer recommendations. 2) Line of business is taking a more active role in data projects.
Efficient data transformation and processing are crucial for data analytics and generating insights. Snowflake AI Data Cloud is one of the most powerful platforms, including storage services supporting complex data. Integrating Snowflake with dbt adds another layer of automation and control to the data pipeline.
Machine learning (ML), a subset of artificial intelligence (AI), is an important piece of data-driven innovation. Machine learning engineers take massive datasets and use statistical methods to create algorithms that are trained to find patterns and uncover key insights in data mining projects.
In order to train a model using data stored outside of the three supported storage services, the data first needs to be ingested into one of these services (typically Amazon S3). This requires building a data pipeline (using tools such as Amazon SageMaker Data Wrangler) to move data into Amazon S3.
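A minimal version of that ingestion step, assuming a local CSV produced by an upstream preparation job and placeholder bucket and key names, could be as simple as a boto3 upload:

```python
# Minimal sketch: stage a local dataset in Amazon S3 so SageMaker can train on it.
# Bucket, key, and file names are placeholders.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="data/train.csv",       # local file produced by the preparation step
    Bucket="my-ml-bucket",           # placeholder bucket
    Key="datasets/churn/train.csv",  # placeholder key
)

# The resulting URI can then be passed to a SageMaker estimator as a training channel.
train_uri = "s3://my-ml-bucket/datasets/churn/train.csv"
```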
Summary: This article explores the significance of ETL Data in Data Management. It highlights key components of the ETL process, best practices for efficiency, and future trends like AI integration and real-time processing, ensuring organisations can leverage their data effectively for strategic decision-making.
Ali Arsanjani, director of cloud partner engineering at Google Cloud, presented a talk entitled “Challenges and Ethics of DLM and LLM Adoption in the Enterprise” at Snorkel AI’s recent Foundation Model Virtual Summit. Data preparation, train and tune, deploy and monitor. Hope you can all hear me well.
Data Scientists and Data Analysts have been using ChatGPT for Data Science to generate code and answers rapidly. Data Manipulation is the process through which you change data to meet your project requirements for further data analysis.
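The kind of snippet such tools are typically asked to draft is ordinary pandas manipulation; the toy `orders` DataFrame and the conversion rate below are invented purely for illustration.

```python
# Toy data manipulation example: filter rows, derive a column, and aggregate with pandas.
import pandas as pd

orders = pd.DataFrame({
    "region": ["EU", "US", "EU", "US"],
    "amount": [120.0, 80.0, 200.0, 50.0],
    "returned": [False, False, True, False],
})

kept = orders[~orders["returned"]].copy()    # drop returned orders
kept["amount_usd"] = kept["amount"] * 1.08   # hypothetical conversion rate
summary = kept.groupby("region")["amount_usd"].sum()
print(summary)
```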
Data engineers, data scientists and other data professionals have been racing to implement gen AI into their engineering efforts. LLM examples include GPT, BERT, and similar advanced AI systems. They are characterized by their enormous size, complexity, and the vast amount of data they process.
Alteryx provides organizations with an opportunity to automate access to data, analytics, data science, and process automation all in one, end-to-end platform. Its capabilities can be split into the following topics: automating inputs & outputs, data preparation, data enrichment, and data science.
Future Trends and Innovations The evolving landscape of data, characterised by exponential growth, increasing complexity, and the imperative for real-time insights, is driving rapid advancements in data quality practices. AI and Machine Learning These are emerging as powerful tools for enhancing data quality.
Informatica Informatica is a powerful enterprise-grade data management platform offering a range of tools for data integration, transformation, and governance. It supports batch and real-time data processing, making it a preferred choice for large enterprises with complex data workflows.
In this blog, we will provide a comprehensive overview of ETL considerations, introduce key tools such as Fivetran, Salesforce, and Snowflake AI Data Cloud, and demonstrate how to set up a pipeline and ingest data between Salesforce and Snowflake using Fivetran. What is Fivetran?
Ahmad Khan, head of artificial intelligence and machine learning strategy at Snowflake, gave a presentation entitled “Scalable SQL + Python ML Pipelines in the Cloud” about his company’s Snowpark service at Snorkel AI’s Future of Data-Centric AI virtual conference in August 2022. Welcome everybody.
Data Preparation: Cleaning, transforming, and preparing data for analysis and modelling. The Microsoft Certified: Azure Data Scientist Associate certification is highly recommended, as it focuses on the specific tools and techniques used within Azure.
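A common way to package that cleaning-and-transforming step in Python is a scikit-learn ColumnTransformer; the column names below are illustrative stand-ins, not tied to the Azure tooling the certification covers.

```python
# Sketch of a reusable preparation step: impute and scale numeric columns,
# one-hot encode categoricals. Column names are illustrative.
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric = ["age", "tenure_months"]
categorical = ["plan", "country"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# X_prepared = preprocess.fit_transform(raw_df)  # raw_df: your input DataFrame
```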
Snowpark use cases in data science: streamlining data preparation and pre-processing. Snowpark’s Python, Java, and Scala libraries allow data scientists to use familiar tools for wrangling and cleaning data directly within Snowflake, eliminating the need for separate ETL pipelines and reducing context switching.
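A hedged sketch of that pattern with the Snowpark Python API follows; the connection parameters and the RAW_CUSTOMERS / CLEAN_CUSTOMERS table names are placeholders.

```python
# Sketch: wrangle data inside Snowflake with Snowpark, no separate ETL pipeline.
# Connection parameters and table names are placeholders.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, trim, upper

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "<warehouse>", "database": "<db>", "schema": "<schema>",
}).create()

raw = session.table("RAW_CUSTOMERS")
clean = (
    raw.filter(col("EMAIL").is_not_null())                    # drop rows without email
       .with_column("COUNTRY", upper(trim(col("COUNTRY"))))   # normalize country codes
       .drop_duplicates("CUSTOMER_ID")                        # keep one row per customer
)
clean.write.save_as_table("CLEAN_CUSTOMERS", mode="overwrite")
```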
These activities cover disparate fields such as basic data processing, analytics, and machine learning (ML). And finally, some activities, such as those involved with the latest advances in artificial intelligence (AI), are simply not practically possible without hardware acceleration.
A traditional machine learning (ML) pipeline is a collection of various stages that include data collection, data preparation, model training and evaluation, hyperparameter tuning (if needed), model deployment and scaling, monitoring, security and compliance, and CI/CD.
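Several of those stages can be illustrated compactly with scikit-learn: preparation, training, evaluation, and persisting the artifact for deployment. The dataset and file name are stand-ins; monitoring, CI/CD, and the remaining stages sit outside this sketch.

```python
# Compact illustration of a few pipeline stages: prep, train, evaluate, persist.
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)                          # data collection (stand-in)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)                                         # training
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))   # evaluation
joblib.dump(model, "model.joblib")                                   # hand-off toward deployment
```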
Computer vision is a subfield of artificial intelligence (AI) that teaches computers to see, observe, and interpret visual cues in the world. Preprocess data to mirror real-world deployment conditions. Thorough validation procedures: Evaluate model performance on unseen data during validation, resembling real-world distribution.
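One way to approximate “mirror real-world deployment conditions” in a vision pipeline is through augmentation at preprocessing time; the torchvision transforms and parameter values below are illustrative assumptions, not the article’s recipe.

```python
# Sketch: augmentations that roughly mimic deployment conditions
# (variable lighting, out-of-focus frames). Values are illustrative.
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),  # variable lighting
    transforms.GaussianBlur(kernel_size=5),                # out-of-focus frames
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

eval_transforms = transforms.Compose([                      # validation mirrors deployment
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```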
We then go over all the project components and processes, from data preparation, model training, and experiment tracking to model evaluation, to equip you with the skills to construct your own emotion recognition model.
Pipelines and platforms are related concepts in MLOps, but they refer to different aspects of the machine learning workflow. The platform typically includes components for the ML ecosystem like data management, feature stores, experiment trackers, a model registry, a testing environment, model serving, and model management.
It must integrate seamlessly across data technologies in the stack to execute various workflows—all while maintaining a strong focus on performance and governance. Two key technologies that have become foundational for this type of architecture are the Snowflake AI Data Cloud and Dataiku. Let’s say your company makes cars.
The ZMP analyzes billions of structured and unstructured data points to predict consumer intent by using sophisticated artificial intelligence (AI) to personalize experiences at scale. Further expanding the capabilities of AI in marketing, Zeta Global has developed AI Lookalikes.
Time and time again, we hear about the need for AI to support cross-functional teams and users, to provide the ability to integrate diverse data sources, and to offer the flexibility to deploy AI solutions anywhere. The DataRobot AI Cloud Platform is the culmination of nearly a decade of pioneering AI innovation, representing 1.5