Businesses are under pressure to show return on investment (ROI) from AI use cases, whether predictive machine learning (ML) or generative AI. Only 54% of ML prototypes make it to production, and only 5% of generative AI use cases do. Using SageMaker, you can build, train, and deploy ML models.
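As a rough illustration of that build-train-deploy flow (a minimal sketch using the SageMaker Python SDK; the training script, role ARN, and S3 paths are placeholders, not from the original post):

from sagemaker.sklearn import SKLearn

# Train a scikit-learn model on SageMaker-managed infrastructure.
estimator = SKLearn(
    entry_point="train.py",                                # hypothetical training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",   # hypothetical execution role
    instance_count=1,
    instance_type="ml.m5.large",
    framework_version="1.2-1",
    py_version="py3",
)
estimator.fit({"train": "s3://my-bucket/churn/train/"})    # hypothetical S3 input channel

# Deploy the trained model behind a real-time endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")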
Generally available on May 24, Alation introduces the Open Data Quality Initiative for the modern data stack, giving customers the freedom to choose the data quality vendor that’s best for them with the added confidence that those tools will integrate seamlessly with Alation’s Data Catalog and Data Governance application.
This is the first one, where we look at some functions for data quality checks, which are the initial steps I take in EDA. We will use this table to demo and test our custom functions. Let’s get started. 🤠 🔗 All code and config are available on GitHub. The three functions below are created for this purpose.
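The author’s three functions aren’t reproduced in this excerpt; as a rough illustration of what such initial data quality checks often look like in pandas (function names and behavior here are assumptions, not the original code):

import pandas as pd

def data_quality_report(df: pd.DataFrame) -> pd.DataFrame:
    # Per-column snapshot: dtype, missing values, and distinct values.
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(2),
        "n_unique": df.nunique(),
    })

def duplicate_check(df: pd.DataFrame, subset=None) -> int:
    # Count fully duplicated rows, or duplicates on a subset of key columns.
    return int(df.duplicated(subset=subset).sum())

# Example usage on a demo table:
# df = pd.read_csv("demo_table.csv")
# print(data_quality_report(df))
# print("duplicates:", duplicate_check(df))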
Alignment to other tools in the organization’s tech stack: consider how well the MLOps tool integrates with your existing tools and workflows, such as data sources, data engineering platforms, code repositories, CI/CD pipelines, monitoring systems, and data structures like Pandas or Apache Spark DataFrames.
These vary from challenges in getting data, maintaining various data forms and kinds, and coping with inconsistent data quality to the crucial need for current information. – Application layer: This layer emphasizes the potential of FinGPT in the financial industry by showcasing real-world applications and demos.
We are excited to announce the launch of Amazon DocumentDB (with MongoDB compatibility) integration with Amazon SageMaker Canvas, allowing Amazon DocumentDB customers to build and use generative AI and machine learning (ML) solutions without writing code. On the Import data page, for Data Source, choose DocumentDB and Add Connection.
Boost engagement and adoption with integrated, persona-based insights: access tailored, role-specific data quality scores, technical details, and relationships at a glance. Enhanced Data Catalog: with new visual card views, you gain automated data quality scores, technical details, and tailored information for different roles.
Building a demo is one thing; scaling it to production is an entirely different beast. It has already inspired me to set new goals for 2025, and I hope it can do the same for other ML engineers. It has also opened up a bunch of new possibilities for ML engineers. Everything changed when DeepSeek burst onto the scene a month ago.
Solution overview SageMaker Canvas brings together a broad set of capabilities to help data professionals prepare, build, train, and deploy ML models without writing any code. SageMaker Data Wrangler has also been integrated into SageMaker Canvas, reducing the time it takes to import, prepare, transform, featurize, and analyze data.
Machine learning practitioners often work with data at the beginning and throughout the full stack, so they see a lot of workflow/pipeline development, data wrangling, and data preparation. Some of the issues make perfect sense as they relate to data quality, with common issues being bad/unclean data and data bias.
In the application pipeline, teams can swap:
– Logging inputs + responses to various data sources (database, stream, file, etc.)
– Additional data sources (RAG, web search, etc.)
– Classical ML models and LLMs
If using QLoRA to fine-tune, teams can swap out domain-specific fine-tuned adapters while using the same base model (e.g. …), as in the sketch below.
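A minimal sketch of that adapter swap, assuming the Hugging Face transformers and peft libraries; the base model name and adapter paths are hypothetical, not from the original post:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load the shared, quantized base model once (assumed model id).
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# Attach one domain-specific QLoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, "adapters/finance-qlora")  # hypothetical path

# Later, swap in a different domain adapter without reloading the base weights.
model.load_adapter("adapters/legal-qlora", adapter_name="legal")         # hypothetical path
model.set_adapter("legal")

Because only the small adapter weights change, the quantized base model stays resident in memory across swaps.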
The last attribute, Churn, is the attribute that we want the ML model to predict. See the following code:

# Configure the Data Quality Baseline Job
# Configure the transient compute environment
check_job_config = CheckJobConfig(
    role=role_arn,
    instance_count=1,
    instance_type="ml.c5.xlarge",
)
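For context, a minimal sketch of how such a config is typically wired into a SageMaker Pipelines data quality baseline step; the S3 URIs and step name below are placeholders, not taken from the original post:

from sagemaker.model_monitor.dataset_format import DatasetFormat
from sagemaker.workflow.quality_check_step import DataQualityCheckConfig, QualityCheckStep

# Describe where the baseline (training) data lives and where results should go.
data_quality_check_config = DataQualityCheckConfig(
    baseline_dataset="s3://my-bucket/churn/train.csv",    # hypothetical S3 URI
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/churn/data-quality",    # hypothetical S3 URI
)

# Pipeline step that computes statistics/constraints and registers them as the baseline.
data_quality_check_step = QualityCheckStep(
    name="DataQualityCheck",
    check_job_config=check_job_config,
    quality_check_config=data_quality_check_config,
    skip_check=True,              # first run: establish the baseline rather than enforce it
    register_new_baseline=True,
)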
At DagsHub, we're building a platform to simplify your ML workflows. Every project consists of data, experiments, and models. At DagsHub we manage all those, and focus on helping you build and improve the quality of unstructured datasets, so you get high-performing models. Let’s dive into some details.
We couldn’t be more excited to announce our first group of partners for ODSC East 2023’s AI Expo and Demo Hall. These organizations are shaping the future of the AI and data science industries with their innovative products and services. Check them out below.
At the AI Expo and Demo Hall as part of ODSC West next week, you’ll have the opportunity to meet one-on-one with representatives from industry-leading organizations like Plot.ly, Google, Snowflake, Microsoft, and plenty more. Delphina Demo: AI-powered Data Scientist Jeremy Hermann | Co-founder at Delphina | Delphina.Ai
Artificial intelligence and machine learning (AI and ML) are removing some of the burden of manual metadata management, which has grown too cumbersome for people to manage alone. Data intelligence integrates intelligence derived from active metadata into categories like data quality, governance, and profiling.
At the AI Expo and Demo Hall as part of ODSC West in a few weeks, you’ll have the opportunity to meet one-on-one with representatives from industry-leading organizations like Microsoft Azure, Hewlett Packard, Iguazio, neo4j, Tangent Works, Qwak, Cloudera, and others.
Unfortunately accessing data across various locations and file types and then operationalizing that data for AI usage has traditionally been a painfully manual, time-consuming, and costly process. Ahmad Khan, Head of AI/ML Strategy at Snowflake, discusses the challenges of operationalizing ML in a recent talk.
Building a machine learning (ML) pipeline can be a challenging and time-consuming endeavor. Inevitably concept and data drift over time cause degradation in a model’s performance. For an ML project to be successful, teams must build an end-to-end MLOps workflow that is scalable, auditable, and adaptable.
Tuesday is the first day of the AI Expo and Demo Hall , where you can connect with our conference partners and check out the latest developments and research from leading tech companies. Finally, get ready for some All Hallows Eve fun with Halloween Data After Dark , featuring a costume contest, candy, and more. What’s next?
Artificial intelligence and machine learning (AI/ML) offer new avenues for credit scoring solutions and could usher in a new era of fairness, efficiency, and risk management. Traditional credit scoring models rely on static variables and historical data like income, employment, and debt-to-income ratio. Supercharge predictive modeling.
Databricks customers can now access millions of rows of data seamlessly within the Snorkel Flow platform thanks to a new Databricks connector. Weeks later, on June 29, Snorkel AI Founding Engineer and Product Director Vincent Chen will present “Building AI-Powered Products with Foundation Models” at the Databricks Data + AI Summit.
From gathering and processing data to building models through experiments, deploying the best ones, and managing them at scale for continuous value in production—it’s a lot. As the number of ML-powered apps and services grows, it gets overwhelming for data scientists and ML engineers to build and deploy models at scale.
Request a demo to see how watsonx can put AI to work. There’s no AI without IA: AI is only as good as the data that informs it, and the need for the right data foundation has never been greater. A data lakehouse with multiple query engines and storage can allow engineers to share data in open formats.
It went from simple rule-based systems to advanced data-driven algorithms. Today, real-time trading decisions are made by AI using the combined power of big data, machine learning (ML), and predictive analytics. ML algorithms: AI models employ ML to adjust, find correlations, and gradually improve accuracy.
Snorkel AI and Google Cloud have partnered to help organizations successfully transform raw, unstructured data into actionable AI-powered systems. Snorkel Flow easily deploys on Google Cloud infrastructure, ingests data from Google Cloud data sources, and integrates with Google Cloud’s AI and Data Cloud services.
Key Takeaways Data Fabric is a modern data architecture that facilitates seamless data access, sharing, and management across an organization. Data management recommendations and data products emerge dynamically from the fabric through automation, activation, and AI/ML analysis of metadata.
They are characterized by their enormous size, complexity, and the vast amount of data they process. These elements need to be taken into consideration when managing, streamlining and deploying LLMs in ML pipelines, hence the specialized discipline of LLMOps. Data Pipeline - Manages and processes various data sources.
Few nonusers (2%) report that lack of data or data quality is an issue, and only 1.3% cite the difficulty of training a model on their data. AI users are definitely facing these problems: 7% report that data quality has hindered further adoption, and 4% cite the difficulty of training a model on their data. Deploying and managing AI products isn’t simple.
This is where visualizations in ML come in. Graphical representations of structures and data flow within a deep learning model make its complexity easier to comprehend and enable insight into its decision-making process. Data scientists and ML engineers: Creating and training deep learning models is no easy feat.
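As a small illustration of that kind of structural view (a sketch, not from the original article), printing a PyTorch model already exposes its layer hierarchy, and optional tools such as torchinfo add output shapes and parameter counts:

import torch.nn as nn

# A small example network whose structure we want to inspect.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),   # assumes 32x32 RGB inputs
)

# The built-in repr shows the layer hierarchy and the order data flows through it.
print(model)

# Optional: richer summaries (output shapes, parameter counts) via torchinfo, if installed.
# from torchinfo import summary
# summary(model, input_size=(1, 3, 32, 32))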
Businesses face significant hurdles when preparing data for artificial intelligence (AI) applications. The existence of data silos and duplication, alongside apprehensions regarding data quality, presents a multifaceted environment for organizations to manage.
Data management - Ensuring data quality through data ingestion, transformation, cleansing, versioning, tagging, labeling, indexing, and more. Development - High-quality model training, fine-tuning or prompt tuning, validation, and deployment with CI/CD for ML.
Our advanced search capability means that we can build the easiest-to-use and massively powerful interfaces for data stewards. Our ability to leverage ML for each allows us to make operations teams far more efficient. Our comprehensive connectors mean that we can build lineage much more efficiently. However, we can’t do it alone.
If your dataset is not in time order (time consistency is required for accurate Time Series projects), DataRobot can fix those gaps using the DataRobot Data Prep tool , a no-code tool that will get your data ready for Time Series forecasting. Prepare your data for Time Series Forecasting. Configuring an ML project.
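For readers who want the same idea outside DataRobot, a minimal pandas sketch of sorting a series into time order and making gaps explicit at a regular frequency (the file and column names are hypothetical):

import pandas as pd

# Hypothetical raw data with an unordered, gappy timestamp column.
df = pd.read_csv("sales.csv", parse_dates=["date"])

# Sort into time order, then reindex to a regular daily frequency so gaps become explicit rows.
df = (
    df.sort_values("date")
      .set_index("date")
      .asfreq("D")              # missing days appear as NaN rows
)

# Fill the gaps, e.g. forward-fill the target so the series is continuous for forecasting.
df["sales"] = df["sales"].ffill()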
Instead of exclusively relying on a singular data development technique, leverage a variety of techniques such as prompting, RAG, and fine-tuning for the most optimal outcome. Focus on improving data quality and transforming manual data development processes into programmatic operations to scale fine-tuning.
In the next section, let’s take a deeper look into how these key attributes help data scientists and analysts make faster, more informed decisions, while supporting stewards in their quest to scale governance policies on the Data Cloud easily. Find Trusted Data: verifying quality is time-consuming.
Large Model Quality and Evaluation Anoop Sinha | Research Director, AI & Future Technologies | Google. Large model development faces many challenges when it comes to ML quality and evaluation, including coverage, scale, and the wide range of use cases LLMs serve. Check out a few of them below.
We also show a banking chatbot demo that includes fine-tuning a model and adding guardrails. Data Management - Ensuring data quality through data ingestion, transformation, cleansing, versioning, tagging, labeling, indexing, and more. Using the same data for model improvement. The four pipelines include: 1.