How Reveal’s Logikcull used Amazon Comprehend to detect and redact PII from legal documents at scale

AWS Machine Learning Blog

Organizations can search for PII using methods such as keyword searches, pattern matching, data loss prevention tools, machine learning (ML), metadata analysis, data classification software, optical character recognition (OCR), document fingerprinting, and encryption.
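As a rough illustration of the pattern-matching approach mentioned above, the sketch below redacts two kinds of PII with regular expressions. The pattern set, the labels, and the `redact_pii` helper are hypothetical and chosen for illustration; this is not the method used by Logikcull or Amazon Comprehend, and real detection needs much broader coverage.

```python
import re

# Hypothetical patterns for illustration only; they cover just two PII types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_pii(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(redact_pii("Contact jane.doe@example.com, SSN 123-45-6789."))
```

Pattern matching of this kind is cheap and predictable, but it misses PII that has no fixed shape (names, addresses), which is where ML-based detection is typically brought in.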

Upcoming Snowflake Features

phData

It allows data engineers familiar with Python and Pandas to run their Pandas code in a scalable and distributed manner. Many more exciting features and updates include AI-powered Object Descriptions, Universal Search, and Sensitive Data Classification with Snowflake Horizon. schemas["my_schema"].tables.create(my_table)

Building Robust Data Pipelines: 9 Fundamentals and Best Practices to Follow

Alation

How much data processing occurs depends on the data’s state when ingested and how far its format is from the desired end state. Most data processing tasks are completed using ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes.
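The difference between the two orderings can be sketched with an in-memory SQLite database standing in for the warehouse. The sample rows, table names, and the `run_etl` / `run_elt` helpers are assumptions made for this example, not part of the article.

```python
import sqlite3

# Hypothetical source records: untrimmed names and amounts stored as text.
RAW_ROWS = [("  Alice ", "10"), ("BOB", "25")]


def run_etl(conn: sqlite3.Connection) -> None:
    """ETL: transform in application code, then load the finished rows."""
    cleaned = [(name.strip().lower(), int(amount)) for name, amount in RAW_ROWS]
    conn.execute("CREATE TABLE etl_orders (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO etl_orders VALUES (?, ?)", cleaned)


def run_elt(conn: sqlite3.Connection) -> None:
    """ELT: load the raw rows untouched, then transform with SQL in the database."""
    conn.execute("CREATE TABLE raw_orders (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", RAW_ROWS)
    conn.execute(
        "CREATE TABLE elt_orders AS "
        "SELECT LOWER(TRIM(name)) AS name, CAST(amount AS INTEGER) AS amount "
        "FROM raw_orders"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    print(conn.execute("SELECT * FROM etl_orders ORDER BY name").fetchall())
    print(conn.execute("SELECT * FROM elt_orders ORDER BY name").fetchall())
```

Both paths end with the same cleaned table; ELT additionally keeps the raw rows in the target system, so transformations can be rerun or revised later without re-extracting.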

Using Snowflake Data as an Insurance Company

phData

Masked data provides a cost-effective way to help test if a system or design will perform as expected in real-life scenarios. As the insurance industry continues to generate a wider range and volume of data, it becomes more challenging to manage data classification.

Alation 2022.1: Customize Your Data Catalog

Alation

Through Impact Analysis, users can determine if a problem occurred with data upstream, and locate the impacted data downstream. With robust data lineage, data engineers can find and fix issues fast and prevent them from recurring. Similarly, analysts gain a clear view of how data is created.
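Conceptually, locating impacted data downstream is a graph traversal over lineage edges. The sketch below is a generic breadth-first search over a hypothetical lineage map; the dataset names and the `downstream_impact` function are invented for illustration and do not represent Alation's API or implementation.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets built directly from it.
LINEAGE = {
    "raw_orders": ["clean_orders"],
    "clean_orders": ["daily_revenue", "customer_ltv"],
    "daily_revenue": ["exec_dashboard"],
    "customer_ltv": [],
    "exec_dashboard": [],
}


def downstream_impact(lineage: dict, source: str) -> set:
    """Return every dataset reachable downstream of `source` (breadth-first)."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    print(sorted(downstream_impact(LINEAGE, "clean_orders")))
```

Reversing the edges and running the same traversal would give the upstream view, which is how one would look for where a problem originated.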

How to Migrate from dbt Core to dbt Cloud: phData’s Simplified Approach

phData

These projects should include all functional areas within the data platform, including analytics engineering, machine learning, and data science. Data governance and data classification are potential reasons to separate projects in dbt Cloud.
