While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.
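As a hedged illustration of that pattern, here is a minimal batch ETL sketch in Python. The source and warehouse databases, table names, and column layout are placeholder assumptions for illustration, not any specific vendor's API.

```python
import sqlite3  # stand-in for an operational database driver
from datetime import datetime, timezone

def extract(conn):
    """Pull the last day's orders from the transactional store."""
    return conn.execute(
        "SELECT id, amount, created_at FROM orders "
        "WHERE created_at >= date('now', '-1 day')"
    ).fetchall()

def transform(rows):
    """Normalize amounts to cents and tag each row with a load timestamp."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [
        {"order_id": r[0], "amount_cents": int(round(r[1] * 100)),
         "created_at": r[2], "loaded_at": loaded_at}
        for r in rows
    ]

def load(warehouse_conn, records):
    """Append the batch into a warehouse staging table (hypothetical schema)."""
    warehouse_conn.executemany(
        "INSERT INTO stg_orders (order_id, amount_cents, created_at, loaded_at) "
        "VALUES (:order_id, :amount_cents, :created_at, :loaded_at)",
        records,
    )
    warehouse_conn.commit()

if __name__ == "__main__":
    src = sqlite3.connect("operational.db")  # hypothetical source
    dst = sqlite3.connect("warehouse.db")    # hypothetical warehouse
    load(dst, transform(extract(src)))
```

A streaming variant would replace the daily extract with a consumer reading from a message queue, but the transform/load shape stays the same.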
This can be useful for data scientists who need to streamline their data science pipeline or automate repetitive tasks. It provides access to a vast database of scholarly articles and books, as well as tools for literature review and data analysis.
Agent Creator is a versatile extension to the SnapLogic platform that is compatible with modern databases, APIs, and even legacy mainframe systems, fostering seamless integration across various data environments. The resulting vectors are stored in OpenSearch Service databases for efficient retrieval and querying.
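The snippet doesn't show how the vectors are written, but a minimal sketch with the opensearch-py client might look like the following; the endpoint, index name, embedding dimension, and vector values are all assumptions for illustration.

```python
from opensearchpy import OpenSearch

# Hypothetical endpoint; in practice this would be the OpenSearch Service domain.
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Create a k-NN-enabled index to hold embedding vectors.
client.indices.create(
    index="doc-vectors",  # assumed index name
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": 3},  # toy dimension
            }
        },
    },
)

# Store one vector, then query for its nearest neighbor.
client.index(
    index="doc-vectors",
    body={"text": "example chunk", "embedding": [0.1, 0.2, 0.3]},
    refresh=True,
)
hits = client.search(index="doc-vectors", body={
    "size": 1,
    "query": {"knn": {"embedding": {"vector": [0.1, 0.2, 0.3], "k": 1}}},
})
print(hits["hits"]["hits"][0]["_source"]["text"])
```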
Aspiring and experienced Data Engineers alike can benefit from a curated list of books covering essential concepts and practical techniques. These 10 Best Data Engineering Books for beginners encompass a range of topics, from foundational principles to advanced data processing methods. What is Data Engineering?
Image source: Pixel Production Inc. In the previous article, you were introduced to the intricacies of data pipelines, including the two major types of existing data pipelines. You might be curious how a simple tool like Apache Airflow can be powerful for managing complex data pipelines.
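To make that concrete, here is a minimal sketch of an Airflow DAG chaining an extract step into a load step; the task bodies, DAG name, and schedule are placeholder assumptions rather than anything from the article.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: pull records from a source system.
    print("extracting...")

def load():
    # Placeholder: write records to the warehouse.
    print("loading...")

with DAG(
    dag_id="example_etl",           # assumed name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # Airflow 2.4+ spelling of schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task       # extract must finish before load starts
```

Airflow's value over a plain script shows up in exactly this structure: the `>>` dependency operator, retries, and scheduling come for free once tasks are declared.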
The SnapLogic Intelligent Integration Platform (IIP) enables organizations to realize enterprise-wide automation by connecting their entire ecosystem of applications, databases, big data, machines and devices, APIs, and more with pre-built, intelligent connectors called Snaps.
Its sales analysts face a daily challenge: they need to make data-driven decisions but are overwhelmed by the volume of available information. They have structured data such as sales transactions and revenue metrics stored in databases, alongside unstructured data such as customer reviews and marketing reports collected from various channels.
Before a bank can start the process of certifying a risk model, it first needs to understand what data is being used and how that data changes as it moves from a database to a model. The value of data lineage applies across all industries, but there are three key focus areas to consider for banking use cases.
Building a Dataset for Triplet Loss with Keras and TensorFlow: in today's tutorial, we will take the first step toward building our real-time face recognition application. Table of contents: Project Structure; Creating Our Configuration File; Creating Our Data Pipeline; Preprocessing Faces: Detection and Cropping; Summary; Citation Information. The dataset.py
In the previous tutorial of this series, we built the dataset and data pipeline for our Siamese network-based face recognition application. Specifically, we looked at an overview of triplet loss and discussed what kind of data samples are required to train our model with the triplet loss. Download the code!
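As a rough sketch of what such a pipeline can look like (not the tutorial's actual code), the snippet below builds batches of (anchor, positive, negative) images with tf.data; the file-path lists and image size are assumptions.

```python
import tensorflow as tf

IMG_SIZE = (128, 128)  # assumed input resolution

def load_image(path):
    """Read, decode, resize, and scale an image to [0, 1]."""
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, IMG_SIZE)
    return img / 255.0

def make_triplet_dataset(anchors, positives, negatives, batch_size=32):
    """anchors/positives/negatives are parallel lists of image file paths."""
    ds = tf.data.Dataset.from_tensor_slices((anchors, positives, negatives))
    ds = ds.map(
        lambda a, p, n: (load_image(a), load_image(p), load_image(n)),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    return ds.shuffle(1024).batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

The key property for triplet loss is that the three streams stay aligned per example, so each batch element is a valid (anchor, positive, negative) triplet.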
You have a specific book in mind, but you have no idea where to find it. You enter the title of the book into the computer and the library’s digital inventory system tells you the exact section and aisle where the book is located. After all, Alex may not be aware of all the data available to her.
Data integration is essentially the Extract and Load portion of the Extract, Load, and Transform (ELT) process. Data ingestion involves connecting your data sources, including databases, flat files, streaming data, etc., to your data warehouse. Snowflake provides native ways to handle data ingestion.
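As one hedged example of such a native path, the sketch below stages a local CSV file and bulk-loads it with COPY INTO using the snowflake-connector-python package; the account, credentials, table, and file names are placeholders.

```python
import snowflake.connector

# Placeholder credentials; in practice these come from a secrets manager.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="***",
    warehouse="COMPUTE_WH", database="ANALYTICS", schema="RAW",
)
cur = conn.cursor()

# Upload the file to the table's internal stage, then bulk-load it.
cur.execute("PUT file:///tmp/orders.csv @%ORDERS")  # hypothetical file and table
cur.execute("""
    COPY INTO ORDERS
    FROM @%ORDERS
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""")

cur.close()
conn.close()
```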
Anyone building anything net-new publishes to Snowflake in a database driven by the use case and uses our commoditized web-based GUI ingestion framework. Data Pipeline Capabilities: This team's scope is massive because the data pipelines are huge and there are many different capabilities embedded in them.
A cloud data warehouse takes a concept that every organization knows, the data warehouse, and optimizes its components for the cloud. If you'd like a more personalized look into the potential of Snowflake for your business, definitely book one of our free Snowflake migration assessment sessions.
An optional CloudFormation stack to deploy a data pipeline to enable a conversation analytics dashboard. Booking – This demonstrates an example of routing the caller to a live agent queue. Choose an option for allowing unredacted logs for the Lambda function in the data pipeline. Choose Create data source.
Market participants who are receiving either live or historical data feeds need to ingest this data and perform one or more steps, such as parse the message out of a binary protocol, rebuild the limit order book (LOB), or combine multiple feeds into a single normalized format.
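Rebuilding a limit order book from a feed is, at its core, maintaining a price-level map per side and applying each message to it. Below is a minimal, feed-agnostic sketch in Python; the message fields are assumptions, not from any particular binary protocol.

```python
from dataclasses import dataclass, field

@dataclass
class OrderBook:
    """Price level -> aggregate size, kept separately per side."""
    bids: dict = field(default_factory=dict)
    asks: dict = field(default_factory=dict)

    def apply(self, msg):
        """msg: {'side': 'bid'|'ask', 'price': float, 'size': float}.
        A size of 0 deletes the level; otherwise the level is replaced."""
        book = self.bids if msg["side"] == "bid" else self.asks
        if msg["size"] == 0:
            book.pop(msg["price"], None)
        else:
            book[msg["price"]] = msg["size"]

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None

lob = OrderBook()
lob.apply({"side": "bid", "price": 100.5, "size": 200})
lob.apply({"side": "ask", "price": 100.7, "size": 150})
print(lob.best_bid(), lob.best_ask())  # 100.5 100.7
```

Production systems typically swap the plain dicts for sorted containers so that best bid/ask lookups stay O(1), but the message-application logic is the same.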
Whenever anyone talks about data lineage and how to achieve it, the spotlight tends to shine on automation. This is expected, as automating the process of calculating and establishing lineage is crucial to understanding and maintaining a trustworthy system of data pipelines.
What is Semi-structured Data? Semi-structured data, also called partially structured data, is a form that does not adhere to the conventional tabular structure found in relational databases or other data tables. Semi-structured data can come from many sources, including applications, sensors, and mobile devices.
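For instance, a JSON event from a mobile device carries structure (keys and nesting) without a fixed relational schema, and events may differ in which fields they carry. A common first step is to flatten such records, as in this small sketch using pandas; the payload shape is purely illustrative.

```python
import pandas as pd

# Illustrative semi-structured payloads: fields vary per event, nesting is free-form.
events = [
    {"device": "phone-1", "ts": "2024-01-01T12:00:00Z",
     "reading": {"temp_c": 21.5, "battery": 0.83}},
    {"device": "sensor-7", "ts": "2024-01-01T12:00:05Z",
     "reading": {"temp_c": 19.2}, "firmware": "2.1"},  # extra key, missing battery
]

# json_normalize flattens nested keys into dotted columns (e.g. 'reading.temp_c');
# keys absent from a record simply become NaN in that row.
df = pd.json_normalize(events)
print(df)
```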
We will understand the dataset and the data pipeline for our application and discuss the salient features of the NSL framework in detail, walking through config.py, the data pipeline, model.py, and robust.py. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
“This sounds great in theory, but how does it work in practice with customer data or something like a ‘composable CDP’?” Well, implementing transitional modeling does require a shift in how we think about and work with customer data. It often involves specialized databases designed to handle this kind of atomic, temporal data.
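In spirit, that means storing each fact as an immutable, time-stamped assertion rather than overwriting a row. The toy sketch below shows the idea; the field names and in-memory log are an illustrative assumption, not any specific product's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Assertion:
    """One atomic, temporal fact about an entity; never updated, only appended."""
    entity: str      # e.g. a customer id
    attribute: str   # e.g. 'email'
    value: str
    asserted_at: datetime

log: list[Assertion] = []

def assert_fact(entity, attribute, value):
    log.append(Assertion(entity, attribute, value,
                         datetime.now(timezone.utc)))

def current_value(entity, attribute):
    """Latest assertion wins; the full history stays queryable."""
    matches = [a for a in log if a.entity == entity and a.attribute == attribute]
    return max(matches, key=lambda a: a.asserted_at).value if matches else None

assert_fact("cust-42", "email", "old@example.com")
assert_fact("cust-42", "email", "new@example.com")
print(current_value("cust-42", "email"))  # new@example.com
```

Because nothing is ever overwritten, you can ask what was believed about a customer at any point in time, which is the property conventional mutable customer tables give up.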
The solution extracts valuable insights from diverse data sources, including OEM transactions, vehicle specifications, social media reviews, and OEM QRT reports. By employing a multi-modal approach, the solution connects relevant data elements across various databases.
To further enrich the dataset, Fastweb generated synthetic Italian data using LLMs. High-quality Italian web articles, books, and other texts served as the basis for training the LLMs to generate authentic-sounding synthetic content that captured the nuances of the language.
With new tools and frameworks emerging weekly, it's natural to focus on tangible things we can control: which vector database to use, which LLM provider to choose, which agent framework to adopt. This isn't surprising. The real challenge is ensuring your synthetic data actually triggers the scenarios you want to test.