DEV Community

# dataengineering

Posts

Why Most Data Projects Fail Before the First Model Is Built

5
Comments
2 min read
Boosting Lightweight ETL on AWS Lambda & Glue Python Shell with DuckDB and Apache Arrow Dataset

5
Comments
9 min read
Apache Data Lakehouse Weekly: February 26 – March 5, 2026

1
Comments
6 min read
Data Relationship Intelligence Is Infrastructure — Not a Feature

Comments
1 min read
DAY 5 - Production-Grade Feature Engineering

Comments
1 min read
Part 4 | Why State Machines Power Reliable Scheduling Systems

Comments
6 min read
The Two SQL Concepts That Made Me Finally Understand Real Data: Joins & Window Functions.

1
Comments
3 min read
Our Data Extraction Pipeline Worked Perfectly… Until Month 6

1
Comments
2 min read
Share of Shelf Analysis: How to Scrape Zappos Search Results

1
Comments
4 min read
Iterator Patterns: How to Process Millions of Records Without Running Out of Memory

1
Comments
5 min read
The Power of Generic Reading in PySpark: A Unified Approach to Data

1
Comments
3 min read
Introduction to Joins and Window Functions in SQL

Comments
3 min read
Scaling Relationship Discovery Beyond Brute Force

2
Comments
1 min read
Data Engineering for AI Projects: What Most Developers Get Wrong

1
Comments
5 min read
From Statistical Evidence to Executable Data Graphs

1
Comments
1 min read