1.1 PostgreSQL: TOAST Compression (pglz vs LZ4)
PostgreSQL uses TOAST (The Oversized-Attribute Storage Technique) to compress and out-of-line store values exceeding roughly 2 KB. Two algorithms are available since PostgreSQL 14:
| Metric | pglz (default) | LZ4 |
|---|---|---|
| Compression ratio | ~2.23x | ~2.07x |
| Compression speed | Baseline | 5x faster (~20% of pglz time) |
| Compression speed under parallel load | Baseline | ~60–70% faster (parallel compression; SELECT impact is ~20% — see "Query time impact") |
| Query time impact | Baseline | 20% faster queries |
| Full benchmark speedup | — | 37.32% faster end-to-end |
| CPU overhead | 2x that of LZ4 | Baseline |
Configuration: ALTER TABLE t ALTER COLUMN c SET COMPRESSION lz4; or globally via default_toast_compression = lz4
Columnar Extensions: Citus and Hydra
- Citus Columnar: 3–10x compression ratios; benchmark data shows 5.4x for a typical mixed-type table vs heap storage.
- Hydra Columnar: Open-source; in ClickBench comparisons, Hydra outperformed Citus, TimescaleDB, Aurora, and standard PostgreSQL on the 42-query benchmark.
- TimescaleDB Hypercore: Achieves up to 95% compression on time-series data using dictionary compression for high-repetition JSONB columns.
1.2 MySQL / InnoDB: Page Compression and Table Compression
| Method | Algorithm | Compression Ratio | Performance Overhead |
|---|---|---|---|
| Barracuda Table Compression (ROW_FORMAT=COMPRESSED) | zlib | 50–75% size reduction | ~60% for write-heavy workloads |
| Transparent Page Compression (InnoDB page) | zlib or LZ4 | Variable | 10–17% (dramatically lower) |
Guidance: For new deployments, transparent page compression with LZ4 provides the best latency-to-savings tradeoff. Reserve Barracuda table compression for cold/archival tables where write overhead is acceptable.
1.3 ClickHouse: Per-Column Codec Selection
ClickHouse provides the most granular compression control of any production database. Each column can be assigned a pipeline of pre-compression codecs followed by a general-purpose algorithm.
| Codec | Best For | Mechanism |
|---|---|---|
| Delta | Monotonic sequences, timestamps | Stores difference between consecutive values |
| DoubleDelta | Periodic timestamps, increasing counters | Delta-of-deltas; exceptional for constant-interval series |
| Gorilla | Slowly changing floats, gauge metrics | XOR between consecutive values |
| T64 | Sparse integers, values small relative to type range | 64x64 bit matrix transpose + bit truncation |
| FPC | Float64 columns | Two-prediction finite context method |
| LZ4 | General purpose, speed-critical | Default; fast compression and decompression |
| ZSTD(level) | General purpose, ratio-critical | 30% better ratio than LZ4; higher CPU |
Real-World Benchmarks (NOAA Weather Dataset)
| Schema Stage | Compressed Size | Uncompressed Size | Ratio |
|---|---|---|---|
| Naive (all String) | 4.07 GiB | 131.58 GiB | 32.36:1 |
| Type-optimized | 3.83 GiB | 71.38 GiB | 18.63:1 |
| String/type refined | 3.81 GiB | 35.34 GiB | 9.28:1 |
| Optimal codecs | 1.42 GiB | 36.05 GiB | 25.36:1 |
Moving from naive types to optimal codecs reduced compressed storage by 65% (4.07 GiB to 1.42 GiB). Standout: Date column with DoubleDelta reduced 2.24 GiB to 24.28 MiB (~99% reduction).
1.4 SQLite: Page-Level Compression Options
SQLite has no built-in compression. External strategies fill the gap:
| Solution | Mechanism | Size Reduction | Query Impact |
|---|---|---|---|
| ZIPVFS (official, commercial) | Per-page compression | Variable | Low overhead |
| sqlite-zstd (Rust extension) | Row-level zstd with dictionary | 75% (442 MB to 67 MB typical) | Sequential: -22%, Random: +34% faster |
| sqlite_zstd_vfs | Page-level VFS streaming | 45–65% | -25% query throughput |
| ZFS/Btrfs filesystem | Transparent block-level | LZ4: ~67%, zstd: up to 90% | Minimal to none |
1.5 Cross-Engine Compression Ratio Summary
| Engine | Strategy | Typical Ratio | Notes |
|---|---|---|---|
| PostgreSQL (heap) | TOAST LZ4 | 2–2.2x | Row-level only |
| PostgreSQL (columnar) | Citus/Hydra | 3–10x | Analytical tables |
| MySQL InnoDB | Transparent page (LZ4) | 2–4x | 10–17% overhead |
| ClickHouse | LZ4 default | 5–15x | Column-oriented |
| ClickHouse | Optimal codecs + ZSTD | 15–50x | Per-column tuning |
| SQLite | sqlite-zstd + dictionary | 4–6x | Extension required |
| TimescaleDB | Hypercore columnar | 10–20x | Time-series optimized |