Overview
Zstandard (zstd) is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It is backed by a very fast entropy stage provided by the Huff0 and FSE (Finite State Entropy) libraries.
Zstandard's format is stable and documented in RFC 8878 (which obsoletes RFC 8478). The reference implementation is an open-source C library with a CLI. It is deployed across Meta and many other large cloud infrastructures, and is continuously fuzzed by Google's OSS-Fuzz.
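Because the frame format is fixed by the spec, a zstd frame can be recognized by its 4-byte magic number, 0xFD2FB528, stored little-endian on disk. A minimal sketch in Python; the `is_zstd_frame` helper is illustrative, not part of any official binding:

```python
# Detect a zstd frame by its magic number (0xFD2FB528, little-endian on disk).
ZSTD_MAGIC = (0xFD2FB528).to_bytes(4, "little")  # b"\x28\xb5\x2f\xfd"

def is_zstd_frame(data: bytes) -> bool:
    """Return True if `data` starts with a zstd frame header."""
    return data[:4] == ZSTD_MAGIC

print(is_zstd_frame(b"\x28\xb5\x2f\xfd" + b"payload"))  # True
print(is_zstd_frame(b"\x1f\x8b\x08\x00"))               # False (gzip magic)
```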
Benchmarks
Tested on Core i7-9700K @ 4.9GHz, Ubuntu 24.04, gcc 14.2.0, Silesia compression corpus:
| Compressor | Ratio | Compression | Decompression |
|---|---|---|---|
| zstd 1.5.7 -1 | 2.896 | 510 MB/s | 1550 MB/s |
| brotli 1.1.0 -1 | 2.883 | 290 MB/s | 425 MB/s |
| zlib 1.3.1 -1 | 2.743 | 105 MB/s | 390 MB/s |
| zstd 1.5.7 --fast=1 | 2.439 | 545 MB/s | 1850 MB/s |
| quicklz 1.5.0 -1 | 2.238 | 520 MB/s | 750 MB/s |
| zstd 1.5.7 --fast=4 | 2.146 | 665 MB/s | 2050 MB/s |
| lzo1x 2.10 -1 | 2.106 | 650 MB/s | 780 MB/s |
| lz4 1.10.0 | 2.101 | 675 MB/s | 3850 MB/s |
| snappy 1.2.1 | 2.089 | 520 MB/s | 1500 MB/s |
| lzf 3.6 -1 | 2.077 | 410 MB/s | 820 MB/s |
Negative compression levels (`--fast=#`) offer faster speed at the cost of ratio. Higher positive levels offer stronger ratios at the cost of compression speed. Decompression speed remains roughly constant across all settings, which is critical for the use case of reloading condensed repos into AI tools.
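This level-versus-ratio tradeoff is common to LZ-family codecs. Since zstd has no binding in the Python standard library, the sketch below uses stdlib `zlib` to illustrate the same principle: a higher level spends more compression effort to shrink the output, while decompression takes the same code path regardless of the level used to compress.

```python
import zlib

# Repetitive sample data; real corpora (e.g. Silesia) differ in degree, not kind.
data = b"the quick brown fox jumps over the lazy dog. " * 200

fast = zlib.compress(data, level=1)   # prioritize compression speed
small = zlib.compress(data, level=9)  # prioritize compression ratio

print(len(fast), len(small))  # level 9 output is no larger than level 1

# Both decompress through the identical code path, back to the same bytes.
assert zlib.decompress(fast) == zlib.decompress(small) == data
```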
Small Data Compression (Dictionary Training)
Compression algorithms learn from past data to compress future data. At the beginning of a new data set, there is no "past" to build upon — making small data inherently harder to compress.
Zstd solves this with training mode: provide sample files to generate a "dictionary" that is loaded before compression/decompression.
Dictionary How-To
- Create a dictionary: `zstd --train FullPathToTrainingSet/* -o dictionaryName`
- Compress with it: `zstd -D dictionaryName FILE`
- Decompress: `zstd -D dictionaryName --decompress FILE.zst`
On the `github-users` sample set (~10K records of ~1KB each), dictionary-based compression achieves dramatically better ratios at faster speeds. For structured key-value data with repetitive key patterns, dictionary compression can yield a 3–4x improvement over non-dictionary zstd on small blocks.
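The small-data effect can be reproduced with any codec that accepts a preset dictionary. zstd itself is not in the Python standard library, so this sketch uses stdlib zlib's `zdict` parameter to show the mechanism: seeding the codec with sample records lets a new, similar record compress far better than it would cold. The records below are made up for illustration, in the spirit of the github-users set.

```python
import zlib

# Hypothetical small KV records with a shared key structure.
samples = [
    b'{"login":"alice","id":1,"type":"User","site_admin":false}',
    b'{"login":"bob","id":2,"type":"User","site_admin":true}',
]
dictionary = b"".join(samples)  # zstd --train builds this far more cleverly

record = b'{"login":"carol","id":3,"type":"User","site_admin":false}'

# Cold compression: the tiny input gives the codec no past to draw on.
cold = zlib.compress(record, 9)

# Preset-dictionary compression: the record compresses mostly as matches
# against the shared key structure already present in the dictionary.
c = zlib.compressobj(9, zdict=dictionary)
warm = c.compress(record) + c.flush()

# The same dictionary must be supplied at decompression time.
d = zlib.decompressobj(zdict=dictionary)
assert d.decompress(warm) == record
print(len(cold), len(warm))  # the dictionary-seeded output is smaller
```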
Build Systems
| System | Command |
|---|---|
| make (reference) | `make` in the root directory |
| cmake | cmake -S . -B build-cmake && cmake --build build-cmake |
| meson | See build/meson |
| vcpkg | vcpkg install zstd |
| conan | conan install --requires="zstd/[*]" --build=missing |
| buck | buck build programs:zstd |
| bazel | Via Bazel Central Repository |
| Visual Studio | Projects in build/ dir or generate via cmake |
macOS Universal2 (Fat) Build
```shell
cmake -S . -B build-cmake-debug -G Ninja -DCMAKE_OSX_ARCHITECTURES="x86_64;x86_64h;arm64"
cd build-cmake-debug
ninja
sudo ninja install
```
Testing
- Quick smoke test: `make check`
- Script-based: `playTests.sh` from `tests/` (set `$ZSTD_BIN` and `$DATAGEN_BIN`)
- CI details: see `TESTING.md`
Key Links
- Homepage: www.zstd.net
- RFC 8878: Zstandard Compression and the application/zstd Media Type (obsoletes RFC 8478)
- Entropy library: github.com/Cyan4973/FiniteStateEntropy
- Language bindings: Available for 30+ languages — see zstd homepage
- Contributing: the `dev` branch is the merge target; direct commits to `release` are not permitted