Building a Better Entropy Coder

The Arithmetic Coder Dominance

For decades, arithmetic coding and its variants (range coding, ANS) have been the gold standard for entropy coding:

Arithmetic Coding

1970s

Rissanen, Pasco, Martin

Range Coding

1990s

Simplified variant, patent-free

ANS (tANS/rANS)

2014

Jarosław Duda, faster decoding

Why they dominate: Near-optimal compression (approaching Shannon entropy), flexible probability models, proven mathematics

A Chance Encounter at a Robotics Seminar

During a robotics seminar, a conversation sparked an entirely new direction...

"Have you looked at expander graphs for your compression problem? They have incredible mixing properties that might give you better state space coverage than traditional arithmetic coding."

— Robotics researcher discussing path planning algorithms

This led down a rabbit hole exploring:

Expander graphs – sparse graphs with strong connectivity properties
Nondeterministic Turing machines – computational models with multiple possible transitions
State space exploration – alternative approaches to entropy coding

Key Insight

What if entropy coding could use graph-theoretic structures instead of interval subdivision?

Expander Graphs

Properties:
• Sparse (few edges per vertex)
• High connectivity (hard to cut)
• Rapid mixing (fast convergence)
• Spectral gap (eigenvalue properties)

Compression analogy:
State space = graph vertices
Transitions = edges
Symbol encoding = path selection

Used in: Error-correcting codes, derandomization, network routing, pseudorandomness

Nondeterministic Turing Machines

Properties:
• Multiple valid transitions
• Parallel computation paths
• State explosion (exponential)
• Choice-based encoding

Compression analogy:
Current state = encoder position
Symbol = choice of transition
Next state = multiple possibilities
Decode = follow deterministic path

Used in: Complexity theory (NP), formal verification, parsing algorithms

The Challenge

Could these concepts from graph theory and theoretical computer science lead to a fundamentally different approach to entropy coding?

Traditional Approach

Interval Subdivision

Arithmetic coding narrows interval [0,1) based on symbol probabilities

Graph-Theoretic Approach?

State Space Navigation

Traverse expander graph using symbol-driven path selection

Nondeterministic Approach?

Choice Encoding

Encode which transition among multiple valid options

Next: Exploring whether these theoretical concepts can beat 50+ years of arithmetic coding optimization...