Building a Better Entropy Coder

The Arithmetic Coder Dominance

For decades, arithmetic coding and its variants (range coding, ANS) have been the gold standard for entropy coding:

Arithmetic Coding
1970s
Rissanen, Pasco, Martin
Range Coding
1990s
Simplified variant, patent-free
ANS (tANS/rANS)
2014
Jarosław Duda, faster decoding

Why they dominate: Near-optimal compression (approaching Shannon entropy), flexible probability models, proven mathematics

A Chance Encounter at a Robotics Seminar

During a robotics seminar, a conversation sparked an entirely new direction...

"Have you looked at expander graphs for your compression problem? They have incredible mixing properties that might give you better state space coverage than traditional arithmetic coding."

— Robotics researcher discussing path planning algorithms

This led down a rabbit hole exploring:

  • Expander graphs – sparse graphs with strong connectivity properties
  • Nondeterministic Turing machines – computational models with multiple possible transitions
  • State space exploration – alternative approaches to entropy coding
Key Insight

What if entropy coding could use graph-theoretic structures instead of interval subdivision?

Expander Graphs

Properties:
• Sparse (few edges per vertex)
• High connectivity (hard to cut)
• Rapid mixing (fast convergence)
• Spectral gap (eigenvalue properties)

Compression analogy:
State space = graph vertices
Transitions = edges
Symbol encoding = path selection

Used in: Error-correcting codes, derandomization, network routing, pseudorandomness

Nondeterministic Turing Machines

Properties:
• Multiple valid transitions
• Parallel computation paths
• State explosion (exponential)
• Choice-based encoding

Compression analogy:
Current state = encoder position
Symbol = choice of transition
Next state = multiple possibilities
Decode = follow deterministic path

Used in: Complexity theory (NP), formal verification, parsing algorithms

The Challenge

Could these concepts from graph theory and theoretical computer science lead to a fundamentally different approach to entropy coding?

Traditional Approach
Interval Subdivision
Arithmetic coding narrows interval [0,1) based on symbol probabilities
Graph-Theoretic Approach?
State Space Navigation
Traverse expander graph using symbol-driven path selection
Nondeterministic Approach?
Choice Encoding
Encode which transition among multiple valid options

Next: Exploring whether these theoretical concepts can beat 50+ years of arithmetic coding optimization...