For decades, arithmetic coding and its variants (range coding, ANS) have been the gold standard for entropy coding:
Why they dominate: near-optimal compression (approaching the Shannon entropy bound), flexible probability models, and a proven mathematical foundation
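For context, the interval-subdivision idea at the heart of arithmetic coding can be sketched in a few lines. This is a hypothetical float-based toy (the model, message, and function names are mine, not from any particular codec); production coders work on integer ranges with renormalization to avoid floating-point loss:

```python
# Toy arithmetic coder: each symbol narrows [low, high) in proportion
# to its probability; any number in the final interval names the message.

def encode(symbols, model):
    """Narrow the interval once per symbol; return a point inside it."""
    low, high = 0.0, 1.0
    for s in symbols:
        span = high - low
        cum = 0.0
        for sym, p in model:
            if sym == s:
                high = low + span * (cum + p)
                low = low + span * cum
                break
            cum += p
    return (low + high) / 2  # any value in [low, high) would do

def decode(x, model, n):
    """Replay the subdivision, reading off which slice contains x."""
    low, high = 0.0, 1.0
    out = []
    for _ in range(n):
        span = high - low
        cum = 0.0
        for sym, p in model:
            if low + span * cum <= x < low + span * (cum + p):
                out.append(sym)
                high = low + span * (cum + p)
                low = low + span * cum
                break
            cum += p
    return "".join(out)
```

Note that the decoder needs the message length `n` (or a terminator symbol) to know when to stop, a detail real formats handle explicitly.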
During a robotics seminar, a conversation sparked an entirely new direction...
"Have you looked at expander graphs for your compression problem? They have incredible mixing properties that might give you better state space coverage than traditional arithmetic coding."
This led down a rabbit hole exploring:
What if entropy coding could use graph-theoretic structures instead of interval subdivision?
Expander graph properties:
• Sparse (few edges per vertex)
• High connectivity (hard to cut)
• Rapid mixing (random walks converge quickly to the stationary distribution)
• Spectral gap (large separation between the first and second eigenvalues)
Compression analogy:
State space = graph vertices
Transitions = edges
Symbol encoding = path selection
Used in: Error-correcting codes, derandomization, network routing, pseudorandomness
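As a concrete (and admittedly naive) reading of the analogy above, here is a toy sketch in which a symbol stream drives a walk on a small labeled graph: state = vertex, symbol = edge choice, and decoding means reading the edge labels back off the walk. The graph, its labeling, and the function names are all invented for illustration; the example is merely 3-regular, not a genuine expander:

```python
# neighbors[v][s] = vertex reached from vertex v on symbol s (s in 0..2).
# Each row has distinct entries, so the symbol can be recovered from a
# pair of consecutive vertices.
neighbors = {
    0: [1, 3, 2],
    1: [2, 0, 3],
    2: [3, 1, 0],
    3: [0, 2, 1],
}

def encode_walk(symbols, start=0):
    """Map a symbol stream to the vertex walk it selects."""
    path = [start]
    for s in symbols:
        path.append(neighbors[path[-1]][s])
    return path

def decode_walk(path):
    """Recover each symbol as the edge label joining consecutive vertices."""
    return [neighbors[u].index(v) for u, v in zip(path, path[1:])]
```

By itself this is just a state machine, not compression; the open question in the text is whether the expander's mixing and spectral properties could make such walks a useful substrate for an actual entropy coder.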
Nondeterministic automaton properties:
• Multiple valid transitions
• Parallel computation paths
• State explosion (exponential blow-up under determinization)
• Choice-based encoding
Compression analogy:
Current state = encoder position
Symbol = choice of transition
Next state = multiple possibilities
Decode = follow the chosen path deterministically
Used in: Complexity theory (NP), formal verification, parsing algorithms
Could these concepts from graph theory and theoretical computer science lead to a fundamentally different approach to entropy coding?
Next: Exploring whether these theoretical concepts can beat 50+ years of arithmetic coding optimization...