Multi-Context Architecture

Mixture of Experts Context Selection

Borrowed from PPM/PPMD

HoloCodec adapts the multi-order context strategy of Prediction by Partial Matching (PPM) algorithms. Rather than relying on a single fixed context window, the system maintains parallel probability models, one for each context order.

Parallel Context Modeling

For each context length k ∈ {0, 1, 2, 3, ..., n}, HoloCodec maintains:

P_k(s | context[−k:]) = softmax(W_k · φ_k(context[−k:]))

  • k = 0: Order-0 (global)
  • k = 1..5: Short contexts
  • k = 6+: Long contexts
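The equation leaves the feature map φ_k and the training of W_k unspecified. Below is a minimal sketch of the parallel modeling step; the byte alphabet, the hashed feature map, and the untrained random weights are all placeholder assumptions, chosen only to make the shapes concrete.

```python
import numpy as np

ALPHABET = 256      # byte-level symbol alphabet (assumption)
MAX_ORDER = 8       # highest context order n (assumption)
EMBED_DIM = 33      # size of the context feature phi_k (assumption)

rng = np.random.default_rng(0)

# One projection matrix W_k per context order k = 0..MAX_ORDER (untrained here).
W = [rng.normal(scale=0.1, size=(ALPHABET, EMBED_DIM))
     for _ in range(MAX_ORDER + 1)]

def phi(context: bytes, k: int) -> np.ndarray:
    """Hypothetical feature map phi_k: hashes the last k symbols.

    Slot 0 is a constant bias, so the order-0 model reduces to a
    learned global distribution rather than a uniform one.
    """
    feat = np.zeros(EMBED_DIM)
    feat[0] = 1.0
    window = context[-k:] if k > 0 else b""
    for i, sym in enumerate(window):
        feat[1 + (sym * 31 + i) % (EMBED_DIM - 1)] += 1.0
    return feat

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def parallel_predictions(context: bytes) -> list:
    """P_k(s | context[-k:]) for every order k, per the equation above."""
    return [softmax(W[k] @ phi(context, k)) for k in range(MAX_ORDER + 1)]
```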

Mixture of Experts (MoE) Selection

At each encoding step, HoloCodec employs a gating mechanism to determine which context length provides the most confident prediction:

1. Compute Confidence Scores: Evaluate prediction confidence for each context order k

2. Expert Gating: g_k = confidence_score(P_k) → which expert is most certain?

3. Select Winner: k* = argmax_k(g_k) → choose the highest-confidence context

4. Encode Symbol: Use the P_{k*} distribution for range encoding (sketched below)
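The slides do not fix the confidence score, so the sketch below uses negative entropy as a stand-in gate (a peaked P_k scores high); max probability would be another reasonable choice. It continues the code above, and the range-coder call in step 4 is a hypothetical API, not a real library.

```python
def confidence(p: np.ndarray) -> float:
    """Gating score g_k = -H(P_k): negative entropy in bits.
    (Assumption: the slides leave the exact score unspecified.)"""
    return float(np.sum(p * np.log2(p + 1e-12)))

def select_expert(context: bytes):
    """Steps 1-3: score every order and pick the most confident expert."""
    preds = parallel_predictions(context)
    gates = [confidence(p) for p in preds]   # steps 1-2: g_k per order
    k_star = int(np.argmax(gates))           # step 3: k* = argmax_k(g_k)
    return k_star, preds[k_star]

# Step 4: hand P_{k*} to the entropy coder for the current symbol.
# The coder call below is a hypothetical API, shown only for shape:
# k_star, p = select_expert(history)
# range_encoder.encode(symbol, p)
```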

Adaptive Context Switching

This approach allows the compressor to dynamically select the optimal context length based on local data characteristics:

  • Repetitive patterns: Longer contexts capture exact matches
  • Novel sequences: Shorter contexts provide fallback distributions
  • Mixed content: Seamless switching between context orders (see the per-symbol loop below)
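Continuing the sketch, the switching loop below gates on the history before every symbol, so each position is free to pick a different order. With the untrained weights above, the printed orders only illustrate the mechanism, not real compression behavior.

```python
def encode_stream(data: bytes) -> list:
    """Per-symbol adaptive switching: every position may pick a new order."""
    chosen_orders = []
    for i in range(len(data)):
        k_star, p = select_expert(data[:i])   # gate on the history so far
        chosen_orders.append(k_star)
        # range_encoder.encode(data[i], p)    # hypothetical coder call
    return chosen_orders

print(encode_stream(b"abcabcabcabc"))   # one chosen order per symbol
```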

Key Insight

Unlike traditional PPM, which uses escape codes to blend context orders, HoloCodec makes a decisive selection per symbol based on confidence. This reduces overhead and lets the model fully commit to the most informative context; the two policies are contrasted in the sketch below.
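To make the contrast concrete, here is a hedged sketch of both policies over the same set of expert outputs: the winner-take-all selection described above next to a soft mixture, which is a rough stand-in for the back-off weighting that PPM's escape codes implement.

```python
def hard_select(preds: list) -> np.ndarray:
    """HoloCodec policy: commit fully to the single most confident expert."""
    gates = [confidence(p) for p in preds]
    return preds[int(np.argmax(gates))]

def soft_blend(preds: list) -> np.ndarray:
    """PPM-style alternative (rough approximation): mix all orders by
    confidence. Probability mass leaks to uncertain experts, which is
    the overhead the hard selection above avoids."""
    w = np.exp([confidence(p) for p in preds])
    w = w / w.sum()
    return sum(wi * p for wi, p in zip(w, preds))
```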
