Tests a lookup table of attention patterns, keyed by token identity (or by fuzzy n-gram), that is built during prefill and queried during decode, and tests its use for adaptive KV-cache quantization. The full write-up is in ...
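
To make the idea concrete, here is a minimal sketch of a token-keyed table, not the implementation from the write-up. It assumes the stored "attention pattern" is simply the mean attention mass a token received during prefill, and that quantization picks between two bit widths by threshold. The names `AttentionLookup`, `build_during_prefill`, and `choose_bits`, the threshold, and the bit widths are all hypothetical; the fuzzy n-gram variant is not shown.

```python
from collections import defaultdict


class AttentionLookup:
    """Hypothetical table: token id -> running mean of attention mass received."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def record(self, token_id, attention_mass):
        self._sum[token_id] += attention_mass
        self._count[token_id] += 1

    def query(self, token_id, default=0.0):
        # Tokens never seen during prefill fall back to `default`.
        n = self._count.get(token_id, 0)
        return self._sum[token_id] / n if n else default


def build_during_prefill(token_ids, attn):
    """Fill the table from a causal attention matrix.

    attn[q][k] is the weight that query position q puts on key position k.
    """
    table = AttentionLookup()
    for k, tok in enumerate(token_ids):
        # Average over the query positions that can see key position k.
        received = [attn[q][k] for q in range(k, len(token_ids))]
        table.record(tok, sum(received) / len(received))
    return table


def choose_bits(table, token_id, threshold=0.4, high_bits=8, low_bits=4):
    """Assumed policy: keep more bits for tokens that tend to attract attention."""
    return high_bits if table.query(token_id) >= threshold else low_bits


# Toy prefill: three positions, token 7 appears twice.
token_ids = [7, 3, 7]
attn = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.5, 0.25, 0.25],
]
table = build_during_prefill(token_ids, attn)

# Decode-time query: pick a KV-cache bit width per token id.
for tok in (7, 3, 99):
    print(tok, choose_bits(table, tok))
```

In this toy run, token 7 averages more attention mass than token 3, so it keeps the higher bit width, while the unseen token 99 falls back to the default and gets the lower one.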