Five attention tricks let 8 transformer layers execute a 22-opcode ISA. All of them use a symmetric query/key projection (Q = K) and scale the attention scores by λ = 10, which sharpens the softmax into an approximately hard, near one-hot attention pattern. Click each pattern to see how Q, K, V, and the scores play out on a concrete example.
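As a rough illustration of the shared idea (not the article's own code), here is a minimal sketch of one such head: the query and key projections are literally the same matrix, and the scores are multiplied by λ = 10 before the softmax so the attention weights saturate toward one-hot. The names `W_qk`, `W_v`, and where exactly λ is applied relative to the usual 1/√d factor are assumptions for illustration.

```python
import numpy as np

LAM = 10.0  # assumed score scale; larger values push softmax toward argmax

def sharp_attention(x, W_qk, W_v, lam=LAM):
    """x: (seq, d_model). Q and K share one projection, so scores are symmetric."""
    q = x @ W_qk                      # queries
    k = x @ W_qk                      # keys: identical projection (Q = K)
    v = x @ W_v                       # values
    scores = lam * (q @ k.T) / np.sqrt(k.shape[-1])   # temperature-scaled scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # softmax over keys
    return w @ v, w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    W_qk = rng.normal(size=(8, 8)) / np.sqrt(8)
    W_v = rng.normal(size=(8, 8)) / np.sqrt(8)
    _, w = sharp_attention(x, W_qk, W_v)
    print(np.round(w, 3))  # each row is close to one-hot thanks to lam = 10
```

With λ = 1 the same weights would be noticeably softer; the large scale is what lets softmax stand in for the hard, pointer-like attention the patterns rely on.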