Attention Patterns in Loom

Five attention patterns let eight transformer layers execute a 22-opcode ISA. All of them use a symmetric query/key projection (Q = K) with temperature scaling (λ = 10), so the softmax closely approximates hard attention. Click each pattern to see how Q, K, V, and the attention scores work on a concrete example.
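To make the shared mechanism concrete, here is a minimal sketch of temperature-scaled attention with a shared Q=K projection. The names (`W_qk`, `W_v`, `lam`) are illustrative placeholders, not Loom's actual parameters; the point is that multiplying the scores by a large λ before the softmax pushes the weights toward a one-hot distribution, i.e. hard attention.

```python
import numpy as np

def sharp_attention(x, W_qk, W_v, lam=10.0):
    """Sketch: attention with one shared projection for Q and K,
    and scores scaled by lam so softmax approximates an argmax.
    (Names here are hypothetical, not Loom's real API.)"""
    q = x @ W_qk                 # queries and keys share the same weights,
    k = x @ W_qk                 # so the score matrix is symmetric
    v = x @ W_v
    scores = lam * (q @ k.T)     # large lam widens the gap between scores
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # softmax ~ one-hot for large lam
    return w @ v
```

With λ = 10, a score margin of 1 already gives the winning position weight e^10 / (e^10 + …), i.e. essentially all of the attention mass, which is why the softmax behaves like a hard lookup.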