Tests for TensorScatter(opset 24) + Attention(opset 24) pattern.
- GQA path (kv_num_heads != q_num_heads) uses flash attention for external KV cache (fp16/bf16)
- MHA path (kv_num_heads == q_num_heads ...
// RUN: %clang_cc1 -mllvm -emptyline-comment-coverage=false -fprofile-instrument=clang -fcoverage-mapping -dump-coverage-mapping -emit-llvm-only -main-file-name ...