Instead of keeping the kernel in a separate .cu file, embed its source directly in the test file. This is handy for quick experiments: saving an edit re-runs the test immediately.
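As a rough illustration of what an embedded kernel can look like, here is a minimal sketch. It assumes the test file is Python and that CuPy's `RawKernel` compiles the source at run time; neither is stated above, so adapt it to whatever harness you actually use. The kernel name `scale` and the helper `run_scale` are hypothetical.

```python
import textwrap

# Kernel source kept inline in the test file instead of a separate .cu file.
# The kernel `scale` is a hypothetical example, not taken from any project.
KERNEL_SRC = textwrap.dedent(r"""
    extern "C" __global__
    void scale(float* data, float factor, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            data[i] *= factor;
        }
    }
""")


def run_scale(values, factor):
    """Compile the inline kernel and run it on `values`.

    Returns the scaled values as a list, or None when CuPy or a CUDA
    device is unavailable (assumption: CuPy is the runtime compiler).
    """
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() < 1:
            return None
    except Exception:
        # No CuPy installed, or no usable CUDA driver/device.
        return None

    kernel = cp.RawKernel(KERNEL_SRC, "scale")
    data = cp.asarray(values, dtype=cp.float32)
    n = int(data.size)
    threads = 128
    blocks = (n + threads - 1) // threads  # round up so every element is covered
    kernel((blocks,), (threads,), (data, cp.float32(factor), cp.int32(n)))
    return cp.asnumpy(data).tolist()


def test_scale():
    result = run_scale([1.0, 2.0, 3.0], 2.0)
    if result is None:
        return  # nothing to check without a GPU
    assert result == [2.0, 4.0, 6.0]
```

Because the kernel lives in a string next to the assertion that checks it, one edit and one save touch both, which is what makes this layout convenient for short experiments.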