Multimodal large language models are beginning to transform science education by combining text, visuals, audio, and other data to enrich teaching and learning. From analyzing classroom interactions ...