multimodal model

An AI system that can process and combine information from multiple input types (modalities), such as text, images, and audio.