Voiceover (00:01):
Welcome to Bytes from SkadBytes, jargon-free, byte-size insights from Skadden's IP and Tech team on the key issues shaping the tech landscape.
Akvile Jaseviciute (00:12):
Hi, I'm Akvile Jaseviciute from the IP and Technology team here in Skadden London. Here's your quick byte on unimodal and multimodal AI, the difference between the two, and why it matters. Let's start with the basics. The difference between unimodal and multimodal AI models is the type and diversity of data they can process. Unimodal AI models analyze and process one type of data.
(00:36):
For example, a chatbot trained on only written language is unimodal, as is a facial recognition system that only processes visual data. Multimodal AI combines and interprets multiple types of data, like text, images, video, and audio, within a single model architecture, allowing it to understand relationships across those modes. Imagine a virtual assistant that reads your emails, understands your calendar, and can recognize your facial expressions all in real time. Many of today's most powerful models are multimodal foundation models, pretrained on vast and diverse data sets, then fine-tuned for specific tasks, raising complex licensing and data provenance issues. Multimodal AI opens up new product capabilities, but also raises training data and IP issues, increasing the risk of infringing third-party IP rights.
(01:26):
This can blur the line between content creation and content analysis, which raises copyright and ownership questions. The EU AI Act requires companies that use AI to classify the risk of their AI systems by making a distinction between unimodal and multimodal AI. Multimodal AI models, particularly those designed for broad applicability, are more likely to be classified as general-purpose AI or GPAI under the EU AI Act. That can mean extra transparency, documentation, and copyright safeguards, especially if the model is available downstream. In short, as the capabilities of AI expand, so do the legal and ethical responsibilities that come with them. This makes understanding the distinction between unimodal and multimodal AI foundational to managing legal risk.
Voiceover (02:11):
Thanks for listening to Bytes. Be sure to subscribe for more tech insights. Additional information about Skadden can be found at skadden.com.