Computer Vision Toolbox Model for OpenAI CLIP Network

by MathWorks Computer Vision Toolbox Team

The Contrastive Language-Image Pre-Training (CLIP) network is a vision-language model that can be used for joint image-text classification.

18 Downloads

Updated 15 Oct 2025

The CLIP network uses contrastive learning to encode images and text into a shared feature space for joint classification. An image and a text description that are highly similar lie close together in this feature space and have a high CLIP score. This also enables searching for images from input text and searching for text from an input image.
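The shared-feature-space idea can be illustrated with a minimal Python sketch. It does not use the toolbox's API, which this description does not show. The embeddings are hand-written toy vectors standing in for encoder outputs, the helper names (`l2_normalize`, `cosine_similarity`, `rank_by_similarity`) are hypothetical, and cosine similarity is assumed as a stand-in for the CLIP score, whose exact definition is not given here.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine of the angle between two embeddings (1.0 means same direction)."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(query, candidates):
    """Return candidate names sorted from most to least similar, plus the scores."""
    scores = {name: cosine_similarity(query, emb) for name, emb in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

# Toy embeddings (assumption): real ones would come from the image and text encoders.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
}

# Classification: pick the text prompt whose embedding is closest to the image.
ranking, scores = rank_by_similarity(image_embedding, text_embeddings)
print(ranking[0])
```

Swapping the roles, with one text embedding as the query and many image embeddings as candidates, gives the image-search case with the same ranking function.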

MATLAB Release Compatibility

Created with R2026a

Compatible with R2026a

Platform Compatibility

Windows, macOS (Apple Silicon), macOS (Intel), Linux
