Embedding Models Overview
Transform text, images, and video into numerical vectors that capture semantic meaning, enabling powerful search, similarity comparison, and classification capabilities.
Available Models
Amazon Models
- Titan Embed Text V2: Latest text embedding model with improved quality
- Titan Embed Text: Reliable text embedding model
- Titan Embed Image: Multimodal embeddings for text and images
- Titan E1M Medium: Multimodal text and image embeddings
- Nova 2 Multimodal Embeddings: Advanced multimodal embeddings for text and images
Cohere Models
- Embed V4: Latest multimodal embedding model for text and images
- Embed English: Optimized for English text
- Embed Multilingual: Support for 100+ languages
Twelve Labs Models
- Marengo Embed 3.0: Video and text embeddings for multimodal search
- Marengo Embed 2.7: Video and text embeddings
Model Capabilities
Semantic Search
Find relevant content based on meaning, not just keywords
Similarity Comparison
Measure semantic similarity between texts
Classification
Categorize content based on embedded features
Multimodal Search
Search across text, images, and video content
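Similarity comparison and classification both reduce to comparing vectors. The sketch below uses cosine similarity, a common choice that this page does not itself prescribe, and a nearest-centroid rule for classification. The function names and the 3-dimensional toy vectors are illustrative assumptions; real embeddings have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def classify(vector: Sequence[float], centroids: Dict[str, Sequence[float]]) -> str:
    """Return the label whose centroid is most similar to the vector."""
    return max(centroids, key=lambda name: cosine_similarity(vector, centroids[name]))


# Toy vectors standing in for real embeddings (hypothetical values).
centroids = {"sports": [1.0, 0.1, 0.0], "finance": [0.0, 0.2, 1.0]}
query = [0.9, 0.2, 0.1]
label = classify(query, centroids)
```

In practice each centroid would be the mean of the embeddings of a few labeled examples per category.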
Embeddings API
Convert text to numerical vectors:
Basic Example
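A minimal sketch of preparing a request body. It assumes an OpenAI-style JSON shape with `model` and `input` fields; that shape, and the helper name `build_embedding_request`, are assumptions and are not confirmed by this page. Only the model identifier is taken from the list under Best Practices. The sketch builds the payload without sending it, so no endpoint URL or authentication scheme is implied.

```python
import json
from typing import Any, Dict, List, Union


def build_embedding_request(model: str, texts: Union[str, List[str]]) -> Dict[str, Any]:
    """Build a JSON-serialisable request body (assumed shape, not confirmed)."""
    if isinstance(texts, str):
        texts = [texts]  # accept a single string for convenience
    if not texts:
        raise ValueError("at least one input text is required")
    return {"model": model, "input": texts}


body = build_embedding_request("amazon/titan-embed-text-v2", "Hello, world")
payload = json.dumps(body)  # string to send as the HTTP request body
```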
Response Format
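A sketch of reading vectors out of a response, assuming an OpenAI-style body in which `data` is a list of objects carrying an `index` and an `embedding`. These field names are an assumption, and the sample payload is hand-written for illustration rather than real model output; check an actual response before relying on them.

```python
from typing import Any, Dict, List


def extract_embeddings(response: Dict[str, Any]) -> List[List[float]]:
    """Return vectors ordered by their `index` field (assumed response shape)."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [list(item["embedding"]) for item in items]


# Hand-written illustrative payload; the values are not real model output.
sample_response = {
    "model": "amazon/titan-embed-text-v2",
    "data": [
        {"index": 1, "embedding": [0.0, 1.0]},
        {"index": 0, "embedding": [1.0, 0.0]},
    ],
}
vectors = extract_embeddings(sample_response)
```

Sorting by `index` keeps each vector aligned with the input it was computed from, even if the items arrive out of order.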
Batch Processing
Process multiple texts efficiently:
Python
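A minimal sketch of client-side batching: split the inputs into fixed-size groups and make one call per group. The `embed_batch` callable is a hypothetical stand-in for whatever function sends one request, and the default batch size of 32 is an arbitrary assumption, not a documented limit.

```python
from typing import Callable, Iterator, List


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_all(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,
) -> List[List[float]]:
    """Embed every text, one call per batch, preserving input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    """Stand-in for a real API call: one 1-dimensional vector per text."""
    return [[float(len(text))] for text in batch]


result = embed_all(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
```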
Model Comparison
| Model | Provider | Modalities | Strengths | Access |
|---|---|---|---|---|
| Embed V4 | Cohere | Text, Image | Latest multimodal, high quality | Basic |
| Embed English | Cohere | Text | Optimized for English | Basic |
| Embed Multilingual | Cohere | Text | 100+ languages | Basic |
| Titan Embed Text V2 | Amazon | Text | Latest generation, improved quality | Basic |
| Titan Embed Text | Amazon | Text | Reliable, affordable | Basic |
| Titan Embed Image | Amazon | Text, Image | Multimodal search | Basic |
| Titan E1M Medium | Amazon | Text, Image | Multimodal, affordable | Basic |
| Nova 2 Multimodal | Amazon | Text, Image | Advanced multimodal | Basic |
| Marengo Embed 3.0 | Twelve Labs | Text, Video | Video search and understanding | Basic |
| Marengo Embed 2.7 | Twelve Labs | Text, Video | Video embeddings | Basic |
Advanced Use Cases
Semantic Search Engine
Python
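A minimal in-memory sketch of a semantic search engine: store one normalized vector per document, then rank documents by dot product with the normalized query, which is equivalent to cosine similarity. The `VectorIndex` class and the toy 3-dimensional vectors are assumptions for illustration; in real use the vectors would come from one of the models above, and a vector database would usually replace the linear scan.

```python
import math
from typing import List, Sequence, Tuple


def _normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


class VectorIndex:
    """Brute-force index that ranks documents by cosine similarity."""

    def __init__(self) -> None:
        self._docs: List[str] = []
        self._vectors: List[List[float]] = []

    def add(self, doc: str, vector: Sequence[float]) -> None:
        self._docs.append(doc)
        self._vectors.append(_normalize(vector))

    def search(self, query_vector: Sequence[float], top_k: int = 3) -> List[Tuple[str, float]]:
        query = _normalize(query_vector)
        scored = [
            (doc, sum(a * b for a, b in zip(query, vec)))
            for doc, vec in zip(self._docs, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Toy vectors standing in for real embeddings (hypothetical values).
index = VectorIndex()
index.add("How to reset your password", [0.9, 0.1, 0.0])
index.add("Quarterly revenue report", [0.0, 0.2, 0.9])
index.add("Account login troubleshooting", [0.8, 0.3, 0.1])
results = index.search([1.0, 0.2, 0.0], top_k=2)
```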
RAG (Retrieval-Augmented Generation)
Python
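A sketch of the retrieval half of RAG: embed the question, pick the most similar chunks, and place them in a prompt for a generation model. To stay runnable offline, `toy_embed` is a hypothetical word-count stand-in for a real embedding call, and the chunk texts and prompt wording are illustrative assumptions. The generation step itself is out of scope here.

```python
import math
import re
from typing import Callable, List

VOCAB = ["refund", "shipping", "password", "days", "reset"]


def toy_embed(text: str) -> List[float]:
    """Stand-in embedding: count vocabulary words (not a real model)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def retrieve(question: str, chunks: List[str],
             embed: Callable[[str], List[float]], top_k: int = 2) -> List[str]:
    """Return the `top_k` chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, embed(chunk)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble retrieved chunks and the question into a single prompt."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


chunks = [
    "Refunds are issued within 14 days of a return.",
    "Shipping takes 3 to 5 business days.",
    "Use the reset link to change your password.",
]
question = "How do I reset my password?"
top = retrieve(question, chunks, toy_embed, top_k=1)
prompt = build_prompt(question, top)
```

In a real pipeline the chunk embeddings would be computed once and stored, rather than recomputed for every question as this sketch does.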
Best Practices
Choosing the Right Model
- cohere/embed-v4: Best for multimodal (text + image) use cases
- cohere/embed-multilingual: Best for multilingual content
- amazon/titan-embed-text-v2: Good general-purpose text embeddings
- twelvelabs/marengo-embed-3-0: Best for video search and understanding
Optimization Tips
- Batch processing: Send multiple texts in one request
- Caching: Store embeddings to avoid recomputation
- Preprocessing: Clean and normalize text before embedding
- Chunking: Split long documents into smaller segments
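The preprocessing, caching, and chunking tips above can be sketched as follows. Word-based chunking with overlap and a SHA-256 cache key are design choices made for this example, not requirements stated on this page; the chunk sizes are arbitrary, and `embed` is a hypothetical stand-in for a real embedding call.

```python
import hashlib
from typing import Callable, Dict, List


def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows that share `overlap` words with the previous one."""
    if max_words < 1 or not 0 <= overlap < max_words:
        raise ValueError("need max_words >= 1 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


class EmbeddingCache:
    """Memoize embeddings, keyed by a hash of the whitespace-normalized text."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self._embed = embed
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times the underlying embed call ran

    def get(self, text: str) -> List[float]:
        normalized = " ".join(text.split())  # collapse runs of whitespace
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed(normalized)
        return self._store[key]


pieces = chunk_words("a b c d e", max_words=3, overlap=1)
cache = EmbeddingCache(lambda text: [float(len(text))])  # stand-in embed function
first = cache.get("hello   world")
second = cache.get("hello world")
```

Because the key is derived from the normalized text, inputs that differ only in spacing share one cached vector.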
Common Use Cases
Search & Retrieval
Document search, FAQ systems, knowledge bases
Recommendation Systems
Content recommendations, similar item suggestions
Content Moderation
Detect inappropriate content, spam filtering
Data Analysis
Clustering, topic modeling, trend analysis
Getting Started
Quick Start
Generate your first embeddings
SDKs
Use our libraries