Cohere Embed 4 (New)

The cohere.embed-v4.0 model is a multimodal model that can create text embeddings from mixed-modality input, that is, text and images combined in a single payload.

Available in These Regions

  • Brazil East (Sao Paulo) (dedicated AI cluster only)
  • Germany Central (Frankfurt) (dedicated AI cluster only)
  • India South (Hyderabad) (dedicated AI cluster only)
  • Japan Central (Osaka)
  • UAE East (Dubai) (dedicated AI cluster only)
  • UK South (London) (dedicated AI cluster only)
  • US Midwest (Chicago)

Key Features

  • Mode
    • Input text or image, but not both.
    • To get embeddings for an image, only one image is allowed. You can't combine text and image for the same embedding. Image input is available through the API only.
  • Input and Output
    • In the Console, each text input must be less than 512 tokens, with a maximum of 96 inputs per run.
    • In the SDK and API, all inputs together can add up to 128,000 tokens per embedding per run.
    • The model outputs a 1,536-dimensional vector for each embedding.
  • Language Support
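
As a rough illustration of the Console limits listed above, the following sketch splits a list of text inputs into runs of at most 96 and flags any input that isn't below 512 tokens. The helper names are hypothetical, and the whitespace-based token count is only a stand-in assumption; the model's real tokenizer counts tokens differently.

```python
from typing import Callable, List

MAX_TOKENS_PER_INPUT = 512  # Console: each text input must be less than 512 tokens
MAX_INPUTS_PER_RUN = 96     # Console: at most 96 inputs per run


def approx_token_count(text: str) -> int:
    """Hypothetical stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def find_oversized(
    texts: List[str],
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[int]:
    """Return the indexes of inputs whose token count is not below the limit."""
    return [i for i, t in enumerate(texts) if count_tokens(t) >= MAX_TOKENS_PER_INPUT]


def split_into_runs(texts: List[str]) -> List[List[str]]:
    """Split the inputs into consecutive runs of at most 96 inputs each."""
    return [
        texts[i:i + MAX_INPUTS_PER_RUN]
        for i in range(0, len(texts), MAX_INPUTS_PER_RUN)
    ]
```

With 200 inputs, `split_into_runs` would produce three runs of 96, 96, and 8 inputs. Pass a real tokenizer as `count_tokens` if one is available.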

Dedicated AI Cluster for the Model

To reach a model through a dedicated AI cluster in any listed region, you must create an endpoint for that model on a dedicated AI cluster. For the cluster unit size that matches this model, see the following table.

  • Base Model
    • Model Name: Cohere Embed 4
    • OCI Model Name: cohere.embed-v4.0
  • Fine-Tuning Cluster
    • Not available for fine-tuning
  • Hosting Cluster
    • Unit Size: Embed Cohere
    • Required Units: 1
  • Pricing Page Information
    • Pricing Page Product Name: Embed Cohere - Dedicated
    • For Hosting, Multiply the Unit Price: x1
  • Request Cluster Limit Increase
    • Limit Name: dedicated-unit-embed-cohere-count
    • For Hosting, Request Limit Increase by: 1

Tip

  • If you don't have enough cluster limits in your tenancy for hosting an Embed model on a dedicated AI cluster, request an increase of 1 for the dedicated-unit-embed-cohere-count limit.

  • Review the Cohere Embed 4 cluster performance benchmarks for different use cases.

Release and Retirement Dates

  • Model: cohere.embed-v4.0
  • Release Date: 2025-07-03
  • On-Demand Retirement Date: At least 6 months after the release of the first replacement model.
  • Dedicated Mode Retirement Date: At least 6 months after the release of the first replacement model.

Important

For a list of all model timelines and retirement details, see Retiring the Models.
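
The "at least 6 months after the release of the first replacement model" rule can be turned into a lower bound on the retirement date once a replacement release date is known. The sketch below is only an illustration: the function names are hypothetical, and treating "6 months" as six calendar months, with the day clamped to the end of shorter months, is an assumption.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months to a date, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def earliest_retirement(replacement_release: date) -> date:
    """Earliest possible retirement date: 6 months after the replacement's release."""
    return add_months(replacement_release, 6)
```

For a purely hypothetical replacement release on 2026-01-15, `earliest_retirement(date(2026, 1, 15))` gives 2026-07-15. The result is a lower bound, not an announced retirement date.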

Embedding Model Parameter

When using the embedding models, you can get a different output by changing the following parameter.

Truncate

Whether to truncate the start or end tokens of a sentence when that sentence exceeds the maximum number of allowed tokens. For example, a sentence has 516 tokens, but the maximum token size is 512. If you choose to truncate the end, the last 4 tokens of that sentence are cut off.
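
The effect of truncating the start versus the end can be shown with a small sketch that works on a list of tokens. The function name and the "START" and "END" labels are assumptions chosen for this illustration; the service applies truncation internally on its own tokenization.

```python
from typing import List, Sequence


def truncate_tokens(tokens: Sequence[int], max_tokens: int, side: str = "END") -> List[int]:
    """Keep at most max_tokens tokens, dropping the excess from the chosen side."""
    if side not in ("START", "END"):
        raise ValueError("side must be 'START' or 'END'")
    if max_tokens < 0:
        raise ValueError("max_tokens must not be negative")
    if len(tokens) <= max_tokens:
        return list(tokens)  # Nothing to cut off
    if side == "END":
        return list(tokens[:max_tokens])  # Drop tokens from the end
    return list(tokens[len(tokens) - max_tokens:])  # Drop tokens from the start


# The 516-token sentence from the example, with a 512-token maximum.
sentence = list(range(516))
kept_end = truncate_tokens(sentence, 512, "END")      # last 4 tokens cut off
kept_start = truncate_tokens(sentence, 512, "START")  # first 4 tokens cut off
```

Here `kept_end` holds tokens 0 through 511 and `kept_start` holds tokens 4 through 515, so both results contain exactly 512 tokens.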