Cohere Rerank 4
Cohere Rerank 4 is a rerank model available in two variants, Pro and Fast.
Reranking improves search relevance by reordering an initial set of retrieved results. After a retrieval step returns candidate documents, the reranking model compares the query with each candidate and ranks the results from most relevant to least relevant.
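The retrieve-then-rerank flow described above can be sketched in a few lines. This is a minimal illustration, not the model's method: `toy_relevance` is a hypothetical word-overlap scorer standing in for the reranking model, and the sample documents are invented for the example.

```python
import re
from typing import Callable, List, Tuple


def toy_relevance(query: str, document: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words present in the document."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    doc_terms = set(re.findall(r"\w+", document.lower()))
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float] = toy_relevance,
) -> List[Tuple[int, float]]:
    """Score every candidate against the query, then sort from most to least relevant."""
    scored = [(index, score(query, doc)) for index, doc in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Candidates as they might come back from an initial retrieval step.
candidates = [
    "Quarterly revenue grew in the finance segment.",
    "How to reset a forgotten account password.",
    "Password policy and account lockout settings.",
]
ranking = rerank("reset account password", candidates)
for index, relevance in ranking:
    print(f"{relevance:.2f}  {candidates[index]}")
```

In a real workflow, the scoring step would be a call to the reranking model rather than a local function; only the reorder-by-score structure carries over.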
Cohere Rerank 4 supports multilingual reranking and semi-structured content, including JSON, tables, and code-like content.
What’s New in Rerank 4
Compared with Cohere Rerank 3.5, Rerank 4 adds a larger context window, improved reranking quality, self-learning support, and two variants optimized for different workload requirements.
- Increased context window
Rerank 4 supports a 32,000-token context window. The larger context window improves handling for long documents and larger candidate inputs, which is useful for dense enterprise content such as reports, contracts, manuals, and technical documentation.
- Improved reranking quality
Rerank 4 improves result ordering for enterprise retrieval workloads. It provides stronger relevance ranking for business, finance, technical, and other domain-specific content, which can improve downstream retrieval-augmented generation workflows by surfacing more relevant context.
- Self-learning support
Rerank 4 introduces self-learning support, which helps adapt reranking behavior to domain-specific data, terminology, and relevance preferences without requiring annotated training data.
- Pro and Fast variants
Rerank 4 is available in two variants:
- Pro is optimized for higher-precision reranking and more complex retrieval tasks.
- Fast is optimized for lower-latency, higher-throughput workloads.
- Multilingual and semi-structured data support
Rerank 4 supports reranking for English and non-English content across more than 100 languages. It also supports semi-structured content, including JSON, tables, and code-like content.
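As a rough sketch of preparing semi-structured content for reranking, the snippet below turns JSON-like records into strings. It assumes candidates are submitted as text, which this page does not state, so check the RerankText API documentation for the accepted document format. The helper name `to_rerank_text` and the sample records are hypothetical.

```python
import json
from typing import Any, Dict, List


def to_rerank_text(record: Dict[str, Any]) -> str:
    """Serialize a record with sorted keys so the same record always yields the same string.

    ensure_ascii=False keeps non-English text readable instead of escaping it.
    """
    return json.dumps(record, sort_keys=True, ensure_ascii=False)


# Invented sample records: one German, one English.
records: List[Dict[str, Any]] = [
    {"title": "Rückgaberichtlinie", "body": "Artikel können innerhalb von 30 Tagen zurückgegeben werden."},
    {"title": "Shipping policy", "body": "Orders ship within 2 business days."},
]
documents = [to_rerank_text(record) for record in records]
print(documents[0])
```

Sorting keys is a design choice for reproducibility: two records with the same fields produce identical strings regardless of insertion order.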
Regions for this Model
For supported regions, endpoint types (on-demand or dedicated AI clusters), and hosting (OCI Generative AI or external calls) for this model, see the Models by Region page. For details about the regions, see the Generative AI Regions page.
Model Variants
Cohere Rerank 4 includes the following model variants:
| Model | OCI Model Name | Description |
|---|---|---|
| Cohere Rerank 4 Pro | cohere.rerank-v4.0-pro | Multilingual reranking model for English and non-English text and semi-structured JSON data. Best suited for quality-focused and complex reranking workloads. |
| Cohere Rerank 4 Fast | cohere.rerank-v4.0-fast | Lightweight multilingual reranking model for English and non-English text and semi-structured JSON data. Best suited for lower-latency and higher-throughput workloads. |
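One way to encode the choice between the two variants in application code is a small lookup, sketched below. The OCI model names come from the table above; the `pick_variant` helper and its single `latency_sensitive` flag are hypothetical simplifications of what is usually a broader quality, latency, and cost decision.

```python
# OCI model names as listed in the Model Variants table.
MODEL_NAMES = {
    "pro": "cohere.rerank-v4.0-pro",
    "fast": "cohere.rerank-v4.0-fast",
}


def pick_variant(latency_sensitive: bool) -> str:
    """Return the Fast variant for latency-sensitive workloads, otherwise Pro."""
    return MODEL_NAMES["fast" if latency_sensitive else "pro"]


print(pick_variant(latency_sensitive=True))
print(pick_variant(latency_sensitive=False))
```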
On-Demand Mode
Some Cohere Rerank 4 variants are available on-demand in supported regions. On-demand mode doesn't require a dedicated AI cluster.
See Models by Region to check which model variants are available on-demand and in which regions.
| Model Name | OCI Model Name | Pricing Page Product Name |
|---|---|---|
| Cohere Rerank 4 Pro | cohere.rerank-v4.0-pro | Rerank 4 Pro |
| Cohere Rerank 4 Fast | cohere.rerank-v4.0-fast | Rerank 4 Fast |
Pricing is based on 1,000 search units. See the Pricing Page.
Learn about On-Demand Mode.
Dedicated AI Cluster for the Model
Some Cohere Rerank 4 variants are available through dedicated AI clusters in supported regions. These models aren't available for fine-tuning.
For dedicated mode, create an endpoint on a hosting dedicated AI cluster.
| Model | Hardware Unit Size | Available Regions | Request Cluster Limit Increase |
|---|---|---|---|
| Cohere Rerank 4 Pro (cohere.rerank-v4.0-pro) | COHERE_A100_80G_X1 | | |
| Cohere Rerank 4 Pro (cohere.rerank-v4.0-pro) | COHERE_H100_X1 | | |
| Cohere Rerank 4 Fast (cohere.rerank-v4.0-fast) | COHERE_A100_80G_X1 | | |
| Cohere Rerank 4 Fast (cohere.rerank-v4.0-fast) | COHERE_H100_X1 | | |
For pricing, see the Cost estimator and the Pricing Page.
If the tenancy doesn't have enough limits to host these models on a dedicated AI cluster, request a limit increase for the hardware shape used in the target region. For example, to host the models in US West (Phoenix), request an increase of 1 for dedicated-unit-a100-80g-count.
Access this Model
To use a Cohere Rerank 4 model, call the RerankText API from a supported region.
- Endpoint
https://inference.generativeai.{region}.oci.oraclecloud.com
- API operation
POST /20231130/actions/rerankText
In RerankTextDetails, for servingMode, set the servingType attribute based on how you want to access the model:
- Use ON_DEMAND for an on-demand model in a supported region.
- Use DEDICATED for a model hosted on a dedicated AI cluster endpoint.
For availability and setup details, see the preceding On-Demand Mode and Dedicated AI Cluster for the Model sections.
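The endpoint, API operation, and servingMode settings above can be combined into a request as sketched below. This only builds the URL and JSON body and sends nothing, because OCI REST calls must also be signed. Apart from servingMode and servingType, the field names (`compartmentId`, `input`, `documents`, `topN`, `modelId`, `endpointId`) are assumptions to confirm against the RerankText API documentation. The helper names, the compartment OCID, and the region are placeholders.

```python
import json
from typing import Any, Dict, List, Optional

API_PATH = "/20231130/actions/rerankText"


def endpoint_url(region: str) -> str:
    """Fill the {region} placeholder in the inference endpoint and append the operation path."""
    return f"https://inference.generativeai.{region}.oci.oraclecloud.com{API_PATH}"


def serving_mode(serving_type: str, target_id: str) -> Dict[str, str]:
    """Build the servingMode object for an on-demand model or a dedicated endpoint.

    The modelId and endpointId field names are assumptions, not taken from this page.
    """
    if serving_type == "ON_DEMAND":
        return {"servingType": "ON_DEMAND", "modelId": target_id}
    if serving_type == "DEDICATED":
        return {"servingType": "DEDICATED", "endpointId": target_id}
    raise ValueError(f"Unsupported servingType: {serving_type!r}")


def build_rerank_request(
    compartment_id: str,
    query: str,
    documents: List[str],
    mode: Dict[str, str],
    top_n: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a RerankTextDetails-style body (field names assumed, see the text above)."""
    body: Dict[str, Any] = {
        "compartmentId": compartment_id,
        "input": query,
        "documents": documents,
        "servingMode": mode,
    }
    if top_n is not None:
        body["topN"] = top_n
    return body


url = endpoint_url("us-phoenix-1")  # placeholder region; check Models by Region
payload = build_rerank_request(
    compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
    query="reset account password",
    documents=["How to reset a forgotten account password.", "Shipping policy."],
    mode=serving_mode("ON_DEMAND", "cohere.rerank-v4.0-fast"),
    top_n=1,
)
print("POST", url)
print(json.dumps(payload, indent=2))
```

For a model hosted on a dedicated AI cluster, the same builder would be called with `serving_mode("DEDICATED", ...)` and the OCID of the endpoint created on the cluster.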
OCI Release and Retirement Dates
For release and retirement dates and replacement model options, see the following pages based on the mode (on-demand or dedicated):
Rerank Model Parameters
For the Rerank model parameters, see the RerankText API documentation.