Command-R

Command-R (⌘-R) is a RAG-optimised model from Cohere, purpose-built for business and aimed at large-scale production use. It has a 128k-token context window, is multilingual (so it can draw answers from sources in different languages), can work with tools (e.g., a code interpreter and other user-defined tools) [@gomezCommandRRAGProduction2024], and works with Cohere’s Embed and Rerank models.

⌘-R is available on NVIDIA NIM, Cohere’s hosted API, and AWS Bedrock. It is expected to be available on other major cloud platforms soon. Cohere states that its models can be securely fine-tuned on a company’s proprietary data to support specialised business needs [@cohereCohereCommandREnterprise2024].

Cohere API Pricing    $ / M input tokens    $ / M output tokens
Command-R             $0.50                 $1.50
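
To make the per-token pricing concrete, the following is a minimal sketch of how the cost of a single request could be estimated from the prices listed above. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any Cohere tooling.

```python
# Prices listed above for Command-R, in USD per million tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed example: a 100k-token prompt (e.g. many retrieved documents)
# and a 1k-token answer.
cost = estimate_cost(100_000, 1_000)
print(f"Estimated cost: ${cost:.4f}")
```

Because input tokens dominate in RAG workloads, the number of documents placed in the prompt is usually the main cost driver.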

Rerank

Due to the size limitations of the LLM context window and performance considerations, it is not always feasible to feed all the retrieved documents to the model in RAG applications. Arbitrarily limiting the number of documents passed to the model may result in missing critical information and, ultimately, suboptimal output.

To address this, a reranker model can be used to score and rank the candidate documents against the query, so that only the most relevant ones are passed on to the model. This can improve both performance and search accuracy.

Cohere provides two Rerank models: rerank-english-v2.0 for English-language documents and rerank-multilingual-v2.0 for documents in other languages.
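
The reranking step described above can be sketched as follows. This is a toy illustration only: `toy_relevance` is a hypothetical stand-in that scores documents by word overlap with the query, whereas a real reranker such as Cohere's Rerank models uses a trained model to produce the scores. The function names and sample documents are assumptions made for the example.

```python
from typing import Callable, List, Tuple


def toy_relevance(query: str, document: str) -> float:
    """Stand-in scorer (assumption): fraction of query terms found in the document.

    A real reranker would call a trained relevance model here instead.
    """
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(
    query: str,
    documents: List[str],
    top_n: int,
    score_fn: Callable[[str, str], float] = toy_relevance,
) -> List[Tuple[str, float]]:
    """Score every candidate, sort by descending score, keep the top_n."""
    scored = [(doc, score_fn(query, doc)) for doc in documents]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical candidate documents returned by a first-stage search.
candidates = [
    "Quarterly revenue grew in the European market",
    "The office cafeteria menu changes weekly",
    "European revenue forecast for the next quarter",
]

top = rerank("european revenue forecast", candidates, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Only the documents in `top` would then be placed in the prompt, which keeps the context short without cutting candidates arbitrarily.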

Embed

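Cohere provides a suite of embedding models for documents in English and many other languages. These models turn text into embeddings (numerical vectors that capture meaning), which can then be used for tasks such as semantic search and text classification.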
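
As a rough illustration of how embeddings are used once they exist, the sketch below compares a query vector with document vectors by cosine similarity. The three-dimensional vectors and document names are made up for the example; real embeddings come from an embedding model and have far more dimensions.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for embeddings (assumption, not model output).
query_vec = [0.9, 0.1, 0.0]
doc_vecs: Dict[str, List[float]] = {
    "invoice policy": [0.8, 0.2, 0.1],
    "holiday schedule": [0.0, 0.1, 0.9],
}

# Pick the document whose vector points in the most similar direction.
best = max(doc_vecs, key=lambda name: cosine_similarity(query_vec, doc_vecs[name]))
print(best)
```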

References