Text embeddings are numerical representations of text that enable measuring semantic similarity. This guide introduces embeddings, their applications, and how to use embedding models for tasks like search, recommendations, and anomaly detection.
| Model | Context Length (tokens) | Embedding Dimension | Description |
|---|---|---|---|
| voyage-3-large | 32,000 | 1024 (default), 256, 512, 2048 | The best general-purpose and multilingual retrieval quality. See blog post for details. |
| voyage-3.5 | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for general-purpose and multilingual retrieval quality. See blog post for details. |
| voyage-3.5-lite | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for latency and cost. See blog post for details. |
| voyage-code-3 | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for code retrieval. See blog post for details. |
| voyage-finance-2 | 32,000 | 1024 | Optimized for finance retrieval and RAG. See blog post for details. |
| voyage-law-2 | 16,000 | 1024 | Optimized for legal and long-context retrieval and RAG. Also improved performance across all domains. See blog post for details. |
| Model | Context Length (tokens) | Embedding Dimension | Description |
|---|---|---|---|
| voyage-multimodal-3 | 32,000 | 1024 | Rich multimodal embedding model that can vectorize interleaved text and content-rich images, such as screenshots of PDFs, slides, tables, figures, and more. See blog post for details. |
To obtain Voyage embeddings, you can use the `voyageai` Python package or HTTP requests, as described below. The `voyageai` package can be installed using the following command:
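For example, with pip:

```shell
pip install -U voyageai
```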
After installing, you can obtain embeddings by creating a client object and calling its `embed()` function. `result.embeddings` will be a list of two embedding vectors, each containing 1024 floating-point numbers; after running the code, the two embeddings are printed on the screen.
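A minimal sketch with the `voyageai` Python client (assumes the package is installed and a `VOYAGE_API_KEY` environment variable is set; the sample texts are ours):

```python
def embed_documents(texts, model="voyage-3.5"):
    # Deferred import so this sketch can be read without the package installed.
    import voyageai

    vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment
    result = vo.embed(texts, model=model, input_type="document")
    return result.embeddings  # one 1024-dimensional vector per input text

# Example (requires a valid API key):
# embeddings = embed_documents(["Sample text 1", "Sample text 2"])
# print(embeddings)
```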
For more information on the Voyage Python package, see the Voyage documentation.
Alternatively, you can obtain embeddings via HTTP requests, for example by running the following `curl` command in a terminal:
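A sketch of such a request against the Voyage embeddings endpoint (assumes `VOYAGE_API_KEY` is set; the sample texts are ours):

```shell
# Requires VOYAGE_API_KEY; prints a JSON body containing the embeddings.
curl https://api.voyageai.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VOYAGE_API_KEY" \
  -d '{"input": ["Sample text 1", "Sample text 2"], "model": "voyage-3.5"}'
```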
Note that we use `input_type="document"` and `input_type="query"` for embedding the document and query, respectively. More specification can be found here.
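The nearest-neighbor step can be sketched as follows (the helper `nearest` is ours; because Voyage embeddings are normalized to length 1, the dot product equals cosine similarity):

```python
import numpy as np

def nearest(query_embedding, doc_embeddings):
    # Voyage embeddings have unit length, so the dot product equals cosine
    # similarity; the document with the largest dot product is the best match.
    similarities = np.dot(np.asarray(doc_embeddings), np.asarray(query_embedding))
    return int(np.argmax(similarities))

# The embeddings themselves would come from the voyageai client, e.g.:
#   doc_embs = vo.embed(docs, model="voyage-3.5", input_type="document").embeddings
#   query_emb = vo.embed([query], model="voyage-3.5", input_type="query").embeddings[0]
```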
The output would be the fifth document, which is indeed the most relevant to the query.
Why do Voyage embeddings have superior quality?
What embedding models are available and which should I use?
- `voyage-3-large`: Best quality
- `voyage-3.5-lite`: Lowest latency and cost
- `voyage-3.5`: Balanced performance with superior retrieval quality at a competitive price point

For retrieval, use the `input_type` parameter to specify whether the text is a query or document type.

Domain-specific models:

- `voyage-law-2`: Optimized for legal retrieval
- `voyage-code-3`: Optimized for code retrieval
- `voyage-finance-2`: Optimized for finance retrieval
Which similarity function should I use?
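Per Voyage's documentation, embeddings are normalized to length 1, so cosine similarity and dot product produce identical values, and Euclidean distance yields the same ranking. A quick check with toy unit-length vectors:

```python
import numpy as np

# Toy unit-length vectors standing in for Voyage embeddings.
a = np.array([0.6, 0.8])
b = np.array([0.8, 0.6])

cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot = np.dot(a, b)
assert np.isclose(cosine, dot)  # identical because both vectors have norm 1
```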
What is the relationship between characters, words, and tokens?
When and how should I use the input_type parameter?
For retrieval and search use cases, we recommend that the `input_type` parameter be used to specify whether the input text is a query or document. Do not omit `input_type` or set `input_type=None`. Specifying whether input text is a query or document can create better dense vector representations for retrieval, which can lead to better retrieval quality.

When the `input_type` parameter is used, special prompts are prepended to the input text prior to embedding. Specifically:

📘 Prompts associated with `input_type`
- For a query, the prompt is “Represent the query for retrieving supporting documents: “.
- For a document, the prompt is “Represent the document for retrieval: “.
Example:

- When `input_type="query"`, a query like “When is Apple’s conference call scheduled?” will become “Represent the query for retrieving supporting documents: When is Apple’s conference call scheduled?”
- When `input_type="document"`, a document like “Apple’s conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET.” will become “Represent the document for retrieval: Apple’s conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET.”
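Conceptually, the prepending works like this (the prompt strings come from the note above; the helper `with_prompt` is illustrative only, since the API performs this step server-side):

```python
# The prompts the Voyage API prepends for each input_type.
PROMPTS = {
    "query": "Represent the query for retrieving supporting documents: ",
    "document": "Represent the document for retrieval: ",
}

def with_prompt(text, input_type):
    # Mirror the server-side prompt prepending for a given input_type.
    return PROMPTS[input_type] + text
```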
`voyage-large-2-instruct`, as the name suggests, is trained to be responsive to additional instructions that are prepended to the input text. For classification, clustering, or other MTEB subtasks, please use the instructions here.

What quantization options are available?
Quantization can be specified with the `output_dtype` parameter:

- `float`: Each returned embedding is a list of 32-bit (4-byte) single-precision floating-point numbers. This is the default and provides the highest precision / retrieval accuracy.
- `int8` and `uint8`: Each returned embedding is a list of 8-bit (1-byte) integers ranging from -128 to 127 and 0 to 255, respectively.
- `binary` and `ubinary`: Each returned embedding is a list of 8-bit integers that represent bit-packed, quantized single-bit embedding values: `int8` for `binary` and `uint8` for `ubinary`. The length of the returned list of integers is 1/8 of the actual dimension of the embedding. The binary type uses the offset binary method, which you can learn more about in the FAQ below.

Binary quantization example

Consider the following eight embedding values: -0.03955078, 0.006214142, -0.07446289, -0.039001465, 0.0046463013, 0.00030612946, -0.08496094, and 0.03994751. With binary quantization, values less than or equal to zero will be quantized to a binary zero, and positive values to a binary one, resulting in the following binary sequence: 0, 1, 0, 0, 1, 1, 0, 1. These eight bits are then packed into a single 8-bit integer, 01001101 (with the leftmost bit as the most significant bit).

- `ubinary`: The binary sequence is directly converted and represented as the unsigned integer (`uint8`) 77.
- `binary`: The binary sequence is represented as the signed integer (`int8`) -51, calculated using the offset binary method (77 - 128 = -51).
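The worked example can be reproduced with NumPy (a sketch for intuition; the API returns the packed integers directly):

```python
import numpy as np

values = np.array([-0.03955078, 0.006214142, -0.07446289, -0.039001465,
                   0.0046463013, 0.00030612946, -0.08496094, 0.03994751])

# Values <= 0 quantize to binary 0, positive values to binary 1.
bits = (values > 0).astype(np.uint8)   # [0, 1, 0, 0, 1, 1, 0, 1]
packed = np.packbits(bits)             # leftmost bit is most significant

ubinary_value = int(packed[0])         # the uint8 representation
binary_value = ubinary_value - 128     # the int8 value via offset binary
```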
How can I truncate Matryoshka embeddings?
Matryoshka learning creates embeddings with coarse-to-fine representations within a single vector. Voyage models that support multiple output dimensions, such as `voyage-code-3`, generate such Matryoshka embeddings. You can truncate these vectors by keeping the leading subset of dimensions. For example, the following Python code demonstrates how to truncate 1024-dimensional vectors to 256 dimensions:
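A sketch of such truncation (the helper name `truncate_embeddings` is ours; re-normalizing after truncation keeps the vectors unit-length so similarity computations remain valid):

```python
import numpy as np

def truncate_embeddings(embeddings, dim=256):
    """Keep the leading `dim` dimensions and re-normalize each row to unit length."""
    truncated = np.asarray(embeddings, dtype=float)[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

# e.g. truncate 1024-dimensional vectors to 256 dimensions:
# short = truncate_embeddings(full_embeddings, dim=256)
```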