November 24th, 2024

32k context length text embedding models

Voyage AI has released two embedding models, voyage-3 and voyage-3-lite, which improve retrieval quality, cut costs, and support a 32K-token context length, outperforming OpenAI's models across a range of domains.

Voyage AI has announced two new embedding models, voyage-3 and voyage-3-lite, which aim to improve retrieval quality while reducing cost and latency. The voyage-3 model outperforms OpenAI's v3 large model (text-embedding-3-large) by an average of 7.55% across domains including technology, law, and finance, while being 2.2 times cheaper and producing lower-dimensional embeddings. The voyage-3-lite model also improves retrieval accuracy, outperforming OpenAI v3 large by 3.82% at a significantly lower cost. Both models support a context length of 32K tokens, four times that of OpenAI's models.

The models were developed with research techniques including an improved architecture and extensive pre-training, and were evaluated across 40 domain-specific datasets, showing competitive performance, particularly in multilingual settings. Users of general-purpose embedding models are encouraged to upgrade to voyage-3 for better performance, or to choose voyage-3-lite for cost savings. Voyage AI continues to offer domain-specific models for specialized needs, while the new models serve as a versatile option for broader applications.

- Voyage-3 outperforms OpenAI v3 large by 7.55% while being 2.2x cheaper.

- Voyage-3-lite offers 3.82% better accuracy than OpenAI v3 large at 6x lower cost.

- Both models support a 32K-token context length, significantly higher than competitors.

- The models are designed for general-purpose use but also complement domain-specific models.

- Users can access the first 200M tokens for free to test the new models.
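
For anyone who wants to try them, here is a minimal sketch using Voyage's Python SDK; the model name comes from the announcement, but treat the parameters as assumptions and check the current docs:

    import voyageai

    vo = voyageai.Client()  # reads the VOYAGE_API_KEY environment variable

    docs = [
        "voyage-3 supports a 32K-token context length.",
        "Embeddings map text to dense vectors for retrieval.",
    ]

    # input_type tells the model whether it is embedding documents or queries
    result = vo.embed(docs, model="voyage-3", input_type="document")
    query = vo.embed(["what is voyage-3's context length?"], model="voyage-3", input_type="query")

    print(len(result.embeddings[0]))  # embedding dimensionality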

9 comments
By @albert_e - 3 months
Related question:

One year ago simonw said this in a post about embeddings:

[https://news.ycombinator.com/item?id=37985489]

> Lots of startups are launching new “vector databases”—which are effectively databases that are custom built to answer nearest-neighbour queries against vectors as quickly as possible.

> I’m not convinced you need an entirely new database for this: I’m more excited about adding custom indexes to existing databases. For example, SQLite has sqlite-vss and PostgreSQL has pgvector.

Do we still feel specialized vector databases are overkill?

We have AWS promoting Amazon OpenSearch as the default vector database for a RAG knowledge base, and that service is not cheap.
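
For comparison, the pgvector route simonw mentions looks roughly like this from Python; a hedged sketch, where the connection string and table schema are hypothetical and the 1024 dimension is arbitrary:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # hypothetical connection string
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(id serial PRIMARY KEY, body text, embedding vector(1024))"
    )

    # pgvector accepts '[x,y,...]' text literals; in practice the vector
    # comes from whatever embedding model you use
    emb = "[" + ",".join(["0.1"] * 1024) + "]"  # placeholder embedding
    cur.execute("INSERT INTO items (body, embedding) VALUES (%s, %s)", ("hello", emb))

    # nearest-neighbour search: <=> is cosine distance, <-> is L2 distance
    cur.execute("SELECT body FROM items ORDER BY embedding <=> %s LIMIT 5", (emb,))
    print(cur.fetchall())
    conn.commit()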

Also, I would like to understand a bit more about how to pre-process and chunk the data in a way that optimizes the vector embeddings, storage, and retrieval ... any good guides on this I can refer to? Thanks!
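
Not a guide, but the baseline most of them start from is fixed-size chunks with token overlap; a rough sketch using tiktoken purely as a tokenizer (the sizes and the doc.txt filename are arbitrary placeholders):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    def chunk(text, size=512, overlap=64):
        # fixed-size chunks with overlap, measured in tokens rather than characters
        tokens = enc.encode(text)
        step = size - overlap
        return [enc.decode(tokens[i:i + size]) for i in range(0, len(tokens), step)]

    chunks = chunk(open("doc.txt").read())  # doc.txt is a placeholder file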

By @throwup238 - 3 months
What’s the benefit of generating embeddings for such large chunks? Do people use these large contexts to include lots of document-specific headers/footers, or are they actually generating embeddings of single large documents?

I don’t understand how the math works out on those vectors.

By @dtjohnnyb - 3 months
I've found good results from summarizing my documents using a large-context model, then embedding those summaries using a standard embedding model (e.g. e5).

This way I can tune which aspects of the doc I want retrieval to focus on, it's easier to tell when there are data quality issues that need fixing, and the summaries have turned out to be useful for other use cases in the company.
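
A rough sketch of that pipeline; summarize_with_llm is a placeholder for whatever large-context model you call, and e5 is loaded via sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("intfloat/e5-base-v2")

    def summarize_with_llm(doc: str) -> str:
        # placeholder: call a large-context model here, prompted toward
        # the aspects you want retrieval to focus on
        return doc[:1000]  # stand-in so the sketch runs

    docs = ["... full document text ..."]
    summaries = [summarize_with_llm(d) for d in docs]

    # e5 models expect "passage: " / "query: " prefixes at embedding time
    doc_vecs = model.encode(["passage: " + s for s in summaries], normalize_embeddings=True)
    query_vec = model.encode("query: data quality issues", normalize_embeddings=True)

    scores = doc_vecs @ query_vec  # cosine similarity on normalized vectors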

By @Oras - 3 months
Not related, but why don’t they have a pricing page? Last time I checked voyageai I had to Google their pricing to find the page, as it’s not in the nav menu.

By @johnfn - 3 months
What on earth is "OpenAI V3"? Just to be sure I wasn't being obtuse, I Googled it, only to get a bunch of articles pointing back at this post.

By @antirez - 3 months
I wonder if random projections or other dimensionality-reduction techniques work as well as a model specialized in smaller embeddings that captures the same amount of semantic information. That way we could use the larger embeddings of open models, which work very well, and still enjoy faster node-to-node similarity during searches.
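
For intuition, a Johnson-Lindenstrauss-style random projection is a few lines of numpy; the dimensions and the random stand-in data below are arbitrary, and whether projected vectors keep as much semantic information as a natively small embedding model is exactly the open question:

    import numpy as np

    rng = np.random.default_rng(0)
    d, k = 1024, 256  # original and reduced dimensions (arbitrary)

    # a Gaussian random matrix approximately preserves pairwise geometry
    P = rng.standard_normal((d, k)) / np.sqrt(k)

    X = rng.standard_normal((100, d))  # stand-in for real embeddings
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    Xp = X @ P

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    print(cos(X[0], X[1]), cos(Xp[0], Xp[1]))  # similarity before/after projection
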
By @OutOfHere - 3 months
I would like to see an independent benchmark.

By @ldjkfkdsjnv - 3 months
I built a RAG system with Voyage and it crushed OpenAI embeddings; the difference in retrieval quality was noticeable.