32K context length text embedding models
Voyage AI released two embedding models, voyage-3 and voyage-3-lite, that improve retrieval quality, cut costs, and support a 32K-token context length, outperforming OpenAI's models across a range of domains.
Voyage AI has announced the release of two new embedding models, voyage-3 and voyage-3-lite, which aim to improve retrieval quality while reducing cost and latency. The voyage-3 model outperforms OpenAI's v3 large model by an average of 7.55% across domains including technology, law, and finance, while being 2.2 times cheaper and using a smaller embedding dimension. The voyage-3-lite model also improves retrieval accuracy, outperforming OpenAI v3 large by 3.82% at a significantly lower cost. Both models support a context length of 32K tokens, four times that of OpenAI's models. Their development involved advanced research techniques, including an improved architecture and extensive pre-training. Users of general-purpose embedding models are encouraged to upgrade to voyage-3 for better performance, or to choose voyage-3-lite for cost savings. The models were evaluated across 40 domain-specific datasets and show competitive performance, particularly in multilingual settings. Voyage AI continues to offer domain-specific models for specialized needs, while the new models provide a versatile option for broader applications.
- Voyage-3 outperforms OpenAI v3 large by 7.55% while being 2.2x cheaper.
- Voyage-3-lite offers 3.82% better accuracy than OpenAI v3 large at 6x lower cost.
- Both models support a 32K-token context length, significantly higher than competitors.
- The models are designed for general-purpose use but also complement domain-specific models.
- Users can access the first 200M tokens for free to test the new models (a minimal usage sketch follows).
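For readers who want to try the models against the free token allowance, here is a minimal sketch using Voyage AI's Python client (the `voyageai` package). The call shape follows Voyage's documented API, but treat the exact signatures, and the sample texts, as assumptions rather than a definitive recipe:

```python
# Minimal sketch: embedding documents and a query with voyage-3.
# Assumes the `voyageai` package is installed and VOYAGE_API_KEY is set
# in the environment; client signatures may change between releases.
import voyageai

vo = voyageai.Client()  # picks up VOYAGE_API_KEY from the environment

docs = [
    "Voyage-3 supports a 32K-token context length.",
    "Its embedding dimension is smaller than OpenAI v3 large's.",
]

# input_type hints let the model treat documents and queries differently,
# which typically improves retrieval quality.
doc_result = vo.embed(docs, model="voyage-3", input_type="document")
query_result = vo.embed(["what is voyage-3's context length?"],
                        model="voyage-3", input_type="query")

print(len(doc_result.embeddings), len(doc_result.embeddings[0]))
```

From here, retrieval is a nearest-neighbour search between `query_result.embeddings[0]` and the document vectors, using whatever index or database you prefer.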
Related
OpenAI slashes the cost of using its AI with a "mini" model
OpenAI launches GPT-4o mini, a cheaper model enhancing AI accessibility. Meta to release Llama 3. Market sees a mix of small and large models for cost-effective AI solutions.
Show HN: Keep Your Next Viral AI App Free for Longer with Local Embeddings
Function LLM is a new tool that allows developers to generate local embeddings, potentially saving up to 60% on OpenAI costs while enhancing privacy and requiring minimal implementation effort.
Llama 3.2: Revolutionizing edge AI and vision with open, customizable models
Meta released Llama 3.2, featuring vision models with 11B and 90B parameters, and lightweight text models with 1B and 3B parameters, optimized for edge devices and supporting extensive deployment options.
DeepSeek v2.5 – open-source LLM comparable to GPT-4o, but 95% less expensive
DeepSeek launched DeepSeek-V2.5, an advanced open-source model with a 128K context length, excelling in math and coding tasks, and offering competitive API pricing for developers.
All-in-one embedding model for interleaved text, images, and screenshots
Voyage AI released voyage-multimodal-3, an embedding model that enhances retrieval accuracy by 19.63%, integrating text and images for improved performance in multimodal tasks, now available with 200 million free tokens.
One year ago simonw said this in a post about embeddings:
[https://news.ycombinator.com/item?id=37985489]
> Lots of startups are launching new “vector databases”—which are effectively databases that are custom built to answer nearest-neighbour queries against vectors as quickly as possible.
> I’m not convinced you need an entirely new database for this: I’m more excited about adding custom indexes to existing databases. For example, SQLite has sqlite-vss and PostgreSQL has pgvector.
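For concreteness, here is a minimal pgvector sketch driven from Python with psycopg2. The DSN, table, and column names are hypothetical; the `[x,y,...]` text literal and the `<->` / `<=>` distance operators are pgvector's documented interface:

```python
# Minimal sketch: nearest-neighbour search with pgvector inside PostgreSQL.
# Assumes a reachable database and the pgvector extension available to install;
# the "rag" DSN and "docs" table are made-up names for illustration.
import psycopg2

conn = psycopg2.connect("dbname=rag")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector")  # needs privileges
cur.execute("""CREATE TABLE IF NOT EXISTS docs (
                   id serial PRIMARY KEY,
                   body text,
                   embedding vector(1024))""")

# pgvector accepts a '[x,y,...]' text literal cast to the vector type.
vec = [0.01] * 1024  # stand-in for a real embedding
literal = "[" + ",".join(map(str, vec)) + "]"
cur.execute("INSERT INTO docs (body, embedding) VALUES (%s, %s::vector)",
            ("example chunk", literal))
conn.commit()

# <-> is L2 distance; use <=> for cosine distance instead.
cur.execute("SELECT id, body FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
            (literal,))
print(cur.fetchall())
```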
Do we still feel specialized vector databases are overkill?
We have AWS promoting Amazon OpenSearch as the default vector database for a RAG knowledge base, and that service is not cheap.
Also, I would like to understand a bit more about how to pre-process and chunk the data properly in a way that optimizes the vector embeddings, storage, and retrieval... any good guides on this I can refer to? Thanks!
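A common starting point (a generic illustration, not anything from the article) is fixed-size chunking with overlap, keeping each chunk comfortably under the model's 32K-token window. Sizes here are in characters and are arbitrary choices; production pipelines usually count tokens with the model's tokenizer and split on semantic boundaries (paragraphs, headings) instead:

```python
# Minimal sketch: fixed-size character chunking with overlap.
# The size/overlap values are arbitrary; tune them for your corpus.
def chunk(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    chunks = []
    step = size - overlap  # advance less than `size` so chunks overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece.strip():
            chunks.append(piece)
    return chunks

chunks = chunk(open("doc.txt").read())
print(f"{len(chunks)} chunks")
```

Each chunk is then embedded separately, and the overlap reduces the chance that a relevant passage is split across a chunk boundary at retrieval time.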
I don't understand how the math works out on those vectors.
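For scale, the raw storage math is usually a back-of-envelope calculation like the one below. It is an illustration only: the 1024 dimensions and float32 width are assumptions, not figures from the article, and real indexes (HNSW, IVF) add overhead on top of the raw vectors:

```python
# Back-of-envelope storage math for a vector index (assumed numbers).
dims = 1024          # assumed embedding dimension
bytes_per_float = 4  # float32
n_docs = 1_000_000

bytes_per_vector = dims * bytes_per_float       # 4096 B = 4 KiB per vector
total_gib = n_docs * bytes_per_vector / 2**30   # ~3.8 GiB raw, pre-index
print(f"{bytes_per_vector} B/vector, {total_gib:.1f} GiB for {n_docs:,} vectors")
```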
This way I can tune what aspects of the doc I want to focus retrieval on, it's easier to determine when there are any data quality issues that need to be fixed, and the summaries have turned out to be useful for other use cases in the company.