August 2nd, 2024

SQLite vector search extension that runs anywhere

sqlite-vec is an SQLite extension for fast vector search, supporting float, int8, and binary vectors. It is compatible with multiple platforms and easy to install across various programming languages.


sqlite-vec is an SQLite extension for fast vector search. It stores and queries float, int8, and binary vectors, and it runs on Linux, macOS, Windows, and in browsers via WASM. The extension is written in pure C with no dependencies, stores vectors in virtual tables, and can pre-filter vectors using subqueries.

Installation of sqlite-vec is straightforward across various programming languages, including Python, Node.js, Ruby, Go, Rust, Datasette, and sqlite-utils. Users can install it using package managers like pip, npm, gem, go get, and cargo.

A sample usage demonstrates how to create a virtual table for vector examples, insert sample embeddings, and perform a query to find the closest vectors based on a specified embedding. The project is supported by Mozilla and other sponsors, which contributes to its ongoing development and maintenance. For further details, users can refer to the official documentation or the GitHub repository.
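To make the create, insert, and query flow concrete, here is a minimal sketch that uses only Python's standard library. It does not use sqlite-vec or its API: the table name `examples`, the helper `pack_f32`, and the SQL function `l2_distance` are all hypothetical names chosen for illustration. The vectors are stored as float32 BLOBs, and the nearest neighbours are found with a brute-force scan, which is the same general strategy the extension is described as using.

```python
import sqlite3
import struct


def pack_f32(vec):
    """Serialize a list of floats into a little-endian float32 BLOB."""
    return struct.pack(f"<{len(vec)}f", *vec)


def l2_distance(a, b):
    """Euclidean distance between two float32 BLOBs of equal length."""
    n = len(a) // 4  # 4 bytes per float32
    va = struct.unpack(f"<{n}f", a)
    vb = struct.unpack(f"<{n}f", b)
    return sum((x - y) ** 2 for x, y in zip(va, vb)) ** 0.5


db = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL (hypothetical name).
db.create_function("l2_distance", 2, l2_distance)
db.execute("CREATE TABLE examples (id INTEGER PRIMARY KEY, embedding BLOB)")

# Made-up sample embeddings, purely for illustration.
samples = {
    1: [0.0, 0.0, 0.0, 0.0],
    2: [1.0, 1.0, 1.0, 1.0],
    3: [0.5, 0.5, 0.5, 0.5],
    4: [-1.0, -1.0, -1.0, -1.0],
}
db.executemany(
    "INSERT INTO examples (id, embedding) VALUES (?, ?)",
    [(i, pack_f32(v)) for i, v in samples.items()],
)

# Brute-force search: compute the distance to every row, keep the 2 closest.
query = pack_f32([0.4, 0.4, 0.4, 0.4])
nearest = db.execute(
    "SELECT id, l2_distance(embedding, ?) AS distance "
    "FROM examples ORDER BY distance LIMIT 2",
    (query,),
).fetchall()

for row_id, distance in nearest:
    print(row_id, round(distance, 3))
```

A per-row Python callback is far slower than a C extension, so this only illustrates the shape of the workflow, not its performance.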

AI: What people are saying
The comments reflect excitement and interest in the sqlite-vec extension for vector search.
  • Users express eagerness to try sqlite-vec for various projects, including recommendation engines and text analysis.
  • There are discussions about installation challenges and compatibility across different Python environments.
  • The author, AlexG, engages with the community, providing insights into the extension's portability and performance.
  • Some users share their experiences with similar tools, indicating a competitive landscape in vector search solutions.
  • Questions arise regarding specific features, such as maximum vector size and potential use cases in other applications.
16 comments
By @alexgarcia-xyz - 4 months
Author here, happy to answer any questions! Been working on this for a while, so I'm very happy to get this v0.1.0 "stable" release out.

sqlite-vec works on macOS, Linux, Windows, Raspberry Pis, in the browser with WASM, and (theoretically) on mobile devices. I focused a lot on making it as portable as possible. It's also pretty fast - benchmarks are hard to do accurately, but I'd be comfortable saying that it's a very, very fast brute-force vector search solution.

One experimental feature I'm working on: you can directly query vectors that are in memory as a contiguous block (i.e. NumPy), without any copying or cloning. You can see the benchmarks for that feature here under "sqlite-vec static", and it's competitive with faiss/usearch/duckdb https://alexgarcia.xyz/blog/2024/sqlite-vec-stable-release/i...

By @simonw - 4 months
Lots more details in Alex's blog post here: https://alexgarcia.xyz/blog/2024/sqlite-vec-stable-release/i...
By @Cieric - 4 months
I feel like I've touched a lot of things where something like this is useful (hobby projects). In my case I've done a recommendation engine, music matching (I specifically use it for matching anime to their data), and perceptual hash matching.
By @cotega - 3 months
I absolutely love this, great work! For those that might find it useful, I created a Python notebook that shows how to extend this to perform Hybrid Search (Vector + BM25 based Full Text search) https://github.com/liamca/sqlite-hybrid-search
By @pjot - 4 months
I’ve done something similar, but using DuckDB as the backend.

https://github.com/patricktrainer/duckdb-embedding-search

By @bodantogat - 4 months
This sounds useful (I do a lot of throw-away text analysis on my laptop)
By @1yefuwang1 - 4 months
Hi, nice work. I wrote a similar vector search extension, https://github.com/1yefuwang1/vectorlite, inspired by sqlite-vss, using C++17 and hnswlib.

I'd like to do a benchmark to compare it with sqlite-vec, but I guess it is not a fair comparison given that sqlite-vec uses brute-force only.

One thing I'd recommend is to include recall rate in your benchmark data.

A brute-force approach is a good starting point, but it doesn't scale to serious production workloads.
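For readers unfamiliar with the recall metric mentioned above, here is a small sketch of how recall@k is commonly computed. The function names and the sample ids are hypothetical and are not taken from either project's benchmarks. An exact brute-force scan has a recall of 1.0 by construction, which is why recall matters mainly when comparing against approximate indexes such as HNSW.

```python
def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(vectors, query, k):
    """Brute-force ground truth: indices of the k vectors nearest to query."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2_squared(vectors[i], query))
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbours present in the approximate result."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Made-up data: four 2-D vectors and one query.
vectors = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [5.0, 5.0]]
truth = exact_top_k(vectors, [0.9, 0.9], k=2)

# Pretend an approximate index returned one correct id and one wrong id.
approx_result = [1, 3]
print(recall_at_k(approx_result, truth))
```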

By @dang - 4 months
Related:

I’m writing a new vector search SQLite Extension - https://news.ycombinator.com/item?id=40243168 - May 2024 (85 comments)

By @deepsquirrelnet - 4 months
I love this. I know how much work addressing the dependencies must be, but you’re really attacking the right problems. Looking forward to trying this out with my project.
By @huevosabio - 4 months
Been using this for video games and it's absolutely awesome. Alex, the author, is also great and very approachable.

I've been looking for something like this for a while.

By @nattaylor - 4 months
I have a use case for this that I'm excited to try. I'm glad AlexG has put so much effort into this. Even the docs are pretty good!

My pyenv python3.12.2's sqlite won't load extensions even after installing with what I think are the correct command line flags. Argh!

My brew installed python3.12's sqlite will load extensions though, so I can proceed.
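For anyone hitting the same problem, a quick way to check whether a given Python build can load SQLite extensions is sketched below. The helper name `can_load_extensions` is hypothetical. The check relies on the fact that `sqlite3.Connection.enable_load_extension` is missing when CPython was compiled without `--enable-loadable-sqlite-extensions`. With pyenv, that flag is usually passed through the `PYTHON_CONFIGURE_OPTS` environment variable at install time, though the exact setup may differ by platform.

```python
import sqlite3


def can_load_extensions():
    """Return True if this Python build's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is absent when CPython was built without
        # --enable-loadable-sqlite-extensions.
        if not hasattr(conn, "enable_load_extension"):
            return False
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)  # switch it back off after probing
        return True
    except (sqlite3.Error, AttributeError):
        return False
    finally:
        conn.close()


print("extension loading supported:", can_load_extensions())
```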

By @mic47 - 4 months
Nice. Been waiting for this release to try it out.
By @pietz - 4 months
Is this also what Turso uses in their "AI feature"?
By @haolez - 4 months
What's the maximum vector size?
By @fsndz - 4 months
I love this. I am currently doing this RAG tutorial where the vector DB is simply PostgreSQL and pgvector. I guess I can try to reproduce that with SQLite and sqlite-vec now! Awesome: https://www.lycee.ai/courses/91b8b189-729a-471a-8ae1-717033c...