August 8th, 2024

I built a vector embedding database in Go for learning purposes

VecDB is a vector embedding database for educational and production use, utilizing a key-value model for vector storage, supporting raw vector and text embedding operations, with customizable server configuration.

VecDB is a vector embedding database designed for finding similar items using a hash-table approach. It serves both educational and production purposes and uses a simple `{key => value}` data model, where the key is a unique identifier and the value is a vector of floats. By default, the server looks for a `config.yml` file in the current directory, but users can specify a custom path. The server listens on `0.0.0.0:3000`, and it currently supports the "bolt" storage driver, based on BoltDB, and the "gemini" embedding driver for text embeddings. The database has two main components: the Raw Vectors Layer, which stores vectors and searches them by cosine similarity, and the optional Embedding Layer, which converts text to vectors and searches for similar vectors from text input. The project provides example requests for writing and searching vectors that show the required JSON format. It can be installed by downloading the binary or by using the Docker image available in the repository.
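To make the Raw Vectors Layer idea concrete, here is a minimal sketch of a `{key => vector}` store with a brute-force cosine-similarity search. It is not VecDB's actual code: the names `Store`, `Match`, `CosineSimilarity`, and `Search` are hypothetical, and the sketch keeps everything in an in-memory map instead of BoltDB.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Store is a hypothetical in-memory {key => vector} map.
// VecDB itself persists data through a storage driver; this is only a sketch.
type Store map[string][]float64

// Match pairs a stored key with its similarity to the query.
type Match struct {
	Key   string
	Score float64
}

// CosineSimilarity returns dot(a, b) / (|a| * |b|).
// It returns 0 when the lengths differ, the vectors are empty,
// or either vector has zero magnitude.
func CosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// Search scores every stored vector against the query and returns
// up to k matches, most similar first. Ties are broken by key so the
// result does not depend on Go's randomized map iteration order.
func (s Store) Search(query []float64, k int) []Match {
	matches := make([]Match, 0, len(s))
	for key, vec := range s {
		matches = append(matches, Match{Key: key, Score: CosineSimilarity(query, vec)})
	}
	sort.Slice(matches, func(i, j int) bool {
		if matches[i].Score != matches[j].Score {
			return matches[i].Score > matches[j].Score
		}
		return matches[i].Key < matches[j].Key
	})
	if k < 0 {
		k = 0
	}
	if k < len(matches) {
		matches = matches[:k]
	}
	return matches
}

func main() {
	store := Store{
		"cat":    {1, 0, 0},
		"kitten": {0.9, 0.1, 0},
		"car":    {0, 1, 0},
	}
	// Print the two entries closest to the query vector.
	for _, m := range store.Search([]float64{1, 0, 0}, 2) {
		fmt.Printf("%s %.3f\n", m.Key, m.Score)
	}
}
```

A full scan like this costs time linear in the number of stored vectors, which is fine for learning and small datasets; larger deployments typically move to an approximate nearest-neighbor index.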

- VecDB is designed for both educational and production use.

- It employs a simple key-value data model for vector storage.

- The server configuration is customizable, with default settings for easy setup.

- It supports both raw vector operations and text embedding functionalities.

- Installation can be done via binary download or Docker image.

1 comment
By @PhilippGille - 5 months
For vector DBs running as a server, I think there's already a lot of choice (Qdrant, Chroma, Milvus, Weaviate, but also PostgreSQL with pgvector, etc.). But as you said, it was a fun learning experience for you, so that's great!

When I needed a vector DB for Go, I was looking for an embedded/in-process library that doesn't require CGO (there are CGO bindings to Faiss, Annoy, sqlite-vss, etc.) and didn't find a suitable one, so I built https://github.com/philippgille/chromem-go/

One piece of feedback for your library: I noticed you have a top-level directory `internals`. Is that a typo of `internal`? The latter has the special property of only being importable from the same module.