Once you have your embeddings, you can store them in a vector database such as Turbopuffer or Pinecone. These databases are designed for efficient similarity search over large collections of vectors.
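To build intuition for what these databases do internally, here is a minimal brute-force sketch of similarity search with NumPy. This is illustrative only: production vector databases use approximate-nearest-neighbor indexes rather than scanning every vector, and the toy 3-dimensional "embeddings" below stand in for real model output.

```python
import numpy as np

def cosine_top_k(query: np.ndarray, corpus: np.ndarray, k: int = 2):
    """Return (indices, scores) of the k corpus rows most similar to query."""
    # Normalize rows so a dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    top = np.argsort(-scores)[:k]  # highest similarity first
    return top, scores[top]

# Toy 3-dimensional "embeddings" standing in for real vectors
corpus = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])
idx, scores = cosine_top_k(np.array([1.0, 0.05, 0.0]), corpus, k=2)
print(idx, scores)
```

A vector database performs the same ranking, but over millions of vectors with sub-linear search time.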
Most vector databases let you upsert (insert or update) vectors with an ID and metadata:
```python
import pinecone

# Initialize Pinecone (example)
pinecone.init(api_key="[PINECONE_API_KEY]", environment="us-west1-gcp")
index = pinecone.Index("my-embeddings-index")

# Prepare (id, vector, metadata) tuples for upsert
vectors = [
    (f"code-{r['index']}", r["embedding"], {"source": inputs[r["index"]]})
    for r in embeddings["results"]
]

# Upsert to Pinecone
index.upsert(vectors)
```
Turbopuffer and other vector databases have similar APIs for inserting vectors.
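Whatever the vendor, upsert semantics are the same: records keyed by ID are inserted if new and overwritten if the ID already exists. The sketch below models that behavior with a plain in-memory dict; it is not any specific client's API, just the common (id, vector, metadata) record shape.

```python
# Illustrative in-memory stand-in for a vector database's upsert semantics.
store: dict[str, dict] = {}

def upsert(records):
    """Insert or update records keyed by id, like a vector DB upsert."""
    for rec_id, vector, metadata in records:
        store[rec_id] = {"vector": vector, "metadata": metadata}

upsert([("code-0", [0.1, 0.2], {"source": "def add(a, b): return a + b"})])
# Upserting the same ID again updates the record instead of duplicating it
upsert([("code-0", [0.3, 0.4], {"source": "def add(a, b): return a + b"})])
print(len(store), store["code-0"]["vector"])
```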
To search, embed the user query using the same API, then query the vector database for the most similar vectors (using cosine similarity or the database’s default metric):
```python
# Embed the user query with the same model used for the stored vectors
query = "How do I add two numbers?"
query_data = {
    "model": "relace-embed-v1",
    "input": [query]
}
query_response = requests.post(url, headers=headers, json=query_data)
query_embedding = query_response.json()["results"][0]["embedding"]

# Query Pinecone for the most similar code
results = index.query(vector=query_embedding, top_k=5, include_metadata=True)
for match in results["matches"]:
    print(f"Score: {match['score']}, Source: {match['metadata']['source']}")
```
This returns the most relevant code snippets or documents from your database, ranked by similarity to the user query. For more details, see the Turbopuffer docs or Pinecone docs.