# /vector_stores/search - Search Vector Store
Search a vector store for relevant chunks based on a query and file attributes filter. This is useful for retrieval-augmented generation (RAG) use cases.
## Overview

| Feature | Supported | Notes |
|---------|-----------|-------|
| Cost Tracking | ✅ | Tracked per search operation |
| Logging | ✅ | Works across all integrations |
| End-user Tracking | ✅ | |
| Supported LLM Providers | OpenAI, Azure OpenAI, Bedrock, Vertex RAG Engine | Full vector stores API support across providers |
## Usage

### LiteLLM Python SDK
#### Non-streaming example

```python title="Search Vector Store - Basic"
import asyncio

import litellm

async def main():
    response = await litellm.vector_stores.asearch(
        vector_store_id="vs_abc123",
        query="What is the capital of France?"
    )
    print(response)

asyncio.run(main())
```
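For reference, a successful search returns a page of scored chunks. The sketch below is illustrative only (field names follow the OpenAI vector store search response; exact fields and values may vary by provider):

```json
{
  "object": "vector_store.search_results.page",
  "search_query": "What is the capital of France?",
  "data": [
    {
      "file_id": "file-abc123",
      "filename": "geography.md",
      "score": 0.92,
      "attributes": {},
      "content": [
        { "type": "text", "text": "Paris is the capital and largest city of France..." }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}
```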
#### Synchronous example

```python title="Search Vector Store - Sync"
import litellm

response = litellm.vector_stores.search(
    vector_store_id="vs_abc123",
    query="What is the capital of France?"
)
print(response)
```
#### With filters and ranking options

```python title="Search Vector Store - Advanced"
import asyncio

import litellm

async def main():
    response = await litellm.vector_stores.asearch(
        vector_store_id="vs_abc123",
        query="What is the capital of France?",
        filters={
            "file_ids": ["file-abc123", "file-def456"]  # restrict search to these files
        },
        max_num_results=5,
        ranking_options={
            "score_threshold": 0.7  # drop chunks scoring below 0.7
        },
        rewrite_query=True  # let the provider rewrite the query for better retrieval
    )
    print(response)

asyncio.run(main())
```
#### Searching with multiple queries

```python title="Search Vector Store - Multiple Queries"
import asyncio

import litellm

async def main():
    response = await litellm.vector_stores.asearch(
        vector_store_id="vs_abc123",
        query=[
            "What is the capital of France?",
            "What is the population of Paris?"
        ],
        max_num_results=10
    )
    print(response)

asyncio.run(main())
```
#### Using the OpenAI provider explicitly

```python title="Search Vector Store - OpenAI Provider"
import asyncio
import os

import litellm

# Set API key
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

async def main():
    response = await litellm.vector_stores.asearch(
        vector_store_id="vs_abc123",
        query="What is the capital of France?",
        custom_llm_provider="openai"
    )
    print(response)

asyncio.run(main())
```
### LiteLLM Proxy Server
- Setup & Usage
- curl
1. Setup config.yaml

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

general_settings:
  # Vector store settings can be added here if needed
```
2. Start proxy

```bash
litellm --config /path/to/config.yaml
```
3. Test it with the OpenAI SDK!

```python title="OpenAI SDK via LiteLLM Proxy"
from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy
client = OpenAI(
    base_url="http://0.0.0.0:4000",
    api_key="sk-1234",  # Your LiteLLM API key
)

# Requires a recent openai SDK, where vector stores live at the top
# level of the client rather than under `client.beta`
search_results = client.vector_stores.search(
    vector_store_id="vs_abc123",
    query="What is the capital of France?",
    max_num_results=5
)
print(search_results)
```
```bash title="Search Vector Store via curl"
curl -L -X POST 'http://0.0.0.0:4000/v1/vector_stores/vs_abc123/search' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
    "query": "What is the capital of France?",
    "filters": {
        "file_ids": ["file-abc123", "file-def456"]
    },
    "max_num_results": 5,
    "ranking_options": {
        "score_threshold": 0.7
    },
    "rewrite_query": true
}'
```
## Setting Up Vector Stores

To use vector store search, configure your vector stores in the `vector_store_registry`. See the Vector Store Configuration Guide for:
- Provider-specific configuration (Bedrock, OpenAI, Azure, Vertex AI, PG Vector)
- Python SDK and Proxy setup examples
- Authentication and credential management
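For orientation, a registry entry in the proxy's config.yaml looks roughly like the sketch below. The store name, ID, and provider are placeholders; see the configuration guide for the authoritative, provider-specific fields:

```yaml
vector_store_registry:
  - vector_store_name: "company-docs"      # placeholder name
    litellm_params:
      vector_store_id: "vs_abc123"         # placeholder ID
      custom_llm_provider: "openai"        # or bedrock, azure, vertex_ai, ...
```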
## Using Vector Stores with Chat Completions

Pass `vector_store_ids` in chat completion requests to automatically retrieve relevant context, as sketched below. See Using Vector Stores with Chat Completions for implementation details.
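A minimal sketch of the pattern, assuming the `vector_store_ids` parameter described in the linked guide (the model and store ID are placeholders):

```python
import litellm

# The attached vector store is searched and relevant chunks are
# injected as context before the model answers
response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    vector_store_ids=["vs_abc123"],  # placeholder store ID
)
print(response.choices[0].message.content)
```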