Vector Store
OpenGateLLM allows you to interact with a vector database to perform RAG (Retrieval-Augmented Generation). The API lets you feed this vector store by importing files, which are automatically processed and inserted into the database.
Set up a vector store
OpenGateLLM currently supports two vector databases: Elasticsearch and Qdrant.
Prerequisites
To enable the vector store, you need:
- A vector database (either Qdrant or Elasticsearch)
- An embedding model
Configuration
- Elasticsearch
- Qdrant
Docker Compose
Add an `elasticsearch` container in the `services` section of your `compose.yml` file:
```yaml
services:
  [...]
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:9.0.2
    restart: always
    ports:
      - "${ELASTICSEARCH_PORT:-9200}:9200"
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - "ELASTIC_USERNAME=${ELASTICSEARCH_USER:-elasticsearch}"
      - "ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD:-changeme}"
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data
    healthcheck:
      test: [ "CMD-SHELL", "bash -c ':> /dev/tcp/127.0.0.1/9200'" ]
      interval: 4s
      timeout: 10s
      retries: 5
```
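The healthcheck above only verifies that something is listening on the container's TCP port. As a quick sanity check from the host (a minimal sketch, assuming the default `9200` port mapping), the same probe looks like this in Python:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds -- the same
    check the compose healthcheck performs with bash's /dev/tcp probe."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Assumed default mapping from the compose file above: host 9200 -> container 9200.
print("elasticsearch reachable:", port_open("127.0.0.1", 9200))
```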
Configuration file
For more information about the configuration file, see Configuration documentation.
- Add Elasticsearch in the `dependencies` section of your `config.yml`. Example:

  ```yaml
  dependencies:
    elasticsearch:
      hosts: http://${ELASTICSEARCH_HOST:-elasticsearch}:${ELASTICSEARCH_PORT:-9200}
      basic_auth:
        - ${ELASTIC_USERNAME:-elasticsearch}
        - ${ELASTIC_PASSWORD:-changeme}
  ```
- Add a model provider for a model with the `text-embeddings-inference` type in the `models` section of your `config.yml`. Example:

  ```yaml
  models:
    [...]
    - name: embeddings-small
      type: text-embeddings-inference
      providers:
        - type: openai
          key: ${OPENAI_API_KEY}
          timeout: 120
          model_name: text-embedding-3-small
  ```

  This model will be used to vectorize the text in the vector store database.
- Specify the vector store model in the `settings` section of your `config.yml`:

  ```yaml
  settings:
    vector_store_model: embeddings-small
  ```
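The `vector_store_model` setting must name one of the models declared in the `models` section. A minimal sketch of that consistency rule, using the parsed form of the example `config.yml` above (the dict layout and the helper function are ours, not part of OpenGateLLM):

```python
# Hypothetical consistency check: the dict mirrors the example config.yml
# fragments above, already parsed from YAML.
config = {
    "models": [
        {"name": "embeddings-small", "type": "text-embeddings-inference"},
    ],
    "settings": {"vector_store_model": "embeddings-small"},
}

def vector_store_model_is_declared(cfg: dict) -> bool:
    """True if settings.vector_store_model names a declared
    text-embeddings-inference model."""
    wanted = cfg["settings"]["vector_store_model"]
    return any(
        m["name"] == wanted and m["type"] == "text-embeddings-inference"
        for m in cfg["models"]
    )

print(vector_store_model_is_declared(config))  # prints True
```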
Docker Compose
Add a `qdrant` container in the `services` section of your `compose.yml` file:
```yaml
services:
  [...]
  qdrant:
    image: qdrant/qdrant:v1.11.5-unprivileged
    restart: always
    environment:
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY:-changeme}
    volumes:
      - qdrant:/qdrant/storage
    ports:
      - ${QDRANT_HTTP_PORT:-6333}:6333
      - ${QDRANT_GRPC_PORT:-6334}:6334
    healthcheck:
      test: [ "CMD-SHELL", "bash -c ':> /dev/tcp/127.0.0.1/6333'" ]
      interval: 4s
      timeout: 10s
      retries: 5
```
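Both `compose.yml` and `config.yml` rely on the `${VAR:-default}` substitution syntax, which falls back to the default when the variable is unset or empty. A sketch of that rule (the helper function is illustrative, not part of OpenGateLLM):

```python
import os

def expand_default(name: str, default: str) -> str:
    """Mimic the shell/compose ${VAR:-default} rule: fall back to the
    default when the environment variable is unset or empty."""
    value = os.environ.get(name, "")
    return value if value else default

os.environ.pop("QDRANT_API_KEY", None)
print(expand_default("QDRANT_API_KEY", "changeme"))  # prints changeme

os.environ["QDRANT_API_KEY"] = "s3cret"
print(expand_default("QDRANT_API_KEY", "changeme"))  # prints s3cret
```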
Configuration file
For more information about the configuration file, see Configuration documentation.
- Add an embedding model in the `models` section of your `config.yml`. Example:

  ```yaml
  models:
    [...]
    - name: embeddings-small
      type: text-embeddings-inference
      providers:
        - type: openai
          key: ${OPENAI_API_KEY}
          timeout: 120
          model_name: text-embedding-3-small
  ```

  This model will be used to vectorize the text in the vector store database.

- Add Qdrant in the `dependencies` section of your `config.yml`. Example:

  ```yaml
  dependencies:
    qdrant:
      url: "http://${QDRANT_HOST:-qdrant}:${QDRANT_HTTP_PORT:-6333}"
      api_key: ${QDRANT_API_KEY:-changeme}
      prefer_grpc: false
      grpc_port: ${QDRANT_GRPC_PORT:-6334}
      timeout: 20
  ```
- Specify the vector store model in the `settings` section of your `config.yml`:

  ```yaml
  settings:
    vector_store_model: embeddings-small
  ```
Access the vector store
When the vector store is enabled, you can use the document management endpoints to perform Retrieval-Augmented Generation (RAG):

- `/v1/collections`
- `/v1/documents`
- `/v1/chunks`

For more information about document management, see the Retrieval-Augmented Generation (RAG) documentation.
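As a hedged sketch of calling these endpoints: only the three paths come from this guide; the base URL, the bearer-token auth scheme, and the helper function below are assumptions for illustration (check the RAG documentation for the actual request shapes).

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed address of your OpenGateLLM API
API_KEY = "changeme"                # assumed; replace with a real key

def build_request(method: str, path: str, payload=None) -> urllib.request.Request:
    """Build an authenticated request against a document-management endpoint
    (hypothetical helper, not part of OpenGateLLM)."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

# The three endpoint families listed above:
for path in ("/v1/collections", "/v1/documents", "/v1/chunks"):
    req = build_request("GET", path)
    print(req.get_method(), req.full_url)
    # with urllib.request.urlopen(req) as resp:   # uncomment against a live API
    #     print(json.load(resp))
```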