Vector Store

OpenGateLLM allows you to interact with a vector database to perform RAG (Retrieval-Augmented Generation). The API lets you feed this vector store by importing files, which are automatically processed and inserted into the database.

Setup a vector store

OpenGateLLM currently supports two vector databases: Qdrant and Elasticsearch.

Prerequisites

To enable the vector store, you need:

  1. A vector database (either Qdrant or Elasticsearch)
  2. An embedding model

Configuration

Docker compose

Add an Elasticsearch container to the services section of your compose.yml file:

services:
  [...]
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:9.0.2
    restart: always
    ports:
      - "${ELASTICSEARCH_PORT:-9200}:9200"
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - "ELASTIC_USERNAME=${ELASTICSEARCH_USER:-elasticsearch}"
      - "ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD:-changeme}"
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data
    healthcheck:
      # CMD-SHELL takes a single command string; here we open a raw TCP
      # connection to port 9200 to check that Elasticsearch is listening.
      test: [ "CMD-SHELL", "bash -c ':> /dev/tcp/127.0.0.1/9200'" ]
      interval: 4s
      timeout: 10s
      retries: 5

volumes:
  elasticsearch:
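The compose snippet above reads several variables from the environment. A minimal `.env` file next to your compose.yml might look like the following (the values shown are only the defaults the compose file already falls back to; adjust them for your deployment):

```
ELASTICSEARCH_PORT=9200
ELASTICSEARCH_USER=elasticsearch
ELASTICSEARCH_PASSWORD=changeme
```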

Configuration file

info

For more information about the configuration file, see Configuration documentation.

  1. Add Elasticsearch in the dependencies section of your config.yml. Example:

    dependencies:
      elasticsearch:
        hosts: http://${ELASTICSEARCH_HOST:-elasticsearch}:${ELASTICSEARCH_PORT:-9200}
        basic_auth:
          - ${ELASTIC_USERNAME:-elasticsearch}
          - ${ELASTIC_PASSWORD:-changeme}
  2. Add a model with the text-embeddings-inference type in the models section of your config.yml. Example:

    models:
      [...]
      - name: embeddings-small
        type: text-embeddings-inference
        providers:
          - type: openai
            key: ${OPENAI_API_KEY}
            timeout: 120
            model_name: text-embedding-3-small

    This model will be used to vectorize the text inserted into the vector store.

  3. Specify the vector store model in the settings section of your config.yml.

    settings:
      vector_store_model: embeddings-small
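Putting steps 1–3 together, the vector-store-related portion of a complete config.yml looks like this (unrelated sections elided):

```yaml
dependencies:
  elasticsearch:
    hosts: http://${ELASTICSEARCH_HOST:-elasticsearch}:${ELASTICSEARCH_PORT:-9200}
    basic_auth:
      - ${ELASTIC_USERNAME:-elasticsearch}
      - ${ELASTIC_PASSWORD:-changeme}

models:
  - name: embeddings-small
    type: text-embeddings-inference
    providers:
      - type: openai
        key: ${OPENAI_API_KEY}
        timeout: 120
        model_name: text-embedding-3-small

settings:
  vector_store_model: embeddings-small
```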

Access to the vector store

When the vector store is enabled, you can access the document management endpoints to perform Retrieval-Augmented Generation (RAG):

  • /v1/collections
  • /v1/documents
  • /v1/chunks
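As a rough sketch of how these endpoints might be called, the snippet below builds authenticated HTTP requests with Python's standard library. The base URL, API key, bearer-token auth scheme, and request body fields are assumptions for illustration only; consult the RAG documentation for the actual request schemas.

```python
import json
import urllib.request

# Hypothetical deployment details -- replace with your own.
BASE_URL = "http://localhost:8000"
API_KEY = "sk-example"

def build_request(path, payload=None, method="GET"):
    """Build an authenticated request for an OpenGateLLM endpoint."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=data,
        method=method,
        headers={
            # Auth scheme assumed here; check your deployment's docs.
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

# Create a collection, then list documents (body fields are illustrative):
create_collection = build_request("/v1/collections", {"name": "my-docs"}, method="POST")
list_documents = build_request("/v1/documents")

# To actually send a request:
# with urllib.request.urlopen(create_collection) as resp:
#     print(json.load(resp))
```

Only the request objects are constructed above; uncomment the final lines to send them against a running instance.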

For more information about document management, see the Retrieval-Augmented Generation (RAG) documentation.