Skip to main content

Configuration

OpenGateLLM requires configuring a configuration file. This defines models, dependencies, and settings parameters. Playground and API need a configuration file (could be the same file), see API configuration and Playground configuration.

By default, the configuration file must be ./config.yml file.

You can change the configuration file by setting the CONFIG_FILE environment variable.

The configuration file has 3 sections:

  • models: models configuration.
  • dependencies: dependencies configuration.
  • settings: settings configuration.

Secrets

You can pass environment variables in configuration file with pattern ${ENV_VARIABLE_NAME}. All environment variables will be loaded in the configuration file.

Example

models:
[...]
- name: my-language-model
type: text-generation
providers:
- type: openai
url: https://api.openai.com
key: ${OPENAI_API_KEY}
model_name: gpt-4o-mini

Example

The following is an example of configuration file:

# ----------------------------------- models ------------------------------------
models:
- name: albert-testbed
type: text-generation
# aliases: ["model-alias"]
# owned_by: Me
# load_balancing_strategy: shuffle
# cost_prompt_tokens: 0.10
# cost_completion_tokens: 0.10
providers:
- type: vllm
url: http://albert-testbed.etalab.gouv.fr:8000
# key: sk-xxx
model_name: "gemma3:1b"
# timeout: 60
# model_carbon_footprint_zone: FRA
# model_carbon_footprint_total_params: 8
# model_carbon_footprint_active_params: 8

# -------------------------------- dependencies ---------------------------------
dependencies:
postgres: # required
url: postgresql+asyncpg://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-changeme}@${POSTGRES_HOST:-localhost}:${POSTGRES_PORT:-5432}/postgres
echo: False
pool_size: 5
connect_args:
server_settings:
statement_timeout: "120s"
command_timeout: 60

redis: # required
url: redis://${REDIS_USER:-redis}:${REDIS_PASSWORD:-changeme}@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}

# elasticsearch:
# number_of_shards: 1
# number_of_replicas: 0
# hosts: "http://localhost:9200"
# basic_auth:
# - "elastic"
# - ${ELASTIC_PASSWORD}

# brave:
# headers:
# Accept: application/json
# X-Subscription-Token: ${BRAVE_API_KEY}
# country: "fr"
# safesearch: "strict"

# sentry:
# dsn: ${SENTRY_DSN}

# ---------------------------------- settings -----------------------------------
settings:
# session_secret_key: ${SESSION_SECRET_KEY}
# disabled_routers: ["admin", "audio"]
# usage_tokenizer: tiktoken_gpt2
# app_title: My OpenGateLLM API

# log_level: INFO
# log_format: [%(asctime)s][%(process)d:%(name)s][%(levelname)s] %(client_ip)s - %(message)s

# swagger_version: 1.0.0
# swagger_contact_url: https://github.com/etalab-ia/OpenGateLLM
# swagger_contact_email: john.doe@example.com
# swagger_docs_url: /docs
# swagger_redoc_url: /redoc

auth_master_username: master
auth_master_key: changeme
# auth_max_token_expiration_days: 365

# rate_limiting_strategy: fixed_window

# monitoring_sentry_enabled: True
# monitoring_postgres_enabled: True
# monitoring_prometheus_enabled: True

# vector_store_model: my-model

# search_web_query_model: my-model
# search_web_limited_domains: ["google.com", "wikipedia.org"]
# search_web_user_agent: None

# search_multi_agents_synthesis_model: my-model
# search_multi_agents_reranker_model: my-model

playground_opengatellm_url: ${OPENGATELLM_URL:-http://localhost:8000}
# playground_default_model: my-model
# playground_theme_has_background: True
# playground_theme_accent_color: purple
# playground_theme_appearance: dark
# playground_theme_gray_color: gray
# playground_theme_panel_background: solid
# playground_theme_radius: medium
# playground_theme_scaling: 100%

API configuration

AttributeTypeDescriptionRequiredDefaultValuesExamples
dependenciesobjectDependencies used by the API. For details of configuration, see the Dependencies section.
modelsarrayModels used by the API. At least one model must be defined. For details of configuration, see the Model section.
settingsobjectGeneral settings configuration fields. For details of configuration, see the Settings section.

Settings

AttributeTypeDescriptionRequiredDefaultValuesExamples
app_titlestringDisplay title of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.OpenGateLLMMy API
auth_key_max_expiration_daysintegerMaximum number of days for a new API key to be valid.None
auth_master_keystringMaster key for the API. It should be a random string with at least 32 characters. This key has all permissions and cannot be modified or deleted. This key is used to create the first role and the first user. This key is also used to encrypt user tokens, watch out if you modify the master key, you'll need to update all user API keys.changeme
auth_playground_session_durationintegerDuration of the playground postgres_session in seconds.3600
disabled_routersarrayDisabled routers to limits services of the API.• admin
• audio
• auth
• chat
• chunks
• collections
• documents
• embeddings
• ...
['embeddings']
front_urlstringFront-end URL for the application.http://localhost:8501
hidden_routersarrayRouters are enabled but hidden in the swagger and the documentation of the API.• admin
• audio
• auth
• chat
• chunks
• collections
• documents
• embeddings
• ...
['admin']
log_formatstringLogging format of the API.[%(asctime)s][%(process)d:%(name)s][%(levelname)s] %(client_ip)s - %(message)s
log_levelstringLogging level of the API.INFO• DEBUG
• INFO
• WARNING
• ERROR
• CRITICAL
monitoring_postgres_enabledbooleanIf true, the log usage will be written in the PostgreSQL database.True
monitoring_prometheus_enabledbooleanIf true, Prometheus metrics will be exposed in the /metrics endpoint.True
rate_limiting_strategystringRate limiting strategy for the API.fixed_window• moving_window
• fixed_window
• sliding_window
routing_max_priorityintegerMaximum allowed priority in routing tasks.4
routing_max_retriesintegerMaximum number of retries for routing tasks.3
routing_retry_countdownintegerNumber of seconds before retrying a failed routing task.3
search_web_limited_domainsarrayLimited domains for the web search. If provided, the web search will be limited to these domains.
search_web_query_modelstringModel used to query the web in the web search. Is required if a web search dependency is provided (Brave or DuckDuckGo). This model must be defined in the models section and have type text-generation or image-text-to-text.None
search_web_user_agentstringUser agent to scrape the web. If provided, the web search will use this user agent.None
session_secret_keystringSecret key for postgres_session middleware. If not provided, the master key will be used.NoneknBnU1foGtBEwnOGTOmszldbSwSYLTcE6bdibC8bPGM
swagger_contactobjectContact informations of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.None
swagger_descriptionstringDisplay description of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.See documentationSee documentation
swagger_docs_urlstringDocs URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information./docs
swagger_license_infoobjectLicence informations of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.{'name': 'MIT Licence', 'identifier': 'MIT', 'url': 'https://raw.githubusercontent.com/etalab-ia/opengatellm/refs/heads/main/LICENSE'}
swagger_openapi_tagsarrayOpenAPI tags of the API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.
swagger_openapi_urlstringOpenAPI URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information./openapi.json
swagger_redoc_urlstringRedoc URL of swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information./redoc
swagger_summarystringDisplay summary of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.OpenGateLLM connect to your models. You can configuration this swagger UI in the configuration file, like hide routes or change the title.My API description.
swagger_terms_of_servicestringA URL to the Terms of Service for the API in swagger UI. If provided, this has to be a URL.Nonehttps://example.com/terms-of-service
swagger_versionstringDisplay version of your API in swagger UI, see https://fastapi.tiangolo.com/tutorial/metadata for more information.latest2.5.0
usage_tokenizerstringTokenizer used to compute usage of the API.tiktoken_gpt2• tiktoken_gpt2
• tiktoken_r50k_base
• tiktoken_p50k_base
• tiktoken_p50k_edit
• tiktoken_cl100k_base
• tiktoken_o200k_base
vector_store_modelstringModel used to vectorize the text in the vector store database. Is required if a vector store dependency is provided (Elasticsearch or Qdrant). This model must be defined in the models section and have type text-embeddings-inference.None

Model

In the models section, you define a list of models. Each model is a set of API providers for that model. Users will access the models specified in this section using their name. Load balancing is performed between the different providers of the requested model. All providers in a model must serve the same type of model (text-generation or text-embeddings-inference, etc.). We recommend that all providers of a model serve exactly the same model, otherwise users may receive responses of varying quality. For embedding models, the API verifies that all providers output vectors of the same dimension. You can define the load balancing strategy between the model's providers. By default, it is random.

For more information to configure model providers, see the ModelProvider section.


AttributeTypeDescriptionRequiredDefaultValuesExamples
aliasesarrayAliases of the model. It will be used to identify the model by users.['model-alias', 'model-alias-2']
cost_completion_tokensnumberModel costs completion tokens for user budget computation. The cost is by 1M tokens. Set to 0.0 to disable budget computation for this model.0.00.1
cost_prompt_tokensnumberModel costs prompt tokens for user budget computation. The cost is by 1M tokens.0.00.1
load_balancing_strategystringRouting strategy for load balancing between providers of the model.shuffle• shuffle
• least_busy
least_busy
namestringUnique name exposed to clients when selecting the model.gpt-4o
providersarrayAPI providers of the model. If there are multiple providers, the model will be load balanced between them according to the routing strategy. The different models have to the same type. For details of configuration, see the ModelProvider section.
typestringType of the model. It will be used to identify the model type.• image-text-to-text
• automatic-speech-recognition
• text-embeddings-inference
• text-generation
• text-classification
text-generation

ModelProvider

AttributeTypeDescriptionRequiredDefaultValuesExamples
keystringModel provider API key.Nonesk-1234567890
model_carbon_footprint_active_paramsintegerActive params of the model in billions of parameters for carbon footprint computation. If not provided, the total params will be used if provided, else carbon footprint will not be computed. For more information, see https://ecologits.aiNone8
model_carbon_footprint_total_paramsintegerTotal params of the model in billions of parameters for carbon footprint computation. If not provided, the active params will be used if provided, else carbon footprint will not be computed. For more information, see https://ecologits.aiNone8
model_carbon_footprint_zonestringModel hosting zone using ISO 3166-1 alpha-3 code format (e.g., WOR for World, FRA for France, USA for United States). This determines the electricity mix used for carbon intensity calculations. For more information, see https://ecologits.aiWOR• ABW
• AFG
• AGO
• AIA
• ALA
• ALB
• AND
• ARE
• ...
WOR
model_namestringModel name from the model provider.gpt-4o
qos_limitnumberThe value to use for the quality of service. Depends of the metric, the value can be a percentile, a threshold, etc.None0.5
qos_metricstringThe metric to use for the quality of service. If not provided, no QoS policy is applied.None• ttft
• latency
• inflight
• performance
inflight
timeoutintegerTimeout for the model provider requests, after user receive an 500 error (model is too busy).30010
typestringModel provider type.• albert
• openai
• mistral
• tei
• vllm
openai
urlstringModel provider API url. The url must only contain the domain name (without /v1 suffix for example). Depends of the model provider type, the url can be optional (Albert, OpenAI).Nonehttps://api.openai.com

Dependencies

AttributeTypeDescriptionRequiredDefaultValuesExamples
albertobjectIf provided, Albert API is used to parse pdf documents. Cannot be used with Marker dependency concurrently. Pass arguments to call Albert API in this section. For details of configuration, see the AlbertDependency section.None
braveobjectIf provided, Brave API is used to web search. Cannot be used with DuckDuckGo dependency concurrently. Pass arguments to call API in this section. All query parameters are supported, see https://api-dashboard.search.brave.com/app/documentation/web-search/query for more information. For details of configuration, see the BraveDependency section.None
celeryobjectIf provided, Celery is used to run tasks asynchronously with queues. Pass arguments to call Celery in this section. For details of configuration, see the CeleryDependency section.None
duckduckgoobjectIf provided, DuckDuckGo API is used to web search. Cannot be used with Brave dependency concurrently. Pass arguments to call API in this section. All query parameters are supported, see https://www.searchapi.io/docs/duckduckgo-api for more information. For details of configuration, see the DuckDuckGoDependency section.None
elasticsearchobjectPass all elastic python SDK arguments, see https://elasticsearch-py.readthedocs.io/en/v9.0.2/api/elasticsearch.html#elasticsearch.Elasticsearch for more information. Some others arguments are available to configure the Elasticsearch index. For details of configuration, see the ElasticsearchDependency section. For details of configuration, see the ElasticsearchDependency section.None
markerobjectIf provided, Marker API is used to parse pdf documents. Cannot be used with Albert dependency concurrently. Pass arguments to call Marker API in this section. For details of configuration, see the MarkerDependency section.None
postgresobjectPass all postgres python SDK arguments, see https://github.com/etalab-ia/opengatellm/blob/main/docs/dependencies/postgres.md for more information. For details of configuration, see the PostgresDependency section.
proconnectobjectProConnect configuration for the API. See https://github.com/etalab-ia/albert-api/blob/main/docs/oauth2_encryption.md for more information. For details of configuration, see the ProConnect section.None
qdrantobjectPass all qdrant python SDK arguments, see https://python-client.qdrant.tech/qdrant_client.qdrant_client for more information. For details of configuration, see the QdrantDependency section.None
redisobjectPass all from_url() method arguments of redis.asyncio.connection.ConnectionPool class, see https://redis.readthedocs.io/en/stable/connections.html#redis.asyncio.connection.ConnectionPool.from_url for more information. For details of configuration, see the RedisDependency section.
sentryobjectPass all sentry python SDK arguments, see https://docs.sentry.io/platforms/python/configuration/options/ for more information. For details of configuration, see the SentryDependency section.None

SentryDependency


RedisDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
urlstringRedis connection url.redis://:changeme@localhost:6379

QdrantDependency


ProConnect

AttributeTypeDescriptionRequiredDefaultValuesExamples
allowed_domainsstringComma-separated list of domains allowed to sign in via ProConnect (e.g. 'gouv.fr,example.com'). Only fronted on the specified domains will be allowed to authenticate using proconnect.localhost,gouv.fr
client_idstringClient identifier provided by ProConnect when you register your application in their dashboard. This value is public (it's fine to embed in clients) but must match the value configured in ProConnect.
client_secretstringClient secret provided by ProConnect at application registration. This value must be kept confidential — it's used by the server to authenticate with ProConnect during token exchange (do not expose it to browsers or mobile apps).
default_rolestringRole automatically assigned to users created via ProConnect login on first sign-in. Set this to the role name you want new ProConnect users to receive (must exist in your roles configuration).Freemium
redirect_uristringRedirect URI where users are sent after successful ProConnect authentication. This URI must exactly match one of the redirect URIs configured in OpenGateLLM settings. It must be an HTTPS endpoint in production and is used to receive the authorization tokens from ProConnect.https://albert.api.etalab.gouv.fr/v1/auth/callback
scopestringSpace-separated OAuth2/OpenID Connect scopes requested from ProConnect (for example: 'openid email given_name'). Scopes determine the information returned about the authenticated user; reduce scopes to the minimum necessary for privacy.openid email given_name usual_name siret organizational_unit belonging_population chorusdt
server_metadata_urlstringOpenID Connect discovery endpoint for ProConnect (server metadata). The SDK/flow uses this to discover authorization, token, and JWKS endpoints. Change to the production discovery URL when switching from sandbox to production.https://identite-sandbox.proconnect.gouv.fr/.well-known/openid-configuration

PostgresDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
urlstringPostgreSQL connection url.postgresql://postgres:changeme@localhost:5432/postgres

MarkerDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
headersobjectMarker API request headers.{'Authorization': 'Bearer my-api-key'}
timeoutintegerTimeout for the Marker API requests.30010
urlstringMarker API url.

ElasticsearchDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
number_of_replicasintegerNumber of replicas for the Elasticsearch index.11
number_of_shardsintegerNumber of shards for the Elasticsearch index.11

DuckDuckGoDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
headersobjectDuckDuckGo API request headers.False{}
timeoutintegerTimeout for the DuckDuckGo API requests.30010
urlstringDuckDuckGo API url.https://api.duckduckgo.com/

CeleryDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
broker_urlstringCelery broker url like Redis (redis://) or RabbitMQ (amqp://). If not provided, use redis dependency as broker.None
enable_utcbooleanEnable UTC.TrueTrue
result_backendstringCelery result backend url. If not provided, use redis dependency as result backend.None
timezonestringTimezone.UTCUTC

BraveDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
headersobjectBrave API request headers.True{'X-Subscription-Token': 'my-api-key'}
timeoutintegerTimeout for the Brave API requests.30010
urlstringBrave API url.https://api.search.brave.com/res/v1/web/search

AlbertDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
headersobjectAlbert API request headers.{'Authorization': 'Bearer my-api-key'}
timeoutintegerTimeout for the Albert API requests.30010
urlstringAlbert API url.https://albert.api.etalab.gouv.fr

Playground configuration

The following parameters allow you to configure the Playground application. The configuration file can be shared with the API, as the sections are identical and compatible. Some parameters are common to both the API and the Playground (for example, app_title).

For Plagroud deployment, some environment variables are required to be set, like Reflex backend URL. See Environment variables for more information.


AttributeTypeDescriptionRequiredDefaultValuesExamples
dependenciesobjectDependencies used by the playground. For details of configuration, see the Dependencies section.
settingsobjectGeneral settings configuration fields. Some fields are common to the API and the playground. For details of configuration, see the Settings section.

Settings

AttributeTypeDescriptionRequiredDefaultValuesExamples
app_titlestringThe title of the application.OpenGateLLM
auth_key_max_expiration_daysintegerMaximum number of days for a token to be valid.None
playground_default_modelstringThe first model selected in chat page.None
playground_opengatellm_urlstringThe URL of the OpenGateLLM API.http://localhost:8000
playground_theme_accent_colorstringThe primary color used for default buttons, typography, backgrounds, etc. See available colors at https://www.radix-ui.com/colors.purple
playground_theme_appearancestringThe appearance of the theme.light
playground_theme_gray_colorstringThe secondary color used for default buttons, typography, backgrounds, etc. See available colors at https://www.radix-ui.com/colors.gray
playground_theme_has_backgroundbooleanWhether the theme has a background.True
playground_theme_panel_backgroundstringWhether panel backgrounds are translucent: 'solid''translucent'.solid
playground_theme_radiusstringThe radius of the theme. Can be 'small', 'medium', or 'large'.medium
playground_theme_scalingstringThe scaling of the theme.100%
routing_max_priorityintegerMaximum allowed priority in routing tasks.10

Dependencies

AttributeTypeDescriptionRequiredDefaultValuesExamples
redisobjectSet the Redis connection url to use as stage manager. See https://reflex.dev/docs/api-reference/config/ for more information. For details of configuration, see the RedisDependency section.None

RedisDependency

AttributeTypeDescriptionRequiredDefaultValuesExamples
urlstringRedis connection url.redis://:changeme@localhost:6379