Redis
OpenGateLLM uses Redis as an in-memory data store for rate limiting and performance metrics. Redis provides high-performance, real-time data access for managing API quotas and monitoring model performance.
Redis is a required dependency for OpenGateLLM to function.
Overview
Redis handles two critical functions in OpenGateLLM:
- Rate Limiting: Tracks and enforces API usage limits (requests per minute/day, tokens per minute/day) for each user and model
- Performance Metrics: Stores time-series data for model performance monitoring (latency, time to first token)
Rate Limiting
The rate limiter uses Redis to store counters for:
- RPM (Requests Per Minute): Number of API requests per minute per user and model
- RPD (Requests Per Day): Number of API requests per day per user and model
- TPM (Tokens Per Minute): Number of tokens consumed per minute per user and model
- TPD (Tokens Per Day): Number of tokens consumed per day per user and model
OpenGateLLM supports three rate limiting strategies (configurable in settings):
- Fixed Window: Limits are enforced in fixed time windows
- Sliding Window: Limits are enforced using a sliding time window for smoother distribution
- Moving Window: Limits are enforced with a moving average approach
For more information about rate limiting, see Rate Limiting documentation.
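As an illustration of the fixed window strategy, here is a minimal in-memory sketch. It is not OpenGateLLM's implementation: the class name, the key layout, and the injectable clock are assumptions made for the example, and a plain dictionary stands in for the Redis counters (which would typically be incremented with INCR and cleaned up with EXPIRE).

```python
import time
from collections import defaultdict


class FixedWindowCounter:
    """In-memory stand-in for a per-user, per-model fixed-window counter."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counts = defaultdict(int)  # plays the role of Redis keys

    def hit(self, user, model, cost=1):
        """Record `cost` units (1 request, or N tokens); return True if allowed."""
        # All hits inside the same window share one counter.
        window_index = int(self.clock() // self.window_seconds)
        key = f"{user}:{model}:{window_index}"  # hypothetical key layout
        if self.counts[key] + cost > self.limit:
            return False
        self.counts[key] += cost
        return True


class FakeClock:
    """Controllable clock so the example does not need to sleep."""

    def __init__(self, now=0.0):
        self.now = now

    def __call__(self):
        return self.now


clock = FakeClock()
rpm = FixedWindowCounter(limit=2, window_seconds=60, clock=clock)

# Three requests in the same minute: the third exceeds the limit of 2.
results = [rpm.hit("alice", "my-model") for _ in range(3)]

# A new window starts at t=60s, so the counter starts again from zero.
clock.now = 60.0
after_reset = rpm.hit("alice", "my-model")

print(results, after_reset)
```

The same counter covers RPM, RPD, TPM, and TPD by changing `window_seconds` and passing the token count as `cost`. Unlike Redis keys with an expiry, the dictionary entries in this sketch are never removed.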
Performance Metrics
The Redis time-series module stores performance metrics for each model provider:
- Latency: Total request duration in milliseconds
- Time to First Token: Time until the first token is generated (for streaming responses)
These metrics are used for monitoring and can be exposed via Prometheus when enabled.
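The sketch below shows, under stated assumptions, how a time series with a retention window behaves: samples older than the retention period, measured against the most recent timestamp, are dropped. The `RetainedSeries` class is hypothetical and only mimics the idea in memory. It does not call Redis, and the 40 000 ms value mirrors the `metrics_retention_ms` default described in the Configuration File section.

```python
from collections import deque


class RetainedSeries:
    """In-memory stand-in for a time series with a retention window."""

    def __init__(self, retention_ms):
        self.retention_ms = retention_ms
        self.samples = deque()  # (timestamp_ms, value), oldest first

    def add(self, timestamp_ms, value):
        """Append a sample, then drop samples that fell out of the window."""
        self.samples.append((timestamp_ms, value))
        cutoff = timestamp_ms - self.retention_ms
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def average(self):
        """Mean of the retained values, or None when the series is empty."""
        if not self.samples:
            return None
        return sum(value for _, value in self.samples) / len(self.samples)


latency = RetainedSeries(retention_ms=40_000)
latency.add(0, 900.0)       # latency in milliseconds at t=0
latency.add(30_000, 300.0)  # still within 40 s of the first sample
latency.add(45_000, 500.0)  # the sample at t=0 is now older than 40 s

avg = latency.average()
print(len(latency.samples), avg)
```

A short retention window keeps the averages responsive to recent provider behaviour, at the cost of fewer samples per average.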
Setup Redis
Prerequisites
Redis Stack Server 7.4+ is required (includes the time-series module).
Configuration
Docker Compose
Add a redis container in the services section of your compose.yml file:
services:
  [...]
  redis:
    image: redis/redis-stack-server:7.4.0-v7
    restart: always
    environment:
      REDIS_ARGS: "--loadmodule /opt/redis-stack/lib/redistimeseries.so --dir /data --requirepass ${REDIS_PASSWORD:-changeme} --user ${REDIS_USER:-redis} on >password ~* allcommands --save 60 1 --appendonly yes"
    ports:
      - "${REDIS_PORT:-6379}:6379"
    volumes:
      - redis:/data
    healthcheck:
      test: [ "CMD", "redis-cli", "--raw", "incr", "ping" ]
      interval: 4s
      timeout: 10s
      retries: 5
      start_period: 60s

volumes:
  redis:
Redis Stack Server is required (not standard Redis) because OpenGateLLM uses the RedisTimeSeries module for performance metrics.
Configuration File
For more information about the configuration file, see Configuration documentation.
- Add the Redis configuration in the dependencies section of your config.yml. Example:

  dependencies:
    [...]
    redis:
      url: redis://:${REDIS_PASSWORD:-changeme}@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}

  The Redis dependency accepts all parameters of the from_url() method of the redis.asyncio.connection.ConnectionPool class.

- Configure the rate limiting strategy in the settings section of your config.yml (default is fixed_window):

  settings:
    [...]
    rate_limiting_strategy: fixed_window

  For more information about rate limiting, see Rate Limiting documentation.

- Configure metrics retention in the settings section of your config.yml (default is 40 seconds):

  settings:
    [...]
    metrics_retention_ms: 40000 # in milliseconds

  These metrics are stored in the Redis time-series module to determine request prioritisation. For more information about request prioritisation, see request prioritisation documentation.
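The Redis url in the first step uses `${VAR:-default}` placeholders. To make the resulting value concrete, the following sketch expands such placeholders and inspects the URL with the standard library. It is an illustration only: the `expand` helper is hypothetical and is not OpenGateLLM's configuration loader, and the environment values are made up for the example.

```python
import re
from urllib.parse import urlsplit

# Matches ${VAR} and ${VAR:-default}.
_PLACEHOLDER = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def expand(template, env):
    """Replace placeholders with values from `env`, falling back to defaults."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        # ":-" semantics: use the default when the variable is unset or empty.
        return value if value else (default or "")

    return _PLACEHOLDER.sub(substitute, template)


template = (
    "redis://:${REDIS_PASSWORD:-changeme}"
    "@${REDIS_HOST:-localhost}:${REDIS_PORT:-6379}"
)

# Only REDIS_HOST is set here, so password and port use their defaults.
url = expand(template, {"REDIS_HOST": "redis"})
parts = urlsplit(url)

print(url)
print(parts.hostname, parts.port, parts.password)
```

With no variables set, the URL falls back to `localhost`, port `6379`, and the `changeme` password, which should be overridden outside local development.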
Security
We recommend securing your Redis instance by keeping the version up-to-date. For production environments, we also recommend:
- enabling protected mode
- deactivating the default user
- creating specific users with limited permissions
- logging to specific log files
- disabling syslog
- disabling dangerous commands (FLUSHALL, FLUSHDB, etc.)
These settings can be applied through the REDIS_ARGS environment variable in your compose.yml file or through a redis.conf file.
Also consider the following security hardening for your Docker Compose service:
redis:
  [...]
  security_opt:
    - no-new-privileges:true
  cap_drop:
    - ALL
  cap_add:
    - SETGID
    - SETUID
  read_only: true
  tmpfs:
    - /tmp:noexec,nosuid,size=64M