
Roles, permissions and rate limiting

A role defines which actions its users can perform (through permissions) and which resource limits apply to them (through rate limiting).

Available permissions

Permission                Description
admin                     Full administrative access
create_public_collection  Create public collections
read_metric               Read the Prometheus /metrics endpoint
provide_models            Provide models to the system

Rate limiting

Rate limiting controls how much users with a given role can use each model. Each limit is an entry in the limits parameter with three fields:

  • model: The model name
  • type: The limit type
  • value: The limit value (if null, the limit is not applied)

Tip: Rate limiting can also be used to control model access. If a limit is set to 0, the model is inaccessible to users with that role.
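
The two special values described above (null and 0) are easy to confuse, so here is a minimal Python sketch of how a client might interpret a limit value. The function name describe_limit and its return labels are hypothetical, and rejecting negative values is an assumption, since this page does not say how the server treats them.

```python
from typing import Optional


def describe_limit(value: Optional[int]) -> str:
    """Classify a limit value using the rules stated in this section."""
    if value is None:
        # null: the limit is not applied
        return "unlimited"
    if value < 0:
        # Assumption: negative values are treated as invalid here;
        # the documentation does not specify this case.
        raise ValueError("limit value must be null or a non-negative integer")
    if value == 0:
        # 0: the model is inaccessible
        return "blocked"
    return "limited"


print(describe_limit(None))    # unlimited
print(describe_limit(0))       # blocked
print(describe_limit(100000))  # limited
```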

Available limit types

Limit type  Description
tpm         Tokens per minute
tpd         Tokens per day
rpm         Requests per minute
rpd         Requests per day

Example:

{
  "model": "my-language-model",
  "type": "tpm",
  "value": 100000
}
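
When limit entries are generated from a script, it may help to check the type against the four documented limit types before sending anything. The sketch below is one way to do that in Python; make_limit and LIMIT_TYPES are hypothetical names, not part of the API, and the server may apply additional validation not described on this page.

```python
import json
from typing import Optional

# The four limit types listed in the table above.
LIMIT_TYPES = {"tpm", "tpd", "rpm", "rpd"}


def make_limit(model: str, limit_type: str, value: Optional[int]) -> dict:
    """Build one entry for the limits parameter, rejecting unknown types."""
    if limit_type not in LIMIT_TYPES:
        raise ValueError(f"unknown limit type: {limit_type!r}")
    return {"model": model, "type": limit_type, "value": value}


# Python's None serializes to JSON null, meaning the limit is not applied.
print(json.dumps(make_limit("my-language-model", "tpm", 100000)))
print(json.dumps(make_limit("my-language-model", "rpd", None)))
```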

Info: Rate limiting requires Redis to be configured. For more information about Redis setup and rate limiting strategies, see the Redis documentation.

Managing roles

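The following request creates a role named Developer that has the create_public_collection permission and a limit of 100,000 tokens per minute on my-language-model:

curl -X POST http://localhost:8000/v1/admin/roles \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Developer",
    "permissions": ["create_public_collection"],
    "limits": [
      {
        "model": "my-language-model",
        "type": "tpm",
        "value": 100000
      }
    ]
  }'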
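
The same request can be assembled from Python using only the standard library. This is a sketch rather than an official client: build_create_role_request is a hypothetical helper, the base URL and the <api_key> placeholder are copied from the curl example, and the response format is not described on this page, so the code only builds the request and leaves sending it to the caller.

```python
import json
import urllib.request


def build_create_role_request(base_url, api_key, name, permissions, limits):
    """Build (but do not send) a POST request to the /v1/admin/roles endpoint."""
    payload = {"name": name, "permissions": permissions, "limits": limits}
    return urllib.request.Request(
        url=f"{base_url}/v1/admin/roles",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_role_request(
    "http://localhost:8000",
    "<api_key>",  # placeholder, replace with a real key
    "Developer",
    ["create_public_collection"],
    [{"model": "my-language-model", "type": "tpm", "value": 100000}],
)

# Sending requires a running server and a valid key:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the payload easy to inspect or log before it reaches the server.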