Retrieval-Augmented Generation (RAG)

RAG search retrieves the chunks in your collections that are most relevant to a query. Language models can then use those chunks to generate responses grounded in your own documents and knowledge base.

Search Methods

OpenGateLLM supports multiple search methods:

Method	Description
`semantic`	Vector similarity search using embeddings
`lexical`	Keyword-based search (BM25)
`hybrid`	Combination of semantic and lexical search
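
The `rff_k` parameter listed under Search Parameters suggests that the `hybrid` method merges the semantic and lexical rankings with reciprocal rank fusion (RRF). The following is a minimal sketch of standard RRF, assuming each method returns an ordered list of chunk IDs. The function name and the chunk IDs are illustrative, and OpenGateLLM's actual implementation may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=20):
    """Fuse several ranked lists of chunk IDs with standard RRF.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Illustrative only; not OpenGateLLM's code.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic_hits = ["c3", "c1", "c7"]  # hypothetical semantic ranking
lexical_hits = ["c1", "c9", "c3"]   # hypothetical lexical ranking
fused = reciprocal_rank_fusion([semantic_hits, lexical_hits], k=20)
```

A larger `k` flattens the difference between top and lower ranks, so chunks that appear in both lists gain more weight relative to a chunk that ranks first in only one list.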

Search Parameters

prompt: Search query (required)
collections: List of collection IDs to search in (required)
method: Search method (default: semantic)
limit: Number of results to return (default: 10, max: 200)
offset: Pagination offset (default: 0)
rff_k: Reciprocal Rank Fusion (RRF) constant for hybrid search (default: 20)
score_threshold: Minimum similarity score (0.0 to 1.0; applies only to the semantic method)
web_search: Add internet search results (default: false)
web_search_k: Number of web results (default: 5)
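
The `limit` and `offset` parameters can be combined to page through a large result set. The sketch below assumes that `offset` counts the results to skip and that a page shorter than `limit` marks the end. `fetch_page` is a hypothetical callable standing in for one call to the search endpoint, replaced here by an in-memory fake.

```python
def iter_results(fetch_page, limit=10):
    """Yield every result by advancing `offset` in steps of `limit`.

    `fetch_page(limit, offset)` is a hypothetical callable that wraps one
    search request and returns the list of results for that page.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:  # short (or empty) page: nothing left
            break
        offset += limit


# Stand-in for the API: 25 fake results served from memory.
FAKE_RESULTS = [f"chunk-{i}" for i in range(25)]


def fake_fetch(limit, offset):
    return FAKE_RESULTS[offset:offset + limit]


all_results = list(iter_results(fake_fetch, limit=10))
```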

Search Flow

Performing Searches

Semantic search:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is machine learning?",
    "collections": [1, 2],
    "method": "semantic",
    "limit": 10,
    "score_threshold": 0.7
  }'

Hybrid search:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Python programming",
    "collections": [1],
    "method": "hybrid",
    "limit": 10,
    "rff_k": 20
  }'

With web search:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Latest AI developments",
    "collections": [1],
    "method": "semantic",
    "limit": 10,
    "web_search": true,
    "web_search_k": 5
  }'
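
The same requests can be sent from Python. The sketch below uses only the standard library and reuses the host, endpoint, headers, and parameters from the curl examples. The helper names are illustrative, and because the response structure is not described on this page, `search` simply returns the decoded JSON body unchanged.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host as the curl examples


def build_search_request(api_key, prompt, collections, **params):
    """Build (but do not send) a POST /v1/search request."""
    payload = {"prompt": prompt, "collections": collections, **params}
    return urllib.request.Request(
        f"{BASE_URL}/v1/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(api_key, prompt, collections, **params):
    """Send the request and return the decoded JSON body unchanged."""
    request = build_search_request(api_key, prompt, collections, **params)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Builds the semantic search request without touching the network.
request = build_search_request(
    "<api_key>", "What is machine learning?", [1, 2],
    method="semantic", limit=10, score_threshold=0.7,
)
```

Calling `search(...)` with the same arguments sends the request to a running instance.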

Info: See Configuration for more details.

Web Search Integration

When web_search is enabled, OpenGateLLM:

1. Generates a web search query from your prompt
2. Retrieves results from the configured web search engine
3. Creates a temporary collection to store web results
4. Parses and processes each web result as a document
5. Performs the search across both your collections and web results
6. Automatically deletes the temporary web collection after returning results
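
The sequence above can be sketched as follows. This is an illustration of the control flow only: `InMemoryStore`, `web_engine`, and `searcher` are hypothetical stand-ins invented for this example, not OpenGateLLM classes, and query generation is reduced to passing the prompt through. The point of the sketch is the `try`/`finally`, which deletes the temporary collection even if the search fails.

```python
class InMemoryStore:
    """Hypothetical stand-in for collection storage."""

    def __init__(self):
        self.collections = {}
        self._next_id = 1000

    def create_collection(self):
        collection_id = self._next_id
        self._next_id += 1
        self.collections[collection_id] = []
        return collection_id

    def add_document(self, collection_id, document):
        self.collections[collection_id].append(document)

    def delete_collection(self, collection_id):
        del self.collections[collection_id]


def search_with_web(prompt, collection_ids, store, web_engine, searcher):
    """Follow the listed steps in order, using hypothetical helpers."""
    web_query = prompt.strip()            # generate query (here: prompt as-is)
    pages = web_engine(web_query)         # retrieve web results
    temp_id = store.create_collection()   # temporary collection
    try:
        for page in pages:                # store each result as a document
            store.add_document(temp_id, page)
        # Search the user's collections plus the temporary one.
        return searcher(prompt, [*collection_ids, temp_id])
    finally:
        store.delete_collection(temp_id)  # cleanup, even on error


store = InMemoryStore()
results = search_with_web(
    "Latest AI developments",
    [1],
    store,
    web_engine=lambda query: ["page A", "page B"],  # fake engine
    # Fake searcher: reports how many documents each collection holds.
    searcher=lambda prompt, ids: [
        (cid, len(store.collections.get(cid, []))) for cid in ids
    ],
)
```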

Info: Web search integration requires a web search engine to be configured. See Configuration for more details.

Next Steps

Learn how to create and manage collections: Collections
Learn how to import and process documents: Parsing and Chunking
