Retrieval-Augmented Generation (RAG)
RAG search retrieves the chunks in your collections that are most relevant to a query. Language models can then use those chunks to generate responses grounded in your own documents and knowledge base.
Search Methods
OpenGateLLM supports three search methods:
| Method | Description |
|---|---|
| semantic | Vector similarity search using embeddings |
| lexical | Keyword-based search (BM25) |
| hybrid | Combination of semantic and lexical search |
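
To make the hybrid method more concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the technique that the rff_k parameter under Search Parameters refers to. It is an illustration only: the function name, the chunk IDs, and the choice to start ranks at 1 are assumptions, and OpenGateLLM's internal fusion may differ in its details.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=20):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1 (an assumption of this sketch). A larger k
    flattens the score gap between neighbouring ranks.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic_ranking = ["c3", "c1", "c7"]  # hypothetical vector-search order
lexical_ranking = ["c1", "c9", "c3"]   # hypothetical BM25 order

fused = reciprocal_rank_fusion([semantic_ranking, lexical_ranking], k=20)
for chunk_id, score in fused:
    print(f"{chunk_id}: {score:.4f}")
```

Chunks that appear in both rankings (c1 and c3 here) accumulate two contributions, so they rise above chunks found by only one method.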
Search Parameters
- prompt: Search query (required)
- collections: List of collection IDs to search in (required)
- method: Search method (default: semantic)
- limit: Number of results to return (default: 10, max: 200)
- offset: Pagination offset (default: 0)
- rff_k: RRF constant for hybrid search (default: 20)
- score_threshold: Minimum similarity score (0.0-1.0, only for semantic)
- web_search: Add internet search results (default: false)
- web_search_k: Number of web results (default: 5)
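
The parameters above can be assembled and checked on the client side before a request is sent. The helper below is a hypothetical sketch, not part of OpenGateLLM: the function name is invented, the lower bound of 1 for limit is an assumption, and the server remains the authority on validation.

```python
def build_search_payload(prompt, collections, method="semantic", limit=10,
                         offset=0, rff_k=20, score_threshold=None,
                         web_search=False, web_search_k=5):
    """Return a JSON-serializable request body using the documented defaults."""
    if not prompt:
        raise ValueError("prompt is required")
    if not collections:
        raise ValueError("collections is required")
    if method not in ("semantic", "lexical", "hybrid"):
        raise ValueError(f"unknown method: {method!r}")
    # Documented maximum is 200; the minimum of 1 is an assumption.
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if offset < 0:
        raise ValueError("offset must be non-negative")

    payload = {
        "prompt": prompt,
        "collections": list(collections),
        "method": method,
        "limit": limit,
        "offset": offset,
    }
    if method == "hybrid":
        payload["rff_k"] = rff_k  # only meaningful for hybrid search
    if score_threshold is not None:
        if method != "semantic":
            raise ValueError("score_threshold only applies to semantic search")
        if not 0.0 <= score_threshold <= 1.0:
            raise ValueError("score_threshold must be within 0.0-1.0")
        payload["score_threshold"] = score_threshold
    if web_search:
        payload["web_search"] = True
        payload["web_search_k"] = web_search_k
    return payload


# Second page of ten hybrid results: skip the first 10 with offset.
page_two = build_search_payload("Python programming", [1],
                                method="hybrid", limit=10, offset=10)
print(page_two)
```

Paging works by keeping limit fixed and increasing offset by limit for each subsequent page.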
Performing Searches
Semantic search:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is machine learning?",
    "collections": [1, 2],
    "method": "semantic",
    "limit": 10,
    "score_threshold": 0.7
  }'

Hybrid search:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Python programming",
    "collections": [1],
    "method": "hybrid",
    "limit": 10,
    "rff_k": 20
  }'

With web search:

curl -X POST http://localhost:8000/v1/search \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Latest AI developments",
    "collections": [1],
    "method": "semantic",
    "limit": 10,
    "web_search": true,
    "web_search_k": 5
  }'
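
The same requests can be issued from Python. The sketch below uses only the standard library and mirrors the curl calls above (same URL, headers, and JSON body). The function names are hypothetical, and the response is returned as parsed JSON without assuming anything about its schema.

```python
import json
import urllib.request


def make_search_request(base_url, api_key, payload):
    """Build (but do not send) a POST request for the /v1/search endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(base_url, api_key, payload, timeout=30):
    """Send the search request and return the decoded JSON response."""
    request = make_search_request(base_url, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = make_search_request(
    "http://localhost:8000",
    "<api_key>",  # placeholder, replace with a real key
    {
        "prompt": "What is machine learning?",
        "collections": [1, 2],
        "method": "semantic",
        "limit": 10,
        "score_threshold": 0.7,
    },
)
print(request.get_method(), request.full_url)
```

Calling search(...) with the same arguments performs the actual HTTP call; it needs a running server and a valid API key, so it is left out of the snippet.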
Note: See Configuration for more details.
Web Search Integration
When web_search is enabled, OpenGateLLM:
- Generates a web search query from your prompt
- Retrieves results from the configured web search engine
- Creates a temporary collection to store web results
- Parses and processes each web result as a document
- Performs the search across both your collections and web results
- Automatically deletes the temporary web collection after returning results
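
The steps above can be pictured with a toy, in-memory sketch. Everything in it is a hypothetical stand-in (the store dictionary, fake_web_search, the word-overlap retrieval, the ID counter); none of these names come from OpenGateLLM. The part worth noticing is the try/finally, which removes the temporary collection even if the search step fails.

```python
import itertools

# Hypothetical in-memory store: collection ID -> list of text chunks.
store = {1: ["Machine learning is a subfield of AI."]}
_temp_ids = itertools.count(1000)


def fake_web_search(query, k):
    # Stand-in for the configured web search engine.
    return [f"web page {i} about {query}" for i in range(1, k + 1)]


def toy_search(query, collection_ids, limit):
    # Toy retrieval: keep chunks sharing at least one word with the query.
    words = set(query.lower().split())
    hits = [chunk
            for collection_id in collection_ids
            for chunk in store[collection_id]
            if words & set(chunk.lower().split())]
    return hits[:limit]


def search_with_web(prompt, collection_ids, limit=10, web_search_k=5):
    web_query = prompt.strip()                         # 1. derive a web query
    pages = fake_web_search(web_query, web_search_k)   # 2. fetch web results
    temp_id = next(_temp_ids)                          # 3. temporary collection
    store[temp_id] = []
    try:
        for page in pages:                             # 4. store each result
            store[temp_id].append(page)
        # 5. search user collections and web results together
        return toy_search(prompt, list(collection_ids) + [temp_id], limit)
    finally:
        del store[temp_id]                             # 6. always clean up


results = search_with_web("machine learning", [1], web_search_k=2)
print(results)
print(sorted(store))  # the temporary collection is gone
```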
Note: Web search integration requires a web search engine to be configured. See Configuration for more details.
Next Steps
- Learn how to create and manage collections: Collections
- Learn how to import and process documents: Parsing and Chunking