Knowledge base

Multi-Source Retrieval Augmented Generation

Retrieval-Augmented Generation (RAG) is a powerful technique that enhances Large Language Models (LLMs) by grounding their responses in factual, relevant information. Instead of relying solely on the model’s pre-trained knowledge, RAG retrieves specific information from external sources and uses it as context for generating responses.

Deployment models

Extravaganza knowledge base in Cloud

To get started, register a private or company account for the cloud services and accept the terms and conditions of service. You can then freely use any of the provided solutions that are compatible with the Extravaganza knowledge base, e.g. the chatbot or the recorder.

To go one step further and meet the need for flexibility in the delivered solution, we provide frameworks for LLM and data processing that are compatible with the Extravaganza knowledge base service, the OpenAI API, and LlamaIndex.

Extravaganza knowledge base Cloud On-Prem

Cloud On-Prem brings powerful knowledge base features directly to your infrastructure. Extravaganza Cloud On-Prem enables you to deploy the complete solution in your own environment to meet your company’s internal needs.

This solution is built for organizations that require enhanced data privacy, strict compliance adherence, and complete infrastructure control. Whether you’re managing critical infrastructure, operating under stringent regulations like GDPR or HIPAA, or simply need the flexibility to customize your environment, Extravaganza RAG Cloud On-Prem delivers an enterprise-grade solution without compromising on security or control.

Before You Begin

Ensure you have the following ready before starting the installation:

Required:

Kubernetes cluster configured
Helm (version 3.12 with OCI Configuration)
Kubectl with access to Kubernetes cluster

Optional:

CPU architecture amd64 with AVX2/AVX512 instructions support is preferred
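
The prerequisites above can be checked before installation with a small script. The following is a minimal sketch, not part of the product: the function names are hypothetical, it only checks that `kubectl` and `helm` are on the PATH (it does not verify versions or cluster access), and the AVX2 check assumes a Linux host that exposes `/proc/cpuinfo`.

```python
import shutil

# Command-line tools listed under "Required" above.
REQUIRED_TOOLS = ("kubectl", "helm")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def cpu_has_avx2(cpuinfo_text):
    """Return True if /proc/cpuinfo-style text lists the avx2 CPU flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is a space-separated flag list.
            return "avx2" in line.partition(":")[2].split()
    return False


if __name__ == "__main__":
    print("Missing tools:", missing_tools() or "none")
    try:
        with open("/proc/cpuinfo") as handle:
            print("AVX2 supported:", cpu_has_avx2(handle.read()))
    except OSError:
        # Assumption: non-Linux hosts do not provide /proc/cpuinfo.
        print("AVX2 supported: unknown (no /proc/cpuinfo on this platform)")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the defaults are sufficient.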

Minimal System Requirements:

Component	Details
Kubernetes	Version 1.33 or newer
TLS certificate	Single certificate for all endpoints, or separate certificates for the frontend and the API. Must be trusted by all connecting entities.
Knowledge base Cloud services	4 machines with 4-8 CPU cores each, 8-16 GiB memory. GPU acceleration is not required but is beneficial (NVIDIA CUDA 13.2 compatible hardware or Apple Metal).
NVMe/SSD storage	Storage for uploaded documents, 256 GiB for PVCs (cloud services are not ephemeral)
Third-party Services	5 machines with 4 CPU cores each, 160 GiB NVMe/SSD storage each for PVCs
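
For capacity planning, the figures in the table above can be added up. The sketch below is only illustrative arithmetic: the `MachineGroup` type and function names are hypothetical, and it assumes that the 256 GiB for PVCs is a total for the knowledge base cloud services rather than a per-machine figure, which the table does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineGroup:
    name: str
    machines: int
    min_cores_each: int
    max_cores_each: int
    storage_gib_each: int  # 0 when storage is given as a shared total instead


# Values copied from the "Minimal System Requirements" table.
ON_PREM_GROUPS = [
    MachineGroup("knowledge base cloud services", 4, 4, 8, 0),
    MachineGroup("third-party services", 5, 4, 4, 160),
]

# Assumption: the 256 GiB for PVCs is one shared total, not per machine.
SHARED_PVC_GIB = 256


def core_range(groups):
    """Return the (minimum, maximum) total CPU core count over all groups."""
    low = sum(g.machines * g.min_cores_each for g in groups)
    high = sum(g.machines * g.max_cores_each for g in groups)
    return low, high


def total_storage_gib(groups, shared_gib=SHARED_PVC_GIB):
    """Return the total NVMe/SSD storage in GiB, including the shared PVC pool."""
    return shared_gib + sum(g.machines * g.storage_gib_each for g in groups)


if __name__ == "__main__":
    print("CPU cores (min, max):", core_range(ON_PREM_GROUPS))
    print("Storage in GiB:", total_storage_gib(ON_PREM_GROUPS))
```

Under these assumptions the minimal deployment needs 9 machines with 36 to 52 CPU cores in total and 1056 GiB of NVMe/SSD storage.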

Required Components

Third-Party Services

All these components must be installed prior to the knowledge base cloud services:

Component	Version	Purpose
MariaDB	12.x	Main metadata database
Kafka broker	4.x	Broker for internal inter-service communication
Redis	8.x	Main document storage and caching layer
Milvus	2.6.x	Main embedding vector store
Neo4j	5.26.x	Main nodes and edges graph store
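
Installed versions can be compared against the table above before the knowledge base cloud services are deployed. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the `installed` mapping is something you collect yourself (the sketch does not query any service), and a pattern such as `2.6.x` is interpreted as "any version starting with 2.6".

```python
# Required versions copied from the "Third-Party Services" table.
REQUIRED_VERSIONS = {
    "MariaDB": "12.x",
    "Kafka broker": "4.x",
    "Redis": "8.x",
    "Milvus": "2.6.x",
    "Neo4j": "5.26.x",
}


def matches(pattern, version):
    """Return True if `version` fits a pattern like '2.6.x'."""
    fixed = pattern.split(".")[:-1]  # drop the trailing 'x' wildcard
    parts = version.lstrip("v").split(".")
    return parts[: len(fixed)] == fixed


def incompatible(installed):
    """Return {component: (required, found)} for missing or mismatched versions."""
    problems = {}
    for name, pattern in REQUIRED_VERSIONS.items():
        found = installed.get(name)
        if found is None or not matches(pattern, found):
            problems[name] = (pattern, found)
    return problems


if __name__ == "__main__":
    # Hypothetical inventory of what is installed in the cluster.
    installed = {
        "MariaDB": "12.0.2",
        "Kafka broker": "4.1.0",
        "Redis": "8.2.1",
        "Milvus": "2.5.9",
        "Neo4j": "5.26.3",
    }
    print(incompatible(installed))
```

With the hypothetical inventory above, only Milvus is reported, because 2.5.9 does not match the required 2.6.x.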

Step by Step Guide to Install

Extravaganza knowledge base Hybrid solution

To meet data storage security requirements, Extravaganza Business Services has developed a hybrid knowledge base implementation model. The data itself is processed in the cloud but stored within the customer’s environment.

This solution is built for organizations that require enhanced data privacy and strict compliance adherence, but not complete infrastructure control. Whether you’re managing critical infrastructure, operating under stringent regulations like GDPR or HIPAA, or simply need the flexibility to customize your environment, Extravaganza RAG Cloud Hybrid delivers an enterprise-grade solution that does not compromise on security while relaxing the need for full infrastructure control.

Before You Begin

Ensure you have the following ready before starting the installation:

Minimal System Requirements:

Component	Details
Third-party services	3 machines with 4 CPU cores each, 160 GiB NVMe/SSD storage each.
Secure VPN connection	Secures communication between the customer infrastructure and Extravaganza Business Services Cloud Services.

Required Components

Third-Party Services

All these components must be installed in the customer environment:

Component	Version	Purpose
Redis	8.x	Main document storage
Milvus	2.6.x	Main embedding vector store
Neo4j	5.26.x	Main nodes and edges graph store

Extravaganza Business Services knowledge base solution

Retrieval-Augmented Generation (RAG) integrates user queries with a collection of pertinent documents sourced from an external knowledge database, incorporating two essential elements: the Retrieval Component and the Generation Component.

The Retrieval Component is responsible for fetching relevant documents or information from the external knowledge database. It identifies and retrieves the most pertinent data based on the input query. After the retrieval process, the generation component takes the retrieved information and generates coherent, contextually relevant responses. It leverages the capabilities of the language model to produce meaningful outputs.
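
The interplay of the two components can be illustrated with a toy example. This is a minimal sketch, not the product's implementation: retrieval is reduced to naive word overlap, the `llm` argument is a hypothetical stand-in for any function that turns a prompt into text, and all names are assumptions made for illustration.

```python
from typing import Callable, List


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Retrieval component: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    # Keep only documents sharing at least one word; sort is stable on ties.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, context: List[str]) -> str:
    """Put the retrieved passages in front of the question as numbered sources."""
    sources = "\n".join(f"[{i}] {text}" for i, text in enumerate(context, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )


def answer(query: str, documents: List[str], llm: Callable[[str], str]) -> str:
    """Generation component: call the language model with the grounded prompt."""
    return llm(build_prompt(query, retrieve(query, documents)))


if __name__ == "__main__":
    docs = [
        "Redis stores documents",
        "Milvus stores embedding vectors",
        "Neo4j stores graph nodes and edges",
    ]
    # Echo "model" that returns the prompt, so the example runs without an LLM.
    print(answer("where are embedding vectors stored", docs, llm=lambda prompt: prompt))
```

Passing the model as a callable keeps the sketch independent of any particular LLM client; a real deployment would replace both the overlap scoring and the echo function.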

To see where the accuracy of this solution comes from, it helps to understand how multi-source RAG with hybrid search and re-ranking works. The provided solution represents a significant advancement over basic RAG systems.

1. Multi-source retrieval pulls information from the local static knowledge base (documents, media sources, and other imported information) and from the web via a Search API, giving access to both privately stored documents and the latest information published online.
2. Hybrid search combines three complementary search methods:

Graph search traverses a graph dataset as efficiently as possible. The graph algorithm must be matched to the type of results you are looking for. It shines where the amount of data is huge.
Dense semantic search captures conceptual similarity using vector embeddings, finding relevant content even when exact keywords don’t match. It works well where the amount of data is not too large.
Sparse keyword search ensures important keywords are matched, similar to traditional search engines.

3. Re-ranking applies a specialized model to evaluate and sort the retrieved information based on its relevance to the specific question, ensuring the most useful context is prioritized.

This approach delivers more accurate, trustworthy responses (reducing LLM hallucinations) with several benefits:

Better factual grounding through diverse information sources.
Improved retrieval accuracy through complementary search methods.
Enhanced relevance through intelligent re-ranking.
Reduced hallucinations by providing high-quality context.
Auditability through source citations.
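
The hybrid search and re-ranking steps described above could be sketched as follows. This is a minimal illustration, not the product's implementation: the source does not say how the three result lists are merged, so reciprocal rank fusion (RRF) is used here as an assumed, commonly used merging technique, and the `score` callable is a hypothetical stand-in for the specialized re-ranking model. All function names and document ids are made up for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked id lists; each list adds 1 / (k + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def rerank(
    query: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
    top_n: int = 3,
) -> List[str]:
    """Re-order candidates by a relevance score for this specific query."""
    ordered = sorted(candidates, key=lambda doc_id: score(query, doc_id), reverse=True)
    return ordered[:top_n]


if __name__ == "__main__":
    # Hypothetical result lists from the three search methods, best match first.
    graph_hits = ["a", "b"]
    dense_hits = ["b", "c"]
    sparse_hits = ["b", "a"]
    fused = reciprocal_rank_fusion([graph_hits, dense_hits, sparse_hits])

    # Hypothetical relevance scores standing in for a re-ranking model.
    relevance = {"a": 0.9, "b": 0.2, "c": 0.5}
    print(rerank("example question", fused, lambda q, d: relevance[d], top_n=2))
```

In this example the fused order is `b`, `a`, `c`, because `b` appears in all three lists; the re-ranker then promotes `a` and `c`, which shows that fusion and re-ranking can disagree and why re-ranking runs last.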

Main functionalities

Support for multiple modes (local, global, hybrid, naive, mix, bypass), chosen according to customer needs for response speed and accuracy.
Support for multiple embedding dimensions.
Configurable per agent.
Powered by multiple data sources (web, documents in huge variety of supported formats, multimedia sources).
Compatible with Large Language Models and Machine Learning Models.
Manual or automatic natural language detection.
Information protection against loss or theft.
Provides a high level of data separation between agents (multitenancy: each tenant’s data and access are logically isolated from others, even though they share the same physical resources and underlying software instance).
Ensures confidentiality of data transported between the client environment and the knowledge base.
Offers various deployment models, allowing you to choose the level of separation of stored data depending on specific business needs and requirements.
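
Per-agent configuration and logical data isolation can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `AgentConfig` and `TenantScopedStore` types, the field names, and the default values are hypothetical and do not describe the product's API. Only the mode names are taken from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Tuple


class QueryMode(Enum):
    # Mode names as listed above; their exact behavior is not defined here.
    LOCAL = "local"
    GLOBAL = "global"
    HYBRID = "hybrid"
    NAIVE = "naive"
    MIX = "mix"
    BYPASS = "bypass"


@dataclass(frozen=True)
class AgentConfig:
    tenant_id: str
    agent_id: str
    mode: QueryMode = QueryMode.HYBRID  # hypothetical default
    embedding_dim: int = 1024           # hypothetical default
    language: str = "auto"              # "auto" = automatic language detection

    def namespace(self) -> Tuple[str, str]:
        """Key prefix that keeps this agent's data apart from all others."""
        return (self.tenant_id, self.agent_id)


class TenantScopedStore:
    """One shared in-memory store; every key is scoped by the agent namespace."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[Tuple[str, str], str], Any] = {}

    def put(self, config: AgentConfig, key: str, value: Any) -> None:
        self._data[(config.namespace(), key)] = value

    def get(self, config: AgentConfig, key: str) -> Any:
        return self._data.get((config.namespace(), key))


if __name__ == "__main__":
    store = TenantScopedStore()
    sales = AgentConfig("acme", "sales-bot", mode=QueryMode.LOCAL)
    support = AgentConfig("acme", "support-bot")
    store.put(sales, "doc-1", "price list")
    print(store.get(sales, "doc-1"))    # the sales agent sees its own document
    print(store.get(support, "doc-1"))  # the support agent sees nothing
```

The point of the design is that both agents share one store (the same physical resources) while neither can read the other's entries, because the namespace is part of every key.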

Benefits

Improved AI accuracy and reliability (ensures factual responses by accessing a company’s specific documents, provides precise and verifiable answers based on actual business information).
Enhanced user trust (the ability to trace answers back to their source documents increases confidence in the AI system’s output).
Lower costs by avoiding model retraining.
Supports faster time-to-market (companies can deploy and scale AI systems more quickly without the long development cycles associated with model retraining).
Improved operational efficiency (automates tasks like information retrieval, freeing up employees and enabling them to handle higher workloads).
Enhanced scalability (efficiently handles growing data volumes while maintaining performance, which is critical for businesses with large and growing datasets).
Enhanced customer support (provides instant, personalized, and accurate answers, improving customer satisfaction and reducing the workload on support teams).
Real-time knowledge updates (pulls in the latest information instantly, keeping applications and internal tools up to date without constant model updates).
Better compliance and auditability in relation to Large Language Models (makes it easier to trace AI-generated answers back to their source documents, which is crucial for meeting audit requirements in regulated industries).
More effective operations through faster customer support and improved decision-making (provides relevant, contextually accurate, and actionable insights from large datasets, helping to eliminate guesswork and reduce the risk of poor decisions).
Support for innovation (by analyzing customer feedback and market trends, RAG can inform product strategy and help businesses develop products that align with market needs).
Multitenancy, a key concept of cloud computing and of SaaS in particular, offers benefits like lower costs, scalability, and efficient resource management.