Main Concept

RAG (Retrieval-Augmented Generation) it’s a technique that allows a Foundation Model to access data sources outside of its own training data, that means you can connect the model with an external Knowledge Base, so it’s a way to customize models withouth fine-tuning.

Why it’s important?

  • RAG solves the problem of cutoff data, meaning the models doesn’t have the most current data.
  • RAG adds domain-specific knowledge
  • Allows the model to give more accurate and grounded responses
  • It’s a more cost-effective way to customize a model.

Key Aspects

  • External data could be documents, databases, wikis, etc.
  • The external data is pre-indexed and controlled (you decide what information the model can access)
  • You external data is converted to vectors (embeddings), and stored in databases.
  • The model retrieves relevant information, then generate responses.

Example

  • Customer service chat for a company
    • Knowledge base: Company’s products manuals, FAQ documents, policy guides stored in Amazon S3.
    • Process: Customers asks, the system searches on the company documents and generates answers.