Main Concept
Retrieval-Augmented Generation (RAG) is an architecture that lets a Foundation Model (FM) query an external, private knowledge base at request time, before it generates a response. It is the primary method for customizing a model’s knowledge without modifying its internal weights, which is what fine-tuning does.
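The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the `Passage` type, the word-overlap `retrieve` function, and the prompt wording are all hypothetical, and a real system would use embeddings for retrieval and send the resulting prompt to an FM.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical document name, used later for citation
    text: str


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, knowledge_base, top_k=2):
    """Toy retriever: rank passages by how many words they share with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(p.text)), p) for p in knowledge_base]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_grounded_prompt(question, passages):
    """Augment the question with retrieved context before it reaches the model."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return (
        "Answer using only the context below and cite the source in brackets.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The model's weights never change here; only the prompt does, which is why updating the knowledge base is enough to update the answers.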
Why is it important?
- Hallucination Mitigation: Grounding the response in retrieved facts and corporate documents reduces the risk of fabricated information.
- Constant Updates: Overcomes the limitation of the training cutoff date, allowing access to data generated “today” once it has been indexed.
- Cost-Effectiveness: It is typically much cheaper than fine-tuning because it requires no training compute; the main costs are generating embeddings, vector storage, and retrieval at query time.
- Security and Privacy: Sensitive company data stays in the external knowledge base and is supplied to the model only at query time, so it never becomes part of the model’s weights.
Key Aspects
- Embeddings & Vector Stores: Data is converted into numerical vectors (embeddings) and stored in a vector store on AWS, such as Amazon OpenSearch Service, Amazon Aurora, or Amazon RDS for PostgreSQL.
- Amazon Bedrock Knowledge Bases: This is the AWS managed service that automates the entire RAG workflow, from ingesting documents stored in Amazon S3 to supplying the retrieved context to the model.
- Source Citation: A key advantage is the ability to cite the exact source document from which the information was retrieved, increasing transparency.
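To make the embeddings and vector store idea concrete, here is a small in-memory sketch. Everything in it is an assumption for illustration: the fixed `VOCAB`, the word-count `embed` function, and the `InMemoryVectorStore` class are hypothetical stand-ins for an embedding model and a managed vector store such as the services listed above.

```python
import math

# Hypothetical fixed vocabulary; a real embedding model learns dense vectors instead.
VOCAB = ["policy", "deductible", "claim", "premium", "renewal"]


def embed(text):
    """Toy embedding: one dimension per vocabulary word, valued by its count."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Stores (doc_id, vector) pairs and returns the nearest documents to a query."""

    def __init__(self):
        self._items = []

    def add(self, doc_id, text):
        self._items.append((doc_id, embed(text)))

    def search(self, query, top_k=1):
        q = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(q, item[1]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:top_k]]
```

Returning the document ID with each match is what makes source citation possible: the application always knows which stored document a retrieved vector came from.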
Example
- Scenario: An insurance company needs its chatbot to provide information about policies updated yesterday.
- Implementation: PDFs are uploaded to Amazon S3 and automatically indexed in Amazon OpenSearch Service via Amazon Bedrock Knowledge Bases.
- Result: The relevant paragraph is retrieved from the policy, and the model generates a response citing the source file.
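A query against such a knowledge base might look like the sketch below. It assumes the `retrieve_and_generate` operation of the boto3 `bedrock-agent-runtime` client and an S3-backed data source; the function name `ask_knowledge_base` is hypothetical, and the knowledge base ID and model ARN are placeholders the caller must supply. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def ask_knowledge_base(client, knowledge_base_id, model_arn, question):
    """Send a question to a knowledge base and return (answer, source URIs).

    `client` is assumed to be a boto3 "bedrock-agent-runtime" client,
    for example boto3.client("bedrock-agent-runtime").
    """
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,  # placeholder ID
                "modelArn": model_arn,  # placeholder model ARN
            },
        },
    )
    answer = response["output"]["text"]

    # Collect the distinct S3 URIs of the documents the answer was grounded in.
    sources = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in sources:
                sources.append(uri)
    return answer, sources
```

The returned URIs are what the chatbot would display as its cited source files.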
💡 Exam Tip (AIF-C01)
If a question asks you to choose between RAG and Fine-tuning:
- Choose RAG if the goal is to update knowledge or access private data.
- Choose Fine-tuning if the goal is to change the model’s style, tone, or behavior for a very specific task.
Exam Domain
- Domain 3: Applications of Foundation Models (28%)
- Task Statement 3.1: Describe design considerations for applications that use foundation models (FMs).
- Task Statement 3.4: Describe methods to evaluate FM performance.