Model Evaluation in Amazon Bedrock

Definition

Amazon Bedrock can evaluate language models both automatically and with human reviewers. It can also evaluate knowledge bases used as Retrieval Augmented Generation (RAG) sources. This covers models and knowledge bases hosted in Bedrock as well as those outside the service.

Key Points

Amazon Bedrock can compute performance metrics for a model, such as semantic robustness.
It can also measure how correctly a knowledge base retrieves information and generates responses.
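
To make the idea of semantic robustness concrete, here is a toy sketch: perturb a prompt in ways that should not change its meaning, then measure how much the model's output drifts. This is an illustration only, not Bedrock's implementation. The perturbations, the choice of word error rate as the distance, and the names `perturb_prompt` and `robustness_score` are all assumptions made for this example.

```python
from typing import Callable, List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between two texts, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Classic dynamic-programming Levenshtein distance, computed over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1] / len(ref)


def perturb_prompt(prompt: str) -> List[str]:
    """Hypothetical meaning-preserving perturbations: case changes and extra whitespace."""
    return [prompt.lower(), prompt.upper(), "  ".join(prompt.split())]


def robustness_score(model: Callable[[str], str], prompt: str) -> float:
    """Mean output drift across perturbed prompts. 0.0 means the output never changed."""
    baseline = model(prompt)
    rates = [word_error_rate(baseline, model(p)) for p in perturb_prompt(prompt)]
    return sum(rates) / len(rates)
```

Here `model` is any callable that maps a prompt to a response, so a stub function can stand in for a real model call while experimenting. A lower score means the output is more stable under perturbation.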
In Amazon Bedrock, you create evaluation jobs to evaluate models and knowledge bases.
Model evaluation jobs support the following types of Amazon Bedrock models:
- Foundation models
- Amazon Bedrock marketplace models
- Customized foundation models
- Imported foundation models
- Prompt routers
- Models for which you have purchased Provisioned Throughput
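
As a rough illustration of creating an automatic model evaluation job, the sketch below builds a request for the boto3 `bedrock` client's `create_evaluation_job` call. The field names reflect my understanding of that API and should be checked against the current boto3 documentation. The helper names, the default task type, dataset, and metric, and every ARN, model ID, and S3 URI are placeholders assumed for this example.

```python
from __future__ import annotations

import json
import uuid
from typing import Any


def build_evaluation_job_request(
    job_name: str,
    role_arn: str,
    model_id: str,
    output_s3_uri: str,
    task_type: str = "Generation",            # assumed default for this sketch
    dataset_name: str = "Builtin.BOLD",       # assumed built-in dataset name
    metric_names: tuple[str, ...] = ("Builtin.Robustness",),
    inference_params: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for create_evaluation_job (automatic evaluation)."""
    return {
        "jobName": job_name,
        "roleArn": role_arn,
        # Idempotency token so a retried request does not create a second job.
        "clientRequestToken": str(uuid.uuid4()),
        "evaluationConfig": {
            "automated": {
                "datasetMetricConfigs": [
                    {
                        "taskType": task_type,
                        "dataset": {"name": dataset_name},
                        "metricNames": list(metric_names),
                    }
                ]
            }
        },
        "inferenceConfig": {
            "models": [
                {
                    "bedrockModel": {
                        "modelIdentifier": model_id,
                        # The API takes inference parameters as a JSON string.
                        "inferenceParams": json.dumps(inference_params or {}),
                    }
                }
            ]
        },
        "outputDataConfig": {"s3Uri": output_s3_uri},
    }


def submit_evaluation_job(request: dict[str, Any], region_name: str = "us-east-1") -> str:
    """Submit the job and return its ARN. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without AWS dependencies

    client = boto3.client("bedrock", region_name=region_name)
    return client.create_evaluation_job(**request)["jobArn"]
```

Keeping the request builder separate from the submit call makes it possible to inspect or unit-test the request without touching AWS. The value passed as `model_id` would be the identifier or ARN of one of the supported model types listed above.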

Related Concepts

Automatic Model Evaluation with Amazon Bedrock
Human-based Model Evaluation with Amazon Bedrock
Benchmark Dataset
Metrics used for Foundation Model Evaluation
Retrieval-Augmented Generation (RAG)

🌿💻 The Packets Garden
