Amazon Bedrock Knowledge Base pricing is based on the following components:
1. Storage and Ingestion
- Storage: You are charged for the storage of vector embeddings in the underlying vector database (e.g., Amazon OpenSearch Serverless, Pinecone, or Redis Enterprise Cloud).
- Ingestion: When you ingest (sync) documents into the knowledge base, you are charged for the number of tokens processed to create embeddings.
Note: The actual storage and ingestion costs depend on the vector database you select. Bedrock itself does not charge a separate fee for the knowledge base feature, but you pay for the underlying vector store and the compute used for embedding generation.
2. Embedding Model Usage
- When you ingest documents, Bedrock uses an embedding model to convert your text into vectors.
- You are charged for the number of input tokens processed by the embedding model during ingestion.
- The price per 1,000 tokens depends on the embedding model you select (e.g., Amazon Titan Embeddings, Cohere Embed, etc.).
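The per-token arithmetic above is simple but worth making concrete. The sketch below uses a hypothetical rate of $0.0001 per 1,000 tokens purely for illustration; check the Bedrock pricing page for the actual rate of your chosen embedding model.

```python
def embedding_ingestion_cost(tokens: int, price_per_1k: float) -> float:
    """Cost (USD) of embedding `tokens` input tokens at `price_per_1k` USD per 1,000 tokens."""
    return tokens / 1000 * price_per_1k

# Hypothetical rate, NOT an actual AWS price:
cost = embedding_ingestion_cost(10_000, 0.0001)
print(f"${cost:.4f}")
```

Ingesting 10,000 tokens at that illustrative rate would cost about a tenth of a cent; the same formula applies when a query is embedded at retrieval time.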
3. Vector Database (Vector Store) Costs
- You pay for the storage and retrieval operations in the vector database you choose (e.g., OpenSearch, Pinecone, Redis).
- These costs are billed separately by the respective service (e.g., OpenSearch Service pricing, Pinecone pricing, etc.).
4. Retrieval (Query) Costs
- When you query the knowledge base (e.g., via an agent), you are charged for:
- The number of tokens processed by the embedding model (if your query is embedded).
- The inference cost of the foundation model used to generate the answer (standard Bedrock model inference pricing).
Example Cost Flow
- Ingest 10,000 tokens into the knowledge base:
- Charged for 10,000 tokens by the embedding model.
- Charged for storage in the vector database.
- Query the knowledge base:
- Charged for embedding the query (if applicable).
- Charged for retrieval from the vector database.
- Charged for the foundation model inference to generate the answer.
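The cost flow above can be sketched as a single estimator. All rates below are hypothetical placeholders, not AWS prices, and per-query retrieval fees (which vary by vector store provider) are intentionally left out; the point is only to show how the components add up.

```python
def knowledge_base_cost(
    ingest_tokens: int,
    query_tokens: int,
    answer_tokens: int,
    embed_price_per_1k: float,           # embedding model rate, USD per 1,000 tokens
    fm_input_price_per_1k: float,        # foundation model input rate
    fm_output_price_per_1k: float,       # foundation model output rate
    storage_gb: float,
    storage_price_per_gb_month: float,   # vector store rate, billed by the provider
) -> float:
    """Rough monthly cost for one ingest plus one query, summing the components above.

    Omits per-query retrieval fees, which vary by vector store provider.
    """
    ingest_cost = ingest_tokens / 1000 * embed_price_per_1k       # embed the documents
    query_embed_cost = query_tokens / 1000 * embed_price_per_1k   # embed the query
    inference_cost = (
        query_tokens / 1000 * fm_input_price_per_1k               # FM reads query + context
        + answer_tokens / 1000 * fm_output_price_per_1k           # FM generates the answer
    )
    storage_cost = storage_gb * storage_price_per_gb_month        # vector store storage
    return ingest_cost + query_embed_cost + inference_cost + storage_cost
```

Plugging in your own model rates and document volume makes it easy to see that, for small corpora, foundation model inference typically dominates the bill while embedding ingestion is a one-time cost per sync.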
Where to See Pricing
- Embedding and foundation model rates are listed on the Amazon Bedrock pricing page.
- Vector store storage and query costs are listed on each provider's own pricing page (Amazon OpenSearch Service, Pinecone, Redis Enterprise Cloud).
Summary Table
| Component | How Charged |
|---|---|
| Embedding Model | Per 1,000 tokens ingested/queried |
| Vector Database Storage | Per GB/month (varies by provider) |
| Vector Database Retrieval | Per query (varies by provider) |
| Foundation Model Inference | Per 1,000 input/output tokens (standard Bedrock pricing) |