    UK Tech Insider
    Machine Learning & Research

    Build scalable containerized RAG-based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

    By Oliver Chambers | May 13, 2025


    Generative artificial intelligence (AI) applications are commonly built using a technique called Retrieval Augmented Generation (RAG), which gives foundation models (FMs) access to additional data they didn't have during training. This data is used to enrich the generative AI prompt to deliver more context-specific and accurate responses without continuously retraining the FM, while also improving transparency and minimizing hallucinations.

    In this post, we demonstrate a solution using Amazon Elastic Kubernetes Service (Amazon EKS) with Amazon Bedrock to build scalable and containerized RAG solutions for your generative AI applications on AWS, while bringing your unstructured user file data to Amazon Bedrock in a straightforward, fast, and secure way.

    Amazon EKS provides a scalable, secure, and cost-efficient environment for building RAG applications with Amazon Bedrock, and also enables efficient deployment and monitoring of AI-driven workloads while using Bedrock's FMs for inference. It enhances performance with optimized compute instances, auto-scales GPU workloads while reducing costs via Amazon EC2 Spot Instances and AWS Fargate, and provides enterprise-grade security through native AWS mechanisms such as Amazon VPC networking and AWS IAM.

    Our solution uses Amazon S3 as the source of unstructured data and populates an Amazon OpenSearch Serverless vector database through Amazon Bedrock Knowledge Bases with the user's existing files and folders and their associated metadata. This enables a RAG scenario with Amazon Bedrock by enriching the generative AI prompt, using the Amazon Bedrock APIs, with your company-specific data retrieved from the OpenSearch Serverless vector database.

    Solution overview

    The solution uses Amazon EKS managed node groups to automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for the EKS Kubernetes cluster. Every managed node in the cluster is provisioned as part of an Amazon EC2 Auto Scaling group that EKS manages for you.

    The EKS cluster consists of a Kubernetes deployment that runs across two Availability Zones for high availability, where each node in the deployment hosts multiple replicas of a Bedrock RAG container image registered in and pulled from Amazon Elastic Container Registry (Amazon ECR). This setup makes sure that resources are used efficiently, scaling up or down based on demand. The Horizontal Pod Autoscaler (HPA) is set up to further scale the number of pods in the deployment based on their CPU utilization.
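    As a sketch of the autoscaling behavior described above, an HPA manifest targeting the deployment's CPU utilization might look like the following. The resource names and thresholds here are illustrative assumptions; the manifests in the repository's kubernetes folder are authoritative.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: bedrockrag-hpa          # hypothetical name for illustration
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: bedrockrag            # hypothetical; the RAG Retrieval Application deployment
  minReplicas: 2                # one replica per Availability Zone at minimum
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```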

    The RAG Retrieval Application container uses the Bedrock Knowledge Bases APIs and Anthropic's Claude 3.5 Sonnet LLM hosted on Bedrock to implement a RAG workflow. The solution provides the end user with a scalable endpoint to access the RAG workflow, using a Kubernetes service that is fronted by an AWS Application Load Balancer (ALB) provisioned through an EKS ingress controller.
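    To illustrate how the ALB is provisioned in front of the service, a minimal Ingress for the AWS Load Balancer Controller could look like the sketch below. The service and ingress names are hypothetical assumptions; the annotations (`alb.ingress.kubernetes.io/scheme`, `alb.ingress.kubernetes.io/target-type`) are the controller's standard ones, but the repository's ingress manifest is the authoritative version.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bedrockrag-ingress      # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb         # handled by the AWS Load Balancer Controller
  rules:
    - http:
        paths:
          - path: /query
            pathType: Prefix
            backend:
              service:
                name: bedrockrag-service   # hypothetical; fronts the RAG pods
                port:
                  number: 80
```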

    The RAG Retrieval Application container, orchestrated by EKS, enables RAG with Amazon Bedrock by enriching the generative AI prompt received from the ALB endpoint with data retrieved from an OpenSearch Serverless index that is synced via Bedrock Knowledge Bases from your company-specific data uploaded to Amazon S3.
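    A minimal sketch of this retrieval step, using the `retrieve_and_generate` API on the boto3 `bedrock-agent-runtime` client, is shown below. The function names and the `kb_id`/`model_arn` parameters are illustrative placeholders, not the repository's actual application code.

```python
def build_rag_request(prompt: str, kb_id: str, model_arn: str) -> dict:
    """Build the RetrieveAndGenerate request that enriches the user prompt
    with chunks retrieved from the knowledge base's vector index."""
    return {
        "input": {"text": prompt},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }


def query_knowledge_base(prompt: str, kb_id: str, model_arn: str) -> str:
    """Call Bedrock Knowledge Bases and return the generated answer text."""
    import boto3  # imported lazily; requires AWS credentials at call time

    # bedrock-agent-runtime hosts the Knowledge Bases runtime APIs
    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_rag_request(prompt, kb_id, model_arn)
    )
    return response["output"]["text"]
```

    In this pattern, Bedrock handles both the vector retrieval from OpenSearch Serverless and the generation step with the configured model, so the container only orchestrates the request and returns the answer.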

    The following architecture diagram illustrates the various components of our solution:

    Prerequisites

    Complete the following prerequisites:

    1. Ensure model access in Amazon Bedrock. In this solution, we use Anthropic's Claude 3.5 Sonnet on Amazon Bedrock.
    2. Install the AWS Command Line Interface (AWS CLI).
    3. Install Docker.
    4. Install kubectl.
    5. Install Terraform.

    Deploy the solution

    The solution is available for download on the GitHub repo. Cloning the repository and using the Terraform template will provision the components with their required configurations:

    1. Clone the Git repository:
      sudo yum install -y unzip
      git clone https://github.com/aws-samples/genai-bedrock-serverless.git
      cd eksbedrock/terraform

    2. From the terraform folder, deploy the solution using Terraform:
      terraform init
      terraform apply -auto-approve

    Configure EKS

    1. Configure a secret for the ECR registry. Authenticate with ECR, pull the container image, and update your kubeconfig, replacing <account_id> and <region> with your own values:
      aws ecr get-login-password --region <region> | docker login \
        --username AWS \
        --password-stdin <account_id>.dkr.ecr.<region>.amazonaws.com/bedrockragrepo
      docker pull <account_id>.dkr.ecr.<region>.amazonaws.com/bedrockragrepo:latest
      aws eks update-kubeconfig --region <region> --name eksbedrock
      kubectl create secret docker-registry ecr-secret \
        --docker-server=<account_id>.dkr.ecr.<region>.amazonaws.com \
        --docker-username=AWS \
        --docker-password=$(aws ecr get-login-password --region <region>)

    2. Navigate to the kubernetes/ingress folder:
      • Make sure the AWS_Region variable in the bedrockragconfigmap.yaml file points to your AWS Region.
      • Replace the image URI on line 20 of the bedrockragdeployment.yaml file with the image URI of your bedrockrag image from your ECR repository.
    3. Provision the EKS deployment, service, and ingress:
      cd ..
      kubectl apply -f ingress/

    Create a knowledge base and upload data

    To create a knowledge base and upload data, follow these steps:

    1. Create an S3 bucket and upload your data into the bucket. In our blog post, we uploaded two files, the Amazon Bedrock User Guide and the Amazon FSx for ONTAP User Guide, into our S3 bucket.
    2. Create an Amazon Bedrock knowledge base. Follow the steps here to create a knowledge base. Accept all the defaults, including the Quick create a new vector store option in Step 7 of the instructions, which creates an Amazon OpenSearch Serverless vector search collection as your knowledge base.
      1. In Step 5c of the instructions to create a knowledge base, provide the S3 URI of the object containing the files for the data source of the knowledge base.
      2. Once the knowledge base is provisioned, obtain the Knowledge Base ID from the Bedrock Knowledge Bases console for your newly created knowledge base.

    Query using the Application Load Balancer

    You can query the model directly using the API front end provided by the AWS ALB provisioned by the Kubernetes (EKS) ingress controller. Navigate to the AWS ALB console and obtain the DNS name for your ALB to use as your API endpoint:

    curl -X POST "<alb-dns-name>/query" \
      -H "Content-Type: application/json" \
      -d '{"prompt": "What is a bedrock knowledgebase?", "kbId": "<knowledge-base-id>"}'

    Cleanup

    To avoid recurring charges, clean up your account after trying the solution:

    1. From the terraform folder, destroy the Terraform resources for the solution:
      terraform apply -destroy
    2. Delete the Amazon Bedrock knowledge base. From the Amazon Bedrock console, select the knowledge base you created in this solution, choose Delete, and follow the steps to delete the knowledge base.

    Conclusion

    In this post, we demonstrated a solution that uses Amazon EKS with Amazon Bedrock and provides you with a framework to build your own containerized, automated, scalable, and highly available RAG-based generative AI applications on AWS. Using Amazon S3 and Amazon Bedrock Knowledge Bases, our solution automates bringing your unstructured user file data to Amazon Bedrock within the containerized framework. You can use the approach demonstrated in this solution to automate and containerize your AI-driven workloads while using Amazon Bedrock FMs for inference, with built-in efficient deployment, scalability, and availability from a Kubernetes-based containerized deployment.

    For more information about how to get started building with Amazon Bedrock and EKS for RAG scenarios, refer to the following resources:


    About the Authors

    Kanishk Mahajan is Principal, Solutions Architecture at AWS. He leads cloud transformation and solution architecture for AWS customers and partners. Kanishk specializes in containers, cloud operations, migrations and modernizations, AI/ML, resilience, and security and compliance. He is a Technical Field Community (TFC) member in each of those domains at AWS.

    Sandeep Batchu is a Senior Security Architect at Amazon Web Services, with extensive experience in software engineering, solutions architecture, and cybersecurity. Passionate about bridging business outcomes with technological innovation, Sandeep guides customers through their cloud journey, helping them design and implement secure, scalable, flexible, and resilient cloud architectures.
