    UK Tech Insider
    Machine Learning & Research

    Build scalable containerized RAG-based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

    By Oliver Chambers | May 13, 2025


    Generative artificial intelligence (AI) applications are commonly built using a technique called Retrieval Augmented Generation (RAG), which gives foundation models (FMs) access to additional data they didn't have during training. This data is used to enrich the generative AI prompt to deliver more context-specific and accurate responses without continuously retraining the FM, while also improving transparency and minimizing hallucinations.

    In this post, we demonstrate a solution using Amazon Elastic Kubernetes Service (Amazon EKS) with Amazon Bedrock to build scalable and containerized RAG solutions for your generative AI applications on AWS, while bringing your unstructured user file data to Amazon Bedrock in a straightforward, fast, and secure way.

    Amazon EKS provides a scalable, secure, and cost-efficient environment for building RAG applications with Amazon Bedrock, and also enables efficient deployment and monitoring of AI-driven workloads while using Bedrock's FMs for inference. It enhances performance with optimized compute instances, auto-scales GPU workloads while reducing costs via Amazon EC2 Spot Instances and AWS Fargate, and provides enterprise-grade security through native AWS mechanisms such as Amazon VPC networking and AWS IAM.

    Our solution uses Amazon S3 as the source of unstructured data and populates an Amazon OpenSearch Serverless vector database through Amazon Bedrock Knowledge Bases with the user's existing files and folders and their associated metadata. This enables a RAG scenario with Amazon Bedrock by enriching the generative AI prompt, using the Amazon Bedrock APIs, with your company-specific data retrieved from the OpenSearch Serverless vector database.

    Solution overview

    The solution uses Amazon EKS managed node groups to automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for the EKS Kubernetes cluster. Every managed node in the cluster is provisioned as part of an Amazon EC2 Auto Scaling group that EKS manages for you.

    The EKS cluster consists of a Kubernetes deployment that runs across two Availability Zones for high availability, where each node in the deployment hosts multiple replicas of a Bedrock RAG container image registered in and pulled from Amazon Elastic Container Registry (Amazon ECR). This setup makes sure that resources are used efficiently, scaling up or down based on demand. The Horizontal Pod Autoscaler (HPA) is set up to further scale the number of pods in the deployment based on their CPU utilization.
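    As a sketch of the autoscaling behavior described above, an HPA manifest targeting the deployment's CPU utilization might look like the following. The resource names and thresholds here are illustrative assumptions; the manifests in the repository's kubernetes folder are authoritative.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: bedrockrag-hpa          # hypothetical name for illustration
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: bedrockrag            # hypothetical; the RAG Retrieval Application deployment
  minReplicas: 2                # one replica per Availability Zone at minimum
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```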

    The RAG Retrieval Application container uses the Bedrock Knowledge Bases APIs and Anthropic's Claude 3.5 Sonnet LLM hosted on Bedrock to implement a RAG workflow. The solution provides the end user with a scalable endpoint to access the RAG workflow, using a Kubernetes service that is fronted by an AWS Application Load Balancer (ALB) provisioned through an EKS ingress controller.
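    To illustrate how the ALB is provisioned in front of the service, a minimal Ingress for the AWS Load Balancer Controller could look like the sketch below. The service and ingress names are hypothetical assumptions; the annotations (`alb.ingress.kubernetes.io/scheme`, `alb.ingress.kubernetes.io/target-type`) are the controller's standard ones, but the repository's ingress manifest is the authoritative version.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bedrockrag-ingress      # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb         # handled by the AWS Load Balancer Controller
  rules:
    - http:
        paths:
          - path: /query
            pathType: Prefix
            backend:
              service:
                name: bedrockrag-service   # hypothetical; fronts the RAG pods
                port:
                  number: 80
```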

    The RAG Retrieval Application container, orchestrated by EKS, enables RAG with Amazon Bedrock by enriching the generative AI prompt received from the ALB endpoint with data retrieved from an OpenSearch Serverless index that is synced via Bedrock Knowledge Bases from your company-specific data uploaded to Amazon S3.
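    A minimal sketch of this retrieval step, using the `retrieve_and_generate` API on the boto3 `bedrock-agent-runtime` client, is shown below. The function names and the `kb_id`/`model_arn` parameters are illustrative placeholders, not the repository's actual application code.

```python
def build_rag_request(prompt: str, kb_id: str, model_arn: str) -> dict:
    """Build the RetrieveAndGenerate request that enriches the user prompt
    with chunks retrieved from the knowledge base's vector index."""
    return {
        "input": {"text": prompt},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }


def query_knowledge_base(prompt: str, kb_id: str, model_arn: str) -> str:
    """Call Bedrock Knowledge Bases and return the generated answer text."""
    import boto3  # imported lazily; requires AWS credentials at call time

    # bedrock-agent-runtime hosts the Knowledge Bases runtime APIs
    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_rag_request(prompt, kb_id, model_arn)
    )
    return response["output"]["text"]
```

    In this pattern, Bedrock handles both the vector retrieval from OpenSearch Serverless and the generation step with the configured model, so the container only orchestrates the request and returns the answer.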

    The following architecture diagram illustrates the various components of our solution:

    Prerequisites

    Complete the following prerequisites:

    1. Ensure model access in Amazon Bedrock. In this solution, we use Anthropic's Claude 3.5 Sonnet on Amazon Bedrock.
    2. Install the AWS Command Line Interface (AWS CLI).
    3. Install Docker.
    4. Install kubectl.
    5. Install Terraform.

    Deploy the solution

    The solution is available for download on the GitHub repo. Cloning the repository and using the Terraform template will provision the components with their required configurations:

    1. Clone the Git repository:
      sudo yum install -y unzip
      git clone https://github.com/aws-samples/genai-bedrock-serverless.git
      cd eksbedrock/terraform

    2. From the terraform folder, deploy the solution using Terraform:
      terraform init
      terraform apply -auto-approve

    Configure EKS

    1. Configure a secret for the ECR registry. Authenticate with ECR, pull the container image, and update your kubeconfig, replacing <account_id> and <region> with your own values:
      aws ecr get-login-password --region <region> | docker login \
        --username AWS \
        --password-stdin <account_id>.dkr.ecr.<region>.amazonaws.com/bedrockragrepo
      docker pull <account_id>.dkr.ecr.<region>.amazonaws.com/bedrockragrepo:latest
      aws eks update-kubeconfig --region <region> --name eksbedrock
      kubectl create secret docker-registry ecr-secret \
        --docker-server=<account_id>.dkr.ecr.<region>.amazonaws.com \
        --docker-username=AWS \
        --docker-password=$(aws ecr get-login-password --region <region>)

    2. Navigate to the kubernetes/ingress folder:
      • Make sure the AWS_Region variable in the bedrockragconfigmap.yaml file points to your AWS Region.
      • Replace the image URI on line 20 of the bedrockragdeployment.yaml file with the image URI of your bedrockrag image from your ECR repository.
    3. Provision the EKS deployment, service, and ingress:
      cd ..
      kubectl apply -f ingress/

    Create a knowledge base and upload data

    To create a knowledge base and upload data, follow these steps:

    1. Create an S3 bucket and upload your data into the bucket. In our blog post, we uploaded two files, the Amazon Bedrock User Guide and the Amazon FSx for ONTAP User Guide, into our S3 bucket.
    2. Create an Amazon Bedrock knowledge base. Follow the steps here to create a knowledge base. Accept all the defaults, including the Quick create a new vector store option in Step 7 of the instructions, which creates an Amazon OpenSearch Serverless vector search collection as your knowledge base.
      1. In Step 5c of the instructions to create a knowledge base, provide the S3 URI of the object containing the files for the data source of the knowledge base.
      2. Once the knowledge base is provisioned, obtain the Knowledge Base ID from the Bedrock Knowledge Bases console for your newly created knowledge base.

    Query using the Application Load Balancer

    You can query the model directly using the API front end provided by the AWS ALB provisioned by the Kubernetes (EKS) ingress controller. Navigate to the AWS ALB console and obtain the DNS name for your ALB to use as your API endpoint:

    curl -X POST "<alb-dns-name>/query" \
      -H "Content-Type: application/json" \
      -d '{"prompt": "What is a bedrock knowledgebase?", "kbId": "<knowledge-base-id>"}'

    Cleanup

    To avoid recurring charges, clean up your account after trying the solution:

    1. From the terraform folder, destroy the Terraform resources for the solution:
      terraform apply -destroy
    2. Delete the Amazon Bedrock knowledge base. From the Amazon Bedrock console, select the knowledge base you created in this solution, choose Delete, and follow the steps to delete the knowledge base.

    Conclusion

    In this post, we demonstrated a solution that uses Amazon EKS with Amazon Bedrock and provides you with a framework to build your own containerized, automated, scalable, and highly available RAG-based generative AI applications on AWS. Using Amazon S3 and Amazon Bedrock Knowledge Bases, our solution automates bringing your unstructured user file data to Amazon Bedrock within the containerized framework. You can use the approach demonstrated in this solution to automate and containerize your AI-driven workloads while using Amazon Bedrock FMs for inference, with built-in efficient deployment, scalability, and availability from a Kubernetes-based containerized deployment.

    For more information about how to get started building with Amazon Bedrock and EKS for RAG scenarios, refer to the following resources:


    About the Authors

    Kanishk Mahajan is Principal, Solutions Architecture at AWS. He leads cloud transformation and solution architecture for AWS customers and partners. Kanishk specializes in containers, cloud operations, migrations and modernizations, AI/ML, resilience, and security and compliance. He is a Technical Field Community (TFC) member in each of those domains at AWS.

    Sandeep Batchu is a Senior Security Architect at Amazon Web Services, with extensive experience in software engineering, solutions architecture, and cybersecurity. Passionate about bridging business outcomes with technological innovation, Sandeep guides customers through their cloud journey, helping them design and implement secure, scalable, flexible, and resilient cloud architectures.
