Claude Code is an AI-powered coding assistant from Anthropic that helps developers write, review, and modify code through natural language interactions. Amazon Bedrock is a fully managed service that provides access to foundation models from leading AI companies through a single API. This post shows you how to deploy Claude Code with Amazon Bedrock. You'll learn authentication methods, infrastructure choices, and monitoring strategies to deploy securely at enterprise scale.
Recommendations for most enterprises
We recommend the Guidance for Claude Code with Amazon Bedrock, which implements proven patterns that can be deployed in hours.
Deploy Claude Code with this proven stack:
This architecture provides secure access with user attribution, capacity management, and visibility into costs and developer productivity.
Authentication methods
Claude Code deployments begin with authenticating to Amazon Bedrock. The authentication choice affects downstream security, monitoring, operations, and developer experience.
Authentication methods comparison
| Feature | API keys | AWS login | SSO with IAM Identity Center | Direct IdP integration |
| --- | --- | --- | --- | --- |
| Session duration | Indefinite | Configurable (up to 12 hours) | Configurable (up to 12 hours) | Configurable (up to 12 hours) |
| Setup time | Minutes | Minutes | Hours | Hours |
| Security risk | High | Low | Low | Low |
| User attribution | None | Basic | Basic | Full |
| MFA support | No | Yes | Yes | Yes |
| OpenTelemetry integration | None | Limited | Limited | Full |
| Cost allocation | None | Limited | Limited | Full |
| Operational overhead | High | Medium | Medium | Low |
| Use case | Short-term testing | Testing and limited deployments | Rapid SSO deployment | Production deployment |
The following sections discuss the trade-offs and implementation considerations laid out in the table above.
API keys
Amazon Bedrock supports API keys as the fastest path to a proof of concept. Both short-term (12-hour) and long-term (indefinite) keys can be generated through the AWS Management Console, AWS CLI, or SDKs.
However, API keys create security vulnerabilities through persistent access without MFA, manual distribution requirements, and the risk of repository commits. They provide no user attribution for cost allocation or monitoring. Use them only for short-term testing (less than 1 week, with 12-hour expiration).
AWS login
The aws login command uses your AWS Management Console credentials for Amazon Bedrock access through a browser-based authentication flow. It supports quick setup without API keys and is recommended for testing and small deployments.
Single sign-on (SSO)
AWS IAM Identity Center integrates with existing enterprise identity providers through OpenID Connect (OIDC), an authentication protocol that enables single sign-on by allowing identity providers to verify user identities and share authentication information with applications. This integration lets developers use corporate credentials to access Amazon Bedrock without distributing API keys.
Developers authenticate with AWS IAM Identity Center using the aws sso login command, which generates temporary credentials with configurable session durations. These credentials refresh automatically, reducing the operational overhead of credential management while improving security through temporary, time-limited access.
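As a minimal sketch, the daily workflow might look like the following. The profile name is illustrative; `aws configure sso`, `aws sso login`, and the `CLAUDE_CODE_USE_BEDROCK` setting are documented, but verify the exact variables against your Claude Code version:

```shell
# One-time setup: register an IAM Identity Center profile (profile name is an example)
aws configure sso --profile claude-code

# Daily: refresh temporary credentials through the browser-based SSO flow
aws sso login --profile claude-code

# Point Claude Code at Amazon Bedrock using this profile
export AWS_PROFILE=claude-code
export CLAUDE_CODE_USE_BEDROCK=1
claude
```

Because the credentials are temporary, an expired session simply prompts the developer to rerun `aws sso login` rather than requiring any key rotation.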
Organizations already using IAM Identity Center for AWS access can extend this pattern to Claude Code. However, it limits detailed user-level monitoring because it does not expose OIDC JWT tokens for OpenTelemetry attribute extraction.
This authentication method suits organizations that prioritize rapid SSO deployment over detailed monitoring, or initial rollouts where comprehensive metrics aren't yet required.
Direct IdP integration
Direct OIDC federation with your identity provider (Okta, Azure AD, Auth0, or Amazon Cognito user pools) is recommended for production Claude Code deployments. This approach connects your enterprise identity provider directly to AWS IAM to generate temporary credentials with full user context for monitoring.
The credential process provider orchestrates OAuth 2.0 authentication with PKCE, a security extension that helps prevent authorization code interception. Developers authenticate in their browser, exchanging OIDC tokens for temporary AWS credentials.
A helper script uses AWS Security Token Service (STS) AssumeRoleWithWebIdentity to assume a role with permissions for InvokeModel and InvokeModelWithResponseStream to use Amazon Bedrock. Direct IAM federation supports session durations of up to 12 hours, and the JWT token remains accessible throughout the session, enabling monitoring through OpenTelemetry to track user attributes such as email, department, and team.
The Guidance for Claude Code with Amazon Bedrock implements both Cognito identity pool and direct IAM federation patterns, but recommends direct IAM for simplicity. The solution provides an interactive setup wizard that configures your OIDC provider integration, deploys the required IAM infrastructure, and builds distribution packages for Windows, macOS, and Linux.
Developers receive installation packages that configure their AWS CLI profile to use the credential process. Authentication occurs through corporate credentials, with the browser opening automatically to refresh credentials. The credential process handles token caching, credential refresh, and error recovery.
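The `credential_process` mechanism is a standard AWS CLI feature: the CLI invokes an external program whenever it needs credentials. A sketch of what the distributed profile might look like follows; the helper path and profile name are illustrative, not the actual artifacts the guidance solution installs:

```ini
# ~/.aws/config (sketch; helper binary name and location are illustrative)
[profile claude-code]
credential_process = /usr/local/bin/claude-code-credential-helper
region = us-east-1
```

When Claude Code (or any AWS SDK client) uses this profile, the helper runs, opens the browser for OIDC authentication if needed, and returns temporary STS credentials on stdout in the JSON format the CLI expects.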
For organizations requiring detailed usage monitoring, cost attribution by developer, and comprehensive audit trails, direct IdP integration through IAM federation provides the foundation for the advanced monitoring capabilities discussed later in this post.
Organizational decisions
Beyond authentication, architectural decisions shape how Claude Code integrates with your AWS infrastructure. These choices affect operational complexity, cost management, and enforcement of usage policies.
Public endpoints
Amazon Bedrock provides managed, public API endpoints in multiple AWS Regions with minimal operational overhead. AWS manages infrastructure, scaling, availability, and security patching. Developers use standard AWS credentials through AWS CLI profiles or environment variables. Combined with OpenTelemetry metrics from direct IdP integration, you can track usage through public endpoints by individual developer, department, or cost center, and controls can be enforced at the AWS IAM level. For example, implementing per-developer rate limiting requires infrastructure that observes CloudWatch metrics or CloudTrail logs and takes automated action. Organizations requiring immediate, request-level blocking based on custom business logic may need additional components such as an LLM (large language model) gateway pattern. Public Amazon Bedrock endpoints are sufficient for most organizations because they provide a balance of simplicity, AWS-managed reliability, cost alerting, and appropriate control mechanisms.
LLM gateway
An LLM gateway introduces an intermediary application layer between developers and Amazon Bedrock, routing requests through custom infrastructure. The Guidance for Multi-Provider Generative AI Gateway on AWS describes this pattern, deploying a containerized proxy service with load balancing and centralized credential management.
This architecture is best for:
- Multi-provider support: Routing between Amazon Bedrock, OpenAI, and Azure OpenAI based on availability, cost, or capability
- Custom middleware: Proprietary prompt engineering, content filtering, or prompt injection detection at the request level
- Request-level policy enforcement: Immediate blocking of requests exceeding custom business logic beyond IAM capabilities
Gateways provide unified APIs and real-time monitoring but add operational overhead: Amazon Elastic Container Service (Amazon ECS)/Amazon Elastic Kubernetes Service (Amazon EKS) infrastructure, Elastic Load Balancing (ELB) Application Load Balancers, Amazon ElastiCache, Amazon Relational Database Service (Amazon RDS) management, increased latency, and a new failure mode where gateway issues block Claude Code usage. LLM gateways excel for applications making programmatic calls to LLMs, providing centralized monitoring, per-user visibility, and unified control across providers.
For traditional API access scenarios, organizations can deploy gateways to gain monitoring and attribution capabilities. The Claude Code guidance solution already includes monitoring and attribution capabilities through direct IdP authentication, OpenTelemetry metrics, IAM policies, and CloudWatch dashboards. Adding an LLM gateway to the guidance solution duplicates existing functionality. Consider gateways only for multi-provider support, custom middleware, or request-level policy enforcement beyond IAM.
Single-account implementation
We recommend consolidating coding assistant inference in a single dedicated account, separate from your development and production workloads. This approach provides five key benefits:
- Simplified operations: Manage quotas and monitor usage through unified dashboards instead of tracking across multiple accounts. Request quota increases once rather than per account.
- Clear cost visibility: AWS Cost Explorer and Cost and Usage Reports show Claude Code charges directly without complex tagging. OpenTelemetry metrics enable department- and team-level allocation.
- Centralized security: CloudTrail logs flow to one location for monitoring and compliance. Deploy the monitoring stack once to collect metrics from developers.
- Production protection: Account-level isolation helps prevent Claude Code usage from exhausting quotas and throttling production applications. Production traffic spikes don't affect developer productivity.
- Implementation: Cross-account IAM configuration lets developers authenticate through identity providers that federate to restricted roles, granting only model invocation permissions with appropriate guardrails.
This strategy integrates with direct IdP authentication and OpenTelemetry monitoring. Identity providers handle authentication, the dedicated account handles inference, and development accounts focus on applications.
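As a minimal sketch of the "restricted role" idea, the federated role in the dedicated account might carry a permissions policy like the following. The account ID and resource scoping are illustrative; your actual policy would target the specific model and inference profile ARNs you allow:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowModelInvocationOnly",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.*",
        "arn:aws:bedrock:*:111111111111:inference-profile/*"
      ]
    }
  ]
}
```

Scoping to invocation actions only means a compromised developer credential cannot alter logging configuration, create provisioned throughput, or touch anything outside inference.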
Inference profiles
Amazon Bedrock inference profiles provide cost tracking through resource tagging, but don't scale to per-developer granularity. While you can create application profiles for cost allocation, managing profiles for 1,000+ individual developers becomes operationally burdensome. Inference profiles work best for organizations with 10-50 distinct teams requiring isolated cost tracking, or when using cross-Region inference where managed routing distributes requests across AWS Regions. They're ideal for scenarios requiring basic cost allocation rather than comprehensive monitoring.
System-defined cross-Region inference profiles automatically route requests across multiple AWS Regions, distributing load for higher throughput and availability. When you invoke a cross-Region profile (for example, us.anthropic.claude-sonnet-4), Amazon Bedrock selects an available Region to process your request.
Application inference profiles are profiles you create explicitly in your account, usually wrapped around a system-defined profile or a specific model in a Region. You can tag application profiles with custom key-value pairs like team:data-science or project:fraud-detection that flow to AWS Cost and Usage Reports for cost allocation analysis. To create an application profile:
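A hedged sketch using the `create-inference-profile` CLI command follows. The profile name, source ARN, and tag values are illustrative, and the shorthand syntax for `--model-source` and `--tags` should be verified against your AWS CLI version:

```shell
# Create an application inference profile wrapping a cross-Region profile
# (name, account ID, source ARN, and tags are illustrative)
aws bedrock create-inference-profile \
  --inference-profile-name data-science-claude \
  --model-source copyFrom="arn:aws:bedrock:us-east-1:111111111111:inference-profile/us.anthropic.claude-sonnet-4" \
  --tags key=team,value=data-science key=project,value=fraud-detection
```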
Tags appear in AWS Cost and Usage Reports, so you can answer questions like:
"What did the data-science team spend on Amazon Bedrock last month?"
Each profile must be referenced explicitly in API calls, meaning developers' credential configurations must specify their unique profile rather than a shared endpoint.
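In practice, each developer's environment might point Claude Code at the team's profile ARN as the model identifier. The ARN below is illustrative; application inference profile ARNs contain a generated identifier rather than the profile name, so copy the ARN returned by the create call:

```shell
# Per-developer environment configuration (ARN is illustrative)
export CLAUDE_CODE_USE_BEDROCK=1
export ANTHROPIC_MODEL="arn:aws:bedrock:us-east-1:111111111111:application-inference-profile/abc123example"
```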
For more on inference profiles, see the Amazon Bedrock inference profiles documentation.
Monitoring
An effective monitoring strategy transforms Claude Code from a productivity tool into a measurable investment by tracking usage, costs, and impact.
Progressive enhancement path
Monitoring layers are complementary. Organizations typically start with basic visibility and add capabilities as ROI requirements justify additional infrastructure.
Let's explore each stage and when it makes sense for your deployment.
Note: Infrastructure costs grow progressively; each stage retains the previous layers while adding new components.
CloudWatch
Amazon Bedrock publishes metrics to Amazon CloudWatch automatically, tracking invocation counts, throttling errors, and latency. CloudWatch graphs show aggregate trends such as total requests, average latency, and quota utilization. This baseline monitoring is included in standard CloudWatch pricing and requires minimal deployment effort. You can create CloudWatch alarms that notify you when invocation rates spike, error rates exceed thresholds, or latency degrades.
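For example, a throttling alarm might look like the following sketch. The alarm name, threshold, and SNS topic are illustrative; `InvocationThrottles` is one of the metrics Amazon Bedrock publishes in the `AWS/Bedrock` namespace, but confirm metric names and dimensions for your use case:

```shell
# Notify when Bedrock throttling exceeds 50 events in 5 minutes
# (alarm name, threshold, and topic ARN are illustrative)
aws cloudwatch put-metric-alarm \
  --alarm-name bedrock-throttling-spike \
  --namespace AWS/Bedrock \
  --metric-name InvocationThrottles \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 50 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:111111111111:bedrock-alerts
```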
Invocation logging
Amazon Bedrock invocation logging captures detailed information about each API call to Amazon S3 or CloudWatch Logs, preserving individual request records including invocation metadata and full request/response data. Process the logs with Amazon Athena, load them into data warehouses, or analyze them with custom tools. The logs reveal usage patterns, invocations by model, peak usage, and an audit trail of Amazon Bedrock access.
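Invocation logging is enabled once per account and Region. A sketch using the `put-model-invocation-logging-configuration` command follows; the log group name and IAM role are illustrative, and the role must grant Bedrock permission to write to the destination:

```shell
# Enable invocation logging to CloudWatch Logs
# (log group name, account ID, and role are illustrative)
aws bedrock put-model-invocation-logging-configuration \
  --logging-config '{
    "cloudWatchConfig": {
      "logGroupName": "/bedrock/invocation-logs",
      "roleArn": "arn:aws:iam::111111111111:role/BedrockLoggingRole"
    },
    "textDataDeliveryEnabled": true
  }'
```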
OpenTelemetry
Claude Code includes support for OpenTelemetry, an open source observability framework for collecting application telemetry data. When configured with an OpenTelemetry collector endpoint, Claude Code emits detailed metrics about its operations for both Amazon Bedrock API calls and higher-level development activities.
The telemetry captures detailed code-level metrics not included in Amazon Bedrock's default logging, such as lines of code added/deleted, files modified, programming languages used, and developers' acceptance rates of Claude's suggestions. It also tracks key operations including file edits, code searches, documentation requests, and refactoring tasks.
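Telemetry is configured through environment variables (often pushed via managed settings). The collector endpoint below is a placeholder, and the `OTEL_RESOURCE_ATTRIBUTES` keys are illustrative examples of how team metadata can be attached:

```shell
# Enable Claude Code telemetry and export metrics over OTLP
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otel-collector.example.com
# Optional resource attributes for team/department attribution (keys are illustrative)
export OTEL_RESOURCE_ATTRIBUTES="department=payments,team=checkout"
```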
The guidance solution deploys OpenTelemetry infrastructure on Amazon ECS Fargate. An Application Load Balancer receives telemetry over HTTP(S) and forwards metrics to an OpenTelemetry Collector. The collector exports data to Amazon CloudWatch and Amazon S3.
Dashboard
The guidance solution includes a CloudWatch dashboard that displays key metrics continuously, tracking active users by hour, day, or week to reveal adoption and usage trends and enable per-user cost calculation. Token consumption breaks down by input, output, and cached tokens, with high cache hit rates indicating efficient context reuse and per-user views identifying heavy users. Code activity metrics track lines added and deleted, correlating with token usage to show efficiency and usage patterns.
The operations breakdown shows the distribution of file edits, code searches, and documentation requests, while user leaderboards display top consumers by tokens, lines of code, or session duration.
The dashboard updates in near real time and integrates with CloudWatch alarms to trigger notifications when metrics exceed thresholds. The guidance solution deploys via CloudFormation with custom Lambda functions for complex aggregations.
Analytics
While dashboards excel at real-time monitoring, long-term trends and complex user behavior analysis require analytical tools. The guidance solution's optional analytics stack streams metrics to Amazon S3 using Amazon Data Firehose. The AWS Glue Data Catalog defines the schema, making the data queryable through Amazon Athena.
The analytics layer supports queries such as monthly token consumption by department, code acceptance rates by programming language, and token efficiency variations across teams. Cost analysis becomes sophisticated by joining token metrics with Amazon Bedrock pricing to calculate exact costs by user, then aggregating for department-level chargeback. Time-series analysis shows how costs scale with team growth for budget forecasting. The SQL interface integrates with business intelligence tools, enabling exports to spreadsheets, machine learning models, or project management systems.
For example, to see the monthly cost analysis by department:
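A hedged sketch of such a query, submitted through Athena, follows. The database, table, and column names depend entirely on the Glue schema your deployment defines, so treat them as placeholders:

```shell
# Monthly cost by department via Athena
# (database, table, and column names are illustrative placeholders)
aws athena start-query-execution \
  --work-group primary \
  --query-execution-context Database=claude_code_metrics \
  --query-string "
    SELECT department,
           date_trunc('month', event_time) AS month,
           SUM(input_tokens + output_tokens) AS total_tokens,
           SUM(estimated_cost_usd) AS total_cost
    FROM usage_events
    GROUP BY 1, 2
    ORDER BY month DESC, total_cost DESC;"
```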
The infrastructure adds moderate cost: Data Firehose charges for ingestion, S3 for retention, and Athena charges per query based on data scanned.
Enable analytics when you need historical analysis, complex queries, or integration with business intelligence tools. While the dashboard alone may suffice for small deployments or organizations focused primarily on real-time monitoring, enterprises making significant investments in Claude Code should implement the analytics layer. This provides the visibility needed to prove return on investment and optimize usage over time.
Quotas
Quotas let organizations control and manage token consumption by setting usage limits for individual developers or teams. Before implementing quotas, we recommend first enabling monitoring to understand natural usage patterns. Usage data typically shows that high token consumption correlates with high productivity, indicating that heavy users deliver proportional value.
The quota system stores limits in DynamoDB with entries like:
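A sketch of such an item follows; the attribute names and values are illustrative, not the exact schema the guidance solution uses:

```json
{
  "userId": "jane.doe@example.com",
  "monthlyTokenLimit": 50000000,
  "tokensUsedThisMonth": 12450000,
  "warningThresholdPercent": 80,
  "lastUpdated": "2025-01-15T10:30:00Z"
}
```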
A Lambda function triggered by CloudWatch Events aggregates token consumption every 15 minutes, updating DynamoDB and publishing to SNS when thresholds are crossed.
Monitoring comparison
The following table summarizes the trade-offs across monitoring approaches:
| Capability | CloudWatch | Invocation logging | OpenTelemetry | Dashboard and analytics |
| --- | --- | --- | --- | --- |
| Setup complexity | None | Low | Medium | Medium |
| User attribution | None | IAM identity | Full | Full |
| Real-time metrics | Yes | No | Yes | Yes |
| Code-level metrics | No | No | Yes | Yes |
| Historical analysis | Limited | Yes | Yes | Yes |
| Cost allocation | Account level | Account level | User, team, department | User, team, department |
| Token tracking | Aggregate | Per-request | Per-user | Per-user with trends |
| Quota enforcement | Manual | Manual | Possible | Possible |
| Operational overhead | Minimal | Low | Medium | Medium |
| Cost | Minimal | Low | Medium | Medium |
| Use case | POC | Basic auditing | Production | Enterprise with ROI |
Putting it together
This section synthesizes authentication methods, organizational architecture, and monitoring strategies into a recommended deployment pattern, with guidance on implementation priorities as your deployment matures. This architecture balances security, operational simplicity, and comprehensive visibility. Developers authenticate once per day with corporate credentials, administrators see real-time usage in dashboards, and security teams have CloudTrail audit logs and comprehensive user-attributed metrics through OpenTelemetry.
Implementation path
The guidance solution supports rapid deployment through an interactive setup process, with authentication and monitoring working within hours. Deploy the full stack to a pilot group first, gather real usage data, then expand based on validated patterns.
- Deployment – Clone the Guidance for Claude Code with Amazon Bedrock repository and run the interactive poetry run ccwb init wizard. The wizard configures your identity provider, federation type, AWS Regions, and optional monitoring. Deploy the CloudFormation stacks (typically 15-30 minutes), build distribution packages, and test authentication locally before distributing to users.
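The deployment step might look like the following sketch. The repository URL is the expected GitHub location for the guidance, and the subcommands beyond `init` (deploy, package, test) are assumptions drawn from the workflow described above; confirm both against the repository's README:

```shell
# Clone the guidance and run the interactive setup wizard
# (URL and subcommand names should be verified against the repository)
git clone https://github.com/aws-solutions-library-samples/guidance-for-claude-code-with-amazon-bedrock.git
cd guidance-for-claude-code-with-amazon-bedrock
poetry install
poetry run ccwb init      # wizard: IdP, federation type, Regions, optional monitoring
poetry run ccwb deploy    # deploy the CloudFormation stacks (typically 15-30 minutes)
poetry run ccwb package   # build distribution packages for developers
poetry run ccwb test      # verify authentication locally before distributing
```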
- Distribution – Identify a pilot group of 5-20 developers from different teams. This group will validate authentication and monitoring, and provide usage data for full rollout planning. If you enabled monitoring, the CloudWatch dashboard shows activity immediately. You can monitor token consumption, code acceptance rates, and operation types to estimate capacity requirements, identify training needs, and demonstrate value for a broader rollout.
- Expansion – Once Claude Code is validated, expand adoption by team or department. Add the analytics stack (typically 1-2 hours) for historical trend analysis to see adoption rates, high-performing teams, and cost forecasts.
- Optimization – Use monitoring data for continuous improvement through regular review cycles with development leadership. The monitoring data can demonstrate value, identify training needs, and guide capacity adjustments.
When to deviate from the recommended pattern
While the architecture above suits most enterprise deployments, specific circumstances might justify different approaches.
- Consider an LLM gateway if you need multiple LLM providers beyond Amazon Bedrock, need custom middleware for prompt processing or response filtering, or operate in a regulatory environment requiring request-level policy enforcement beyond AWS IAM capabilities.
- Consider inference profiles if you have fewer than 50 teams requiring separate cost tracking and prefer AWS-native billing allocation over telemetry metrics. Inference profiles work well for project-based cost allocation but don't scale to per-developer tracking.
- Consider starting without monitoring for time-limited pilots with fewer than 10 developers where basic CloudWatch metrics suffice. Plan to add monitoring before scaling, as retrofitting requires redistributing packages to developers.
- Consider API keys only for time-boxed testing (under one week) where the security risks are acceptable.
Conclusion
Deploying Claude Code with Amazon Bedrock at enterprise scale requires thoughtful authentication, architecture, and monitoring decisions. Production-ready deployments follow a clear pattern: direct IdP integration provides secure, user-attributed access, and a dedicated AWS account simplifies capacity management. OpenTelemetry monitoring provides visibility into costs and developer productivity. The Guidance for Claude Code with Amazon Bedrock implements these patterns in a deployable solution. Start with authentication and basic monitoring, then progressively add features as you scale.
As AI-powered development tools become the industry standard, organizations that prioritize security, monitoring, and operational excellence in their deployments will gain lasting advantages. This guide provides a comprehensive framework to help you maximize Claude Code's potential across your enterprise.
To get started, visit the Guidance for Claude Code with Amazon Bedrock repository.
About the authors
Court Schuett is a Principal Specialist Solutions Architect – GenAI who spends his days working with AI coding assistants to help others get the most out of them. Outside of work, Court enjoys traveling, listening to music, and woodworking.
Jawhny Cooke is the Global Tech Lead for Anthropic's Claude Code at AWS, where he specializes in helping enterprises operationalize agentic coding at scale. He partners with customers and partners to solve the complex production challenges of AI-assisted development, from designing autonomous coding workflows and orchestrating multi-agent systems to operational optimization on AWS infrastructure. His work bridges cutting-edge AI capabilities with enterprise-grade reliability to help organizations confidently adopt Claude Code in production environments.
Karan Lakhwani is a Sr. Customer Solutions Manager at Amazon Web Services. He specializes in generative AI technologies and is an AWS Golden Jacket recipient. Outside of work, Karan enjoys finding new restaurants and snowboarding.
Gabe Levy is an Associate Delivery Consultant at AWS based out of New York, primarily focused on application development in the cloud. Gabe has a sub-specialization in artificial intelligence and machine learning. When not working with AWS customers, he enjoys exercising, reading, and spending time with family and friends.
Gabriel Velazquez Lopez is a GenAI Product Leader at AWS, where he leads strategy, go-to-market, and product launches for Claude on AWS in partnership with Anthropic.

