
Accelerate custom LLM deployment: Fine-tune with Oumi and deploy to Amazon Bedrock

By Oliver Chambers | March 11, 2026 | 9 Mins Read


This post is co-written by David Stewart and Matthew Persons from Oumi.

Fine-tuning open source large language models (LLMs) often stalls between experimentation and production. Training configurations, artifact management, and scalable deployment each require different tools, creating friction when moving from rapid experimentation to secure, enterprise-grade environments.

In this post, we show how to fine-tune a Llama model using Oumi on Amazon EC2 (with the option to create synthetic data using Oumi), store artifacts in Amazon S3, and deploy to Amazon Bedrock using Custom Model Import for managed inference. While we use EC2 in this walkthrough, fine-tuning can be done on other compute services such as Amazon SageMaker or Amazon Elastic Kubernetes Service, depending on your needs.

Benefits of Oumi and Amazon Bedrock

Oumi is an open source system that streamlines the foundation model lifecycle, from data preparation and training to evaluation. Instead of assembling separate tools for each stage, you define a single configuration and reuse it across runs.

Key benefits for this workflow:

• Recipe-driven training: Define your configuration once and reuse it across experiments, reducing boilerplate and improving reproducibility
• Flexible fine-tuning: Choose full fine-tuning or parameter-efficient methods like LoRA, based on your constraints
• Built-in evaluation: Score checkpoints using benchmarks or LLM-as-a-judge without additional tooling
• Data synthesis: Generate task-specific datasets when production data is limited

Amazon Bedrock complements this by providing managed, serverless inference. After fine-tuning with Oumi, you import your model via Custom Model Import in three steps: upload to S3, create the import job, and invoke. There is no inference infrastructure to manage. The following architecture diagram shows how these components work together.

Figure 1: Oumi manages data, training, and evaluation on EC2. Amazon Bedrock provides managed inference via Custom Model Import.

Solution overview

This workflow consists of three phases:

1. Fine-tune with Oumi on EC2: Launch a GPU-optimized instance (for example, g5.12xlarge or p4d.24xlarge), install Oumi, and run training with your configuration. For larger models, Oumi supports distributed training with Fully Sharded Data Parallel (FSDP), DeepSpeed, and Distributed Data Parallel (DDP) strategies across multi-GPU or multi-node setups.
2. Store artifacts on S3: Upload model weights, checkpoints, and logs for durable storage.
3. Deploy to Amazon Bedrock: Create a Custom Model Import job pointing to your S3 artifacts. Amazon Bedrock provisions inference infrastructure automatically. Client applications call the imported model using the Amazon Bedrock Runtime APIs.
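Phases 2 and 3 can be sketched in Python with boto3. The following is a minimal illustration, not the repository's scripts: the job name, model name, role ARN, and S3 URI are hypothetical placeholders, and the actual boto3 call is shown only in a comment because it requires AWS credentials and a real bucket.

```python
import json


def build_import_job_request(job_name, model_name, role_arn, s3_uri):
    """Build the request for bedrock.create_model_import_job.

    All names and the role ARN here are placeholders; substitute your own.
    """
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,
        # Custom Model Import reads the artifacts from this S3 location
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


req = build_import_job_request(
    "llama-import-job",
    "my-fine-tuned-llama",
    "arn:aws:iam::123456789012:role/BedrockImportRole",
    "s3://my-bucket/models/final/",
)
# With credentials configured, the job would be created with:
#   boto3.client("bedrock").create_model_import_job(**req)
print(json.dumps(req, indent=2))
```

The import role must grant Amazon Bedrock read access to the S3 location; the repository's IAM policies cover this.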

This architecture addresses common challenges in moving fine-tuned models to production.

Technical implementation

Let’s walk through a hands-on workflow using the meta-llama/Llama-3.2-1B-Instruct model as an example. We selected this model because it pairs well with fine-tuning on an AWS g6.12xlarge EC2 instance, but the same workflow can be replicated across many other open source models (note that larger models may require larger instances or distributed training across instances). For more information, see the Oumi model fine-tuning recipes and Amazon Bedrock custom model architectures.

Prerequisites

To complete this walkthrough, you need:

Set Up AWS Resources

1. Clone this repository on your local machine:
git clone https://github.com/aws-samples/sample-oumi-fine-tuning-bedrock-cmi.git
cd sample-oumi-fine-tuning-bedrock-cmi
2. Run the setup script to create IAM roles and an S3 bucket, and to launch a GPU-optimized EC2 instance:
./scripts/setup-aws-env.sh [--dry-run]

The script prompts for your AWS Region, S3 bucket name, EC2 key pair name, and security group ID, then creates all required resources. Defaults: g6.12xlarge instance, Deep Learning Base AMI with Single CUDA (Amazon Linux 2023), and 100 GB gp3 storage. Note: If you do not have permissions to create IAM roles or launch EC2 instances, share this repository with your IT administrator and ask them to complete this section to set up your AWS environment.

3. Once the instance is running, the script outputs the SSH command and the Amazon Bedrock import role ARN (needed in Step 5). SSH into the instance and proceed with Step 1 below.

See iam/README.md for IAM policy details, scoping guidance, and validation steps.

Step 1: Set up the EC2 environment

Complete the following steps to set up the EC2 environment.

1. On the EC2 instance (Amazon Linux 2023), update the system and install base dependencies:
sudo yum update -y
sudo yum install python3 python3-pip git -y
2. Clone the companion repository:
git clone https://github.com/aws-samples/sample-oumi-fine-tuning-bedrock-cmi.git
cd sample-oumi-fine-tuning-bedrock-cmi
3. Configure environment variables (replace the values with your actual Region and bucket name from the setup script):
export AWS_REGION=us-west-2
export S3_BUCKET=your-bucket-name
export S3_PREFIX=your-s3-prefix
aws configure set default.region "$AWS_REGION"
4. Run the setup script to create a Python virtual environment, install Oumi, validate GPU availability, and configure Hugging Face authentication. See setup-environment.sh for options.
./scripts/setup-environment.sh
source .venv/bin/activate
5. Authenticate with Hugging Face to access gated model weights. Generate an access token at huggingface.co/settings/tokens, then run:
hf auth login

Step 2: Configure training

The default dataset is tatsu-lab/alpaca, configured in configs/oumi-config.yaml. Oumi downloads it automatically during training; no manual download is required. To use a different dataset, update the dataset_name parameter in configs/oumi-config.yaml. See the Oumi dataset docs for supported formats.
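To show where the dataset_name parameter lives, here is a hypothetical sketch of the dataset section of an Oumi training config. The field names follow the Oumi documentation; the actual contents of configs/oumi-config.yaml in the repository may differ.

```yaml
data:
  train:
    datasets:
      - dataset_name: "tatsu-lab/alpaca"  # swap in your own dataset here
```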

[Optional] Generate synthetic training data with Oumi:

To generate synthetic data using Amazon Bedrock as the inference backend, update the model_name placeholder in configs/synthesis-config.yaml with an Amazon Bedrock model ID you have access to (e.g., anthropic.claude-sonnet-4-6). See the Oumi data synthesis docs for details. Then run:

oumi synth -c configs/synthesis-config.yaml

Step 3: Fine-tune the model

Fine-tune the model using Oumi’s built-in training recipe for Llama-3.2-1B-Instruct:

./scripts/fine-tune.sh --config configs/oumi-config.yaml --output-dir models/final [--dry-run]

To customize hyperparameters, edit oumi-config.yaml.

Note: If you generated synthetic data in Step 2, update the dataset path in the config before training.

Monitor GPU utilization with nvidia-smi or the Amazon CloudWatch agent. For long-running jobs, configure Amazon EC2 Automatic Instance Recovery to handle instance interruptions.
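For scripted monitoring, nvidia-smi's CSV query mode is convenient to parse. A small sketch that parses one line of `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader` output; the sample line is fabricated for illustration, since the real values come from the running instance.

```python
def parse_gpu_line(line):
    """Parse one CSV line like '87 %, 21340 MiB' into (util_pct, mem_mib)."""
    util_part, mem_part = [p.strip() for p in line.split(",")]
    util = int(util_part.split()[0])  # "87 %" -> 87
    mem = int(mem_part.split()[0])    # "21340 MiB" -> 21340
    return util, mem


# Hypothetical sample; on the instance this line would come from:
#   nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader
sample = "87 %, 21340 MiB"
print(parse_gpu_line(sample))  # -> (87, 21340)
```

A loop over this parser can feed a log file or a CloudWatch custom metric during long training runs.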

Step 4: Evaluate the model (Optional)

You can evaluate the fine-tuned model using standard benchmarks:

oumi evaluate -c configs/evaluation-config.yaml

The evaluation config specifies the model path and benchmark tasks (e.g., MMLU). To customize, edit evaluation-config.yaml. For LLM-as-a-judge approaches and additional benchmarks, see Oumi’s evaluation guide.
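As a rough sketch of what such an evaluation config can look like: the field names below follow the Oumi documentation, and the model path is a placeholder; check evaluation-config.yaml in the repository for the actual contents.

```yaml
model:
  model_name: "models/final"  # path to the fine-tuned checkpoint
tasks:
  - evaluation_backend: lm_harness
    task_name: mmlu
```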

Step 5: Deploy to Amazon Bedrock

Complete the following steps to deploy the model to Amazon Bedrock:

1. Upload the model artifacts to S3 and import the model to Amazon Bedrock:
./scripts/upload-to-s3.sh --bucket $S3_BUCKET --source models/final --prefix $S3_PREFIX
./scripts/import-to-bedrock.sh --model-name my-fine-tuned-llama --s3-uri s3://$S3_BUCKET/$S3_PREFIX --role-arn $BEDROCK_ROLE_ARN --wait
2. The import script outputs the model ARN on completion. Set MODEL_ARN to this value (format: arn:aws:bedrock:::imported-model/).
3. Invoke the model on Amazon Bedrock:
./scripts/invoke-model.sh --model-id $MODEL_ARN --prompt "Translate this text to French: What is the capital of France?"
4. Amazon Bedrock creates a managed inference environment automatically. For IAM role setup, see bedrock-import-role.json.
5. Enable S3 versioning on the bucket to support rollback of model revisions. For SSE-KMS encryption and bucket policy hardening, see the security scripts in the companion repository.
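Behind the invoke script is the Amazon Bedrock Runtime InvokeModel API; imported Llama models accept a JSON body with a prompt and generation parameters. A minimal sketch that builds such a body (the parameter names follow the Llama request format documented for Bedrock; the prompt and default values are just examples, and the boto3 call is shown only as a comment since it needs credentials and a real model ARN):

```python
import json


def build_llama_body(prompt, max_gen_len=256, temperature=0.5):
    """Build the JSON request body for InvokeModel on an imported Llama model."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,   # cap on generated tokens
        "temperature": temperature,   # sampling temperature
    })


body = build_llama_body("Translate this text to French: What is the capital of France?")
# With credentials configured, the call would look like:
#   boto3.client("bedrock-runtime").invoke_model(modelId=MODEL_ARN, body=body)
print(body)
```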

Step 6: Clean up

To avoid ongoing costs, remove the resources created during this walkthrough:

aws ec2 terminate-instances --instance-ids $INSTANCE_ID
aws s3 rm s3://$S3_BUCKET/$S3_PREFIX/ --recursive
aws bedrock delete-imported-model --model-identifier $MODEL_ARN

Conclusion

In this post, you learned how to fine-tune a Llama-3.2-1B-Instruct base model using Oumi on EC2 and deploy it using Amazon Bedrock Custom Model Import. This approach gives you full control over fine-tuning with your own data while using managed inference in Amazon Bedrock.

The companion sample-oumi-fine-tuning-bedrock-cmi repository provides scripts, configurations, and IAM policies to get started. Clone it, swap in your dataset, and deploy a custom model to Amazon Bedrock.

To get started, explore the resources below and begin building your own fine-tuning-to-deployment pipeline on Oumi and AWS. Happy building!

Learn More

Acknowledgement

Special thanks to Pronoy Chopra and Jon Turdiev for their contributions.


About the authors

Bashir Mohammed

Bashir is a Senior Lead GenAI Solutions Architect on the Frontier AI team at AWS, where he partners with startups and enterprises to architect and deploy production-scale GenAI applications. With a PhD in Computer Science, his expertise spans agentic systems, LLM evaluation and benchmarking, fine-tuning, post-training optimization, reinforcement learning from human feedback, and scalable ML infrastructure. Outside of work, he mentors early-career engineers and supports community technical programs.

Bala Krishnamoorthy

Bala is a Senior GenAI Data Scientist on the Amazon Bedrock GTM team, where he helps startups leverage Bedrock to power their products. In his free time, he enjoys spending time with family and friends, staying active, trying new restaurants, traveling, and kickstarting his day with a steaming hot cup of coffee.

Greg Fina

Greg is a Principal Startup Solutions Architect for Generative AI at Amazon Web Services, where he empowers startups to accelerate innovation through cloud adoption. He specializes in application modernization, with a strong focus on serverless architectures, containers, and scalable data storage solutions. He is passionate about using generative AI tools to orchestrate and optimize large-scale Kubernetes deployments, as well as advancing GitOps and DevOps practices for high-velocity teams. Outside of his customer-facing role, Greg actively contributes to open source projects, especially those related to Backstage.

David Stewart

David leads Field Engineering at Oumi, where he works with customers to improve their generative AI applications by creating custom language models for their use case. He brings extensive experience working with LLMs, including modern agentic, RAG, and training architectures. David is deeply interested in the practical side of generative AI and how people and organizations can create impactful products and solutions that work at scale.

Matthew Persons

Matthew is a cofounder and engineering leader at Oumi, where he focuses on building and scaling practical, open generative AI systems for real-world use cases. He works closely with engineers, researchers, and customers to design robust architectures across the entire AI development pipeline. Matthew is passionate about open-source AI, applied machine learning, and enabling teams to move quickly from research proofs of concept to impactful products.
