
Accelerate custom LLM deployment: Fine-tune with Oumi and deploy to Amazon Bedrock

By Oliver Chambers | March 11, 2026 | 9 Mins Read


This post is co-written by David Stewart and Matthew Persons from Oumi.

Fine-tuning open source large language models (LLMs) often stalls between experimentation and production. Training configurations, artifact management, and scalable deployment each require different tools, creating friction when moving from rapid experimentation to secure, enterprise-grade environments.

In this post, we show how to fine-tune a Llama model using Oumi on Amazon EC2 (with the option to create synthetic data using Oumi), store artifacts in Amazon S3, and deploy to Amazon Bedrock using Custom Model Import for managed inference. While we use EC2 in this walkthrough, fine-tuning can be done on other compute services such as Amazon SageMaker or Amazon Elastic Kubernetes Service, depending on your needs.

Benefits of Oumi and Amazon Bedrock

Oumi is an open source system that streamlines the foundation model lifecycle, from data preparation and training to evaluation. Instead of assembling separate tools for each stage, you define a single configuration and reuse it across runs.

Key benefits for this workflow:

• Recipe-driven training: Define your configuration once and reuse it across experiments, reducing boilerplate and improving reproducibility
• Flexible fine-tuning: Choose full fine-tuning or parameter-efficient methods like LoRA, based on your constraints
• Built-in evaluation: Score checkpoints using benchmarks or LLM-as-a-judge without additional tooling
• Data synthesis: Generate task-specific datasets when production data is limited

Amazon Bedrock complements this by providing managed, serverless inference. After fine-tuning with Oumi, you import your model via Custom Model Import in three steps: upload to S3, create the import job, and invoke. There is no inference infrastructure to manage. The following architecture diagram shows how these components work together.

Figure 1: Oumi manages data, training, and evaluation on EC2. Amazon Bedrock provides managed inference via Custom Model Import.

Solution overview

This workflow consists of three phases:

1. Fine-tune with Oumi on EC2: Launch a GPU-optimized instance (for example, g5.12xlarge or p4d.24xlarge), install Oumi, and run training with your configuration. For larger models, Oumi supports distributed training with Fully Sharded Data Parallel (FSDP), DeepSpeed, and Distributed Data Parallel (DDP) strategies across multi-GPU or multi-node setups.
2. Store artifacts on S3: Upload model weights, checkpoints, and logs for durable storage.
3. Deploy to Amazon Bedrock: Create a Custom Model Import job pointing to your S3 artifacts. Amazon Bedrock provisions inference infrastructure automatically. Client applications call the imported model using the Amazon Bedrock Runtime APIs.
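Phases 2 and 3 can be sketched in Python with boto3. The following is a minimal illustration, not the repository's scripts: the job name, model name, role ARN, and S3 URI are hypothetical placeholders, and the actual boto3 call is shown only in a comment because it requires AWS credentials and a real bucket.

```python
import json


def build_import_job_request(job_name, model_name, role_arn, s3_uri):
    """Build the request for bedrock.create_model_import_job.

    All names and the role ARN here are placeholders; substitute your own.
    """
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,
        # Custom Model Import reads the artifacts from this S3 location
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


req = build_import_job_request(
    "llama-import-job",
    "my-fine-tuned-llama",
    "arn:aws:iam::123456789012:role/BedrockImportRole",
    "s3://my-bucket/models/final/",
)
# With credentials configured, the job would be created with:
#   boto3.client("bedrock").create_model_import_job(**req)
print(json.dumps(req, indent=2))
```

The import role must grant Amazon Bedrock read access to the S3 location; the repository's IAM policies cover this.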

This architecture addresses common challenges in moving fine-tuned models to production.

Technical implementation

Let’s walk through a hands-on workflow using the meta-llama/Llama-3.2-1B-Instruct model as an example. We selected this model because it pairs well with fine-tuning on an AWS g6.12xlarge EC2 instance, but the same workflow can be replicated across many other open source models (note that larger models may require larger instances or distributed training across instances). For more information, see the Oumi model fine-tuning recipes and Amazon Bedrock custom model architectures.

Prerequisites

To complete this walkthrough, you need:

Set Up AWS Resources

1. Clone this repository on your local machine:
git clone https://github.com/aws-samples/sample-oumi-fine-tuning-bedrock-cmi.git
cd sample-oumi-fine-tuning-bedrock-cmi
2. Run the setup script to create IAM roles and an S3 bucket, and to launch a GPU-optimized EC2 instance:
./scripts/setup-aws-env.sh [--dry-run]

The script prompts for your AWS Region, S3 bucket name, EC2 key pair name, and security group ID, then creates all required resources. Defaults: g6.12xlarge instance, Deep Learning Base AMI with Single CUDA (Amazon Linux 2023), and 100 GB gp3 storage. Note: If you do not have permissions to create IAM roles or launch EC2 instances, share this repository with your IT administrator and ask them to complete this section to set up your AWS environment.

3. Once the instance is running, the script outputs the SSH command and the Amazon Bedrock import role ARN (needed in Step 5). SSH into the instance and proceed with Step 1 below.

See iam/README.md for IAM policy details, scoping guidance, and validation steps.

Step 1: Set up the EC2 environment

Complete the following steps to set up the EC2 environment.

1. On the EC2 instance (Amazon Linux 2023), update the system and install base dependencies:
sudo yum update -y
sudo yum install python3 python3-pip git -y
2. Clone the companion repository:
git clone https://github.com/aws-samples/sample-oumi-fine-tuning-bedrock-cmi.git
cd sample-oumi-fine-tuning-bedrock-cmi
3. Configure environment variables (replace the values with your actual Region and bucket name from the setup script):
export AWS_REGION=us-west-2
export S3_BUCKET=your-bucket-name
export S3_PREFIX=your-s3-prefix
aws configure set default.region "$AWS_REGION"
4. Run the setup script to create a Python virtual environment, install Oumi, validate GPU availability, and configure Hugging Face authentication. See setup-environment.sh for options.
./scripts/setup-environment.sh
source .venv/bin/activate
5. Authenticate with Hugging Face to access gated model weights. Generate an access token at huggingface.co/settings/tokens, then run:
hf auth login

Step 2: Configure training

The default dataset is tatsu-lab/alpaca, configured in configs/oumi-config.yaml. Oumi downloads it automatically during training; no manual download is required. To use a different dataset, update the dataset_name parameter in configs/oumi-config.yaml. See the Oumi dataset docs for supported formats.
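To show where the dataset_name parameter lives, here is a hypothetical sketch of the dataset section of an Oumi training config. The field names follow the Oumi documentation; the actual contents of configs/oumi-config.yaml in the repository may differ.

```yaml
data:
  train:
    datasets:
      - dataset_name: "tatsu-lab/alpaca"  # swap in your own dataset here
```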

[Optional] Generate synthetic training data with Oumi:

To generate synthetic data using Amazon Bedrock as the inference backend, update the model_name placeholder in configs/synthesis-config.yaml with an Amazon Bedrock model ID you have access to (e.g., anthropic.claude-sonnet-4-6). See the Oumi data synthesis docs for details. Then run:

oumi synth -c configs/synthesis-config.yaml

Step 3: Fine-tune the model

Fine-tune the model using Oumi’s built-in training recipe for Llama-3.2-1B-Instruct:

./scripts/fine-tune.sh --config configs/oumi-config.yaml --output-dir models/final [--dry-run]

To customize hyperparameters, edit oumi-config.yaml.

Note: If you generated synthetic data in Step 2, update the dataset path in the config before training.

Monitor GPU utilization with nvidia-smi or the Amazon CloudWatch agent. For long-running jobs, configure Amazon EC2 Automatic Instance Recovery to handle instance interruptions.
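For scripted monitoring, nvidia-smi's CSV query mode is convenient to parse. A small sketch that parses one line of `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader` output; the sample line is fabricated for illustration, since the real values come from the running instance.

```python
def parse_gpu_line(line):
    """Parse one CSV line like '87 %, 21340 MiB' into (util_pct, mem_mib)."""
    util_part, mem_part = [p.strip() for p in line.split(",")]
    util = int(util_part.split()[0])  # "87 %" -> 87
    mem = int(mem_part.split()[0])    # "21340 MiB" -> 21340
    return util, mem


# Hypothetical sample; on the instance this line would come from:
#   nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader
sample = "87 %, 21340 MiB"
print(parse_gpu_line(sample))  # -> (87, 21340)
```

A loop over this parser can feed a log file or a CloudWatch custom metric during long training runs.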

Step 4: Evaluate the model (Optional)

You can evaluate the fine-tuned model using standard benchmarks:

oumi evaluate -c configs/evaluation-config.yaml

The evaluation config specifies the model path and benchmark tasks (e.g., MMLU). To customize, edit evaluation-config.yaml. For LLM-as-a-judge approaches and additional benchmarks, see Oumi’s evaluation guide.
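As a rough sketch of what such an evaluation config can look like: the field names below follow the Oumi documentation, and the model path is a placeholder; check evaluation-config.yaml in the repository for the actual contents.

```yaml
model:
  model_name: "models/final"  # path to the fine-tuned checkpoint
tasks:
  - evaluation_backend: lm_harness
    task_name: mmlu
```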

Step 5: Deploy to Amazon Bedrock

Complete the following steps to deploy the model to Amazon Bedrock:

1. Upload the model artifacts to S3 and import the model to Amazon Bedrock:
./scripts/upload-to-s3.sh --bucket $S3_BUCKET --source models/final --prefix $S3_PREFIX
./scripts/import-to-bedrock.sh --model-name my-fine-tuned-llama --s3-uri s3://$S3_BUCKET/$S3_PREFIX --role-arn $BEDROCK_ROLE_ARN --wait
2. The import script outputs the model ARN on completion. Set MODEL_ARN to this value (format: arn:aws:bedrock:::imported-model/).
3. Invoke the model on Amazon Bedrock:
./scripts/invoke-model.sh --model-id $MODEL_ARN --prompt "Translate this text to French: What is the capital of France?"
4. Amazon Bedrock creates a managed inference environment automatically. For IAM role setup, see bedrock-import-role.json.
5. Enable S3 versioning on the bucket to support rollback of model revisions. For SSE-KMS encryption and bucket policy hardening, see the security scripts in the companion repository.
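Behind the invoke script is the Amazon Bedrock Runtime InvokeModel API; imported Llama models accept a JSON body with a prompt and generation parameters. A minimal sketch that builds such a body (the parameter names follow the Llama request format documented for Bedrock; the prompt and default values are just examples, and the boto3 call is shown only as a comment since it needs credentials and a real model ARN):

```python
import json


def build_llama_body(prompt, max_gen_len=256, temperature=0.5):
    """Build the JSON request body for InvokeModel on an imported Llama model."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,   # cap on generated tokens
        "temperature": temperature,   # sampling temperature
    })


body = build_llama_body("Translate this text to French: What is the capital of France?")
# With credentials configured, the call would look like:
#   boto3.client("bedrock-runtime").invoke_model(modelId=MODEL_ARN, body=body)
print(body)
```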

Step 6: Clean up

To avoid ongoing costs, remove the resources created during this walkthrough:

aws ec2 terminate-instances --instance-ids $INSTANCE_ID
aws s3 rm s3://$S3_BUCKET/$S3_PREFIX/ --recursive
aws bedrock delete-imported-model --model-identifier $MODEL_ARN

Conclusion

In this post, you learned how to fine-tune a Llama-3.2-1B-Instruct base model using Oumi on EC2 and deploy it using Amazon Bedrock Custom Model Import. This approach gives you full control over fine-tuning with your own data while using managed inference in Amazon Bedrock.

The companion sample-oumi-fine-tuning-bedrock-cmi repository provides scripts, configurations, and IAM policies to get started. Clone it, swap in your dataset, and deploy a custom model to Amazon Bedrock.

To get started, explore the resources below and begin building your own fine-tuning-to-deployment pipeline on Oumi and AWS. Happy building!

Learn More

Acknowledgement

Special thanks to Pronoy Chopra and Jon Turdiev for their contributions.


About the authors

Bashir Mohammed

Bashir is a Senior Lead GenAI Solutions Architect on the Frontier AI team at AWS, where he partners with startups and enterprises to architect and deploy production-scale GenAI applications. With a PhD in Computer Science, his expertise spans agentic systems, LLM evaluation and benchmarking, fine-tuning, post-training optimization, reinforcement learning from human feedback, and scalable ML infrastructure. Outside of work, he mentors early-career engineers and supports community technical programs.

Bala Krishnamoorthy

Bala is a Senior GenAI Data Scientist on the Amazon Bedrock GTM team, where he helps startups leverage Bedrock to power their products. In his free time, he enjoys spending time with family and friends, staying active, trying new restaurants, traveling, and kickstarting his day with a steaming hot cup of coffee.

Greg Fina

Greg is a Principal Startup Solutions Architect for Generative AI at Amazon Web Services, where he empowers startups to accelerate innovation through cloud adoption. He specializes in application modernization, with a strong focus on serverless architectures, containers, and scalable data storage solutions. He is passionate about using generative AI tools to orchestrate and optimize large-scale Kubernetes deployments, as well as advancing GitOps and DevOps practices for high-velocity teams. Outside of his customer-facing role, Greg actively contributes to open source projects, especially those related to Backstage.

David Stewart

David leads Field Engineering at Oumi, where he works with customers to improve their generative AI applications by creating custom language models for their use case. He brings extensive experience working with LLMs, including modern agentic, RAG, and training architectures. David is deeply interested in the practical side of generative AI and how people and organizations can create impactful products and solutions that work at scale.

Matthew Persons

Matthew is a cofounder and engineering leader at Oumi, where he focuses on building and scaling practical, open generative AI systems for real-world use cases. He works closely with engineers, researchers, and customers to design robust architectures across the entire AI development pipeline. Matthew is passionate about open-source AI, applied machine learning, and enabling teams to move quickly from research proofs of concept to impactful products.
