    Machine Learning & Research

    Deploy Qwen models with Amazon Bedrock Custom Model Import

    By Oliver Chambers | June 13, 2025 | 12 min read


    We’re excited to announce that Amazon Bedrock Custom Model Import now supports Qwen models. You can now import custom weights for the Qwen2, Qwen2_VL, and Qwen2_5_VL architectures, including models like Qwen 2, Qwen 2.5 Coder, Qwen 2.5 VL, and QwQ 32B. You can bring your own customized Qwen models into Amazon Bedrock and deploy them in a fully managed, serverless environment, without having to manage infrastructure or model serving.

    In this post, we cover how to deploy Qwen 2.5 models with Amazon Bedrock Custom Model Import, making them accessible to organizations looking to use state-of-the-art AI capabilities within the AWS infrastructure at an effective cost.

    Overview of Qwen models

    Qwen 2 and Qwen 2.5 are families of large language models, available in a range of sizes and specialized variants to suit diverse needs:

    • General language models: Models ranging from 0.5B to 72B parameters, with both base and instruct versions for general-purpose tasks
    • Qwen 2.5-Coder: Specialized for code generation and completion
    • Qwen 2.5-Math: Focused on advanced mathematical reasoning
    • Qwen 2.5-VL (vision-language): Image and video processing capabilities, enabling multimodal applications

    Overview of Amazon Bedrock Custom Model Import

    Amazon Bedrock Custom Model Import enables you to import and use your customized models alongside existing foundation models (FMs) through a single serverless, unified API. You can access your imported custom models on demand and without the need to manage the underlying infrastructure. Accelerate your generative AI application development by integrating your supported custom models with native Amazon Bedrock tools and features like Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and Amazon Bedrock Agents. Amazon Bedrock Custom Model Import is generally available in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) AWS Regions.

    In this post, we explore how you can use Qwen 2.5 models for two common use cases: as a coding assistant and for image understanding. Qwen2.5-Coder is a state-of-the-art code model, matching the capabilities of proprietary models like GPT-4o. It supports over 90 programming languages and excels at code generation, debugging, and reasoning. Qwen 2.5-VL brings advanced multimodal capabilities. According to Qwen, Qwen 2.5-VL is not only proficient at recognizing objects such as flowers and animals, but also at analyzing charts, extracting text from images, interpreting document layouts, and processing long videos.

    Prerequisites

    Before importing the Qwen model with Amazon Bedrock Custom Model Import, make sure that you have the following in place:

    1. An active AWS account
    2. An Amazon Simple Storage Service (Amazon S3) bucket to store the Qwen model files
    3. Sufficient permissions to create Amazon Bedrock model import jobs
    4. Verified that your Region supports Amazon Bedrock Custom Model Import

    Use case 1: Qwen coding assistant

    In this example, we demonstrate how to build a coding assistant using the Qwen2.5-Coder-7B-Instruct model.

    1. Go to Hugging Face, then search for and copy the Model ID Qwen/Qwen2.5-Coder-7B-Instruct.

    You’ll use Qwen/Qwen2.5-Coder-7B-Instruct for the rest of the walkthrough. We don’t demonstrate fine-tuning steps, but you can also fine-tune the model before importing it.

    2. Use the following command to download a snapshot of the model locally. The Hugging Face Python library provides a utility called snapshot_download for this:

    from huggingface_hub import snapshot_download

    # Download the model weights and tokenizer files to a local directory
    snapshot_download(repo_id="Qwen/Qwen2.5-Coder-7B-Instruct",
                      local_dir="./extractedmodel/")

    Depending on the model size, this could take a few minutes. When it completes, your Qwen Coder 7B model folder will contain the following files:

    • Configuration files: Including config.json, generation_config.json, tokenizer_config.json, tokenizer.json, and vocab.json
    • Model files: Four safetensors files and model.safetensors.index.json
    • Documentation: LICENSE, README.md, and merges.txt

    3. Upload the model to Amazon S3, using boto3 (see the sketch following the command) or the command line:

    aws s3 cp ./extractedmodel s3://yourbucket/path/ --recursive
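
    The command above uses the AWS CLI. As a rough boto3 equivalent, the following minimal sketch walks the local model directory and uploads each file under the same S3 prefix; the bucket name yourbucket and the path/ prefix are placeholders to replace with your own values.

    import os
    import boto3

    s3 = boto3.client("s3")

    local_dir = "./extractedmodel"
    bucket = "yourbucket"   # placeholder: your S3 bucket
    prefix = "path/"        # placeholder: key prefix for the model files

    # Walk the local snapshot and upload every file, preserving relative paths
    for root, _, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            key = prefix + os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            s3.upload_file(local_path, bucket, key)
            print(f"Uploaded s3://{bucket}/{key}")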

    1. Begin the import mannequin job utilizing the next API name:
    response = self.bedrock_client.create_model_import_job(
                    jobName="uniquejobname",
                    importedModelName="uniquemodelname",
                    roleArn="fullrolearn",
                    modelDataSource={
                        's3DataSource': {
                            's3Uri': "s3://yourbucket/path/"
                        }
                    }
                )
                
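
    Model import runs asynchronously and can take a while for larger checkpoints. As a minimal sketch, assuming the boto3 bedrock client and the response from create_model_import_job above, you can poll the job status until it finishes:

    import time
    import boto3

    bedrock_client = boto3.client("bedrock")
    job_arn = response["jobArn"]

    # Poll the import job until it finishes (status values include InProgress, Completed, Failed)
    while True:
        job = bedrock_client.get_model_import_job(jobIdentifier=job_arn)
        status = job["status"]
        print(f"Import job status: {status}")
        if status in ("Completed", "Failed"):
            break
        time.sleep(60)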

    You can also do this using the AWS Management Console for Amazon Bedrock:

    1. In the Amazon Bedrock console, choose Imported models in the navigation pane.
    2. Choose Import a model.
    3. Enter the details, including a Model name, Import job name, and the model S3 location.
    4. Create a new service role or use an existing service role. Then choose Import model.
    5. After you choose Import on the console, you should see the status as Importing while the model is being imported.

    If you’re using your own role, make sure to add the following trust relationship as described in Create a service role for model import.
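
    A representative version of that trust relationship, following the pattern documented in Create a service role for model import, is sketched below using boto3. The role name, account ID, and Region are placeholders, and the condition keys are an assumption to verify against the documentation: the role must allow bedrock.amazonaws.com to assume it, scoped to your account and to model import jobs in your Region.

    import json
    import boto3

    iam = boto3.client("iam")

    account_id = "111122223333"   # placeholder: your AWS account ID
    region = "us-east-1"          # placeholder: a Region that supports Custom Model Import

    # Trust policy letting Amazon Bedrock assume the role for model import jobs
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"aws:SourceAccount": account_id},
                "ArnEquals": {
                    "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:model-import-job/*"
                }
            }
        }]
    }

    iam.create_role(
        RoleName="BedrockModelImportRole",  # placeholder role name
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )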

    After your model is imported, wait for model inference to be ready, and then chat with the model in the playground or through the API. In the following example, we append Python to the prompt so that the model directly outputs Python code to list objects in an S3 bucket. Remember to use the right chat template to enter prompts in the format required. For example, you can get the right chat template for any compatible model on Hugging Face using the following code:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

    # Instead of using model.chat(), we directly use model.generate()
    # But you need to use tokenizer.apply_chat_template() to format your inputs as shown below
    prompt = "Write sample boto3 python code to list files in a bucket stored in the variable `my_bucket`"
    messages = [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    Note that when using the invoke_model APIs, you must use the full Amazon Resource Name (ARN) for the imported model. You can find the model ARN in the Amazon Bedrock console by navigating to the Imported models section and then viewing the Model details page, as shown in the following figure.

    After the model is ready for inference, you can use the Chat Playground in the Amazon Bedrock console or the APIs to invoke the model.
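
    To tie the pieces together, here is a minimal sketch of invoking the imported coder model with the chat-template-formatted text produced above. It assumes a bedrock-runtime client and the imported model’s ARN, and it reuses the same request fields (prompt, temperature, max_gen_len, top_p) shown in the vision example later in this post; adjust them to whatever your imported model actually accepts.

    import json
    import boto3

    client = boto3.client("bedrock-runtime")
    model_id = "<your imported model ARN>"  # placeholder: copy from the Model details page

    # `text` is the chat-template-formatted prompt produced in the previous snippet
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps({
            "prompt": text,
            "temperature": 0.2,
            "max_gen_len": 512,
            "top_p": 0.9,
        }),
        accept="application/json",
        contentType="application/json",
    )

    result = json.loads(response["body"].read().decode("utf-8"))
    print(result)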

    Use case 2: Qwen 2.5 VL image understanding

    Qwen2.5-VL-* offers multimodal capabilities, combining vision and language understanding in a single model. This section demonstrates how to deploy Qwen2.5-VL using Amazon Bedrock Custom Model Import and test its image understanding capabilities.

    Import Qwen2.5-VL-7B to Amazon Bedrock

    Download the model from Hugging Face and upload it to Amazon S3:

    import os
    from huggingface_hub import snapshot_download

    hf_model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
    local_directory = "qwen2.5-vl-7b-instruct"  # local folder for the snapshot

    # Enable faster downloads (requires the hf_transfer package)
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

    # Download the model locally, then upload the folder to S3 as in use case 1
    snapshot_download(repo_id=hf_model_id, local_dir=f"./{local_directory}")

    Next, import the model to Amazon Bedrock (either via the console or the API):

    response = bedrock.create_model_import_job(
        jobName=job_name,
        importedModelName=imported_model_name,
        roleArn=role_arn,
        modelDataSource={
            's3DataSource': {
                's3Uri': s3_uri
            }
        }
    )

    Test the vision capabilities

    After the import is complete, test the model with an image input. The Qwen2.5-VL-* models require proper formatting of multimodal inputs:

    import base64
    import json

    import boto3
    from transformers import AutoProcessor

    client = boto3.client("bedrock-runtime")
    model_id = "<your imported model ARN>"  # ARN of the imported Qwen2.5-VL model

    def image_to_base64(path):
        # Read an image file and return its base64-encoded contents as a string
        with open(path, "rb") as f:
            return base64.b64encode(f.read()).decode("utf-8")

    def generate_vl(messages, image_base64, temperature=0.3, max_tokens=4096, top_p=0.9):
        processor = AutoProcessor.from_pretrained("Qwen/QVQ-72B-Preview")
        prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

        response = client.invoke_model(
            modelId=model_id,
            body=json.dumps({
                'prompt': prompt,
                'temperature': temperature,
                'max_gen_len': max_tokens,
                'top_p': top_p,
                'images': [image_base64]
            }),
            accept="application/json",
            contentType="application/json"
        )

        return json.loads(response['body'].read().decode('utf-8'))

    # Using the model with an image
    file_path = "cat_image.jpg"
    base64_data = image_to_base64(file_path)

    messages = [
        {
            "role": "user",
            "content": [
                {"image": base64_data},
                {"text": "Describe this image."}
            ]
        }
    ]

    response = generate_vl(messages, base64_data)

    # Print the response
    print("Model Response:")
    if 'choices' in response:
        print(response['choices'][0]['text'])
    elif 'outputs' in response:
        print(response['outputs'][0]['text'])
    else:
        print(response)

    When provided with an example image of a cat (such as the following image), the model accurately describes key features such as the cat’s position, fur color, eye color, and general appearance. This demonstrates the Qwen2.5-VL-* model’s ability to process visual information and generate relevant text descriptions.

    The model’s response:

    This image features a close-up of a cat lying down on a soft, textured surface, likely a couch or a bed. The cat has a tabby coat with a mix of dark and light brown fur, and its eyes are a striking green with vertical pupils, giving it a captivating look. The cat's whiskers are prominent and extend outward from its face, adding to the detailed texture of the image. The background is softly blurred, suggesting a cozy indoor setting with some furniture and possibly a window letting in natural light. The overall atmosphere of the image is warm and serene, highlighting the cat's relaxed and content demeanor.

    Pricing

    You can use Amazon Bedrock Custom Model Import to use your custom model weights within Amazon Bedrock for supported architectures, serving them alongside Amazon Bedrock hosted FMs in a fully managed way through On-Demand mode. Custom Model Import doesn’t charge for the model import itself. You are charged for inference based on two factors: the number of active model copies and their duration of activity. Billing occurs in 5-minute increments, starting from the first successful invocation of each model copy. The pricing per model copy per minute varies based on factors including architecture, context length, Region, and compute unit version, and is tiered by model copy size. The custom model units (CMUs) required for hosting depend on the model’s architecture, parameter count, and context length.

    Amazon Bedrock automatically manages scaling based on your usage patterns. If there are no invocations for 5 minutes, it scales to zero and scales back up when needed, though this can involve cold-start latency of up to a minute. Additional copies are added if inference volume consistently exceeds single-copy concurrency limits. The maximum throughput and concurrency per copy is determined during import, based on factors such as input/output token mix, hardware type, model size, architecture, and inference optimizations.
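
    As a rough illustration of how those factors combine (not actual prices), the sketch below computes an inference cost from the billing model described above: active model copies, times custom model units per copy, times a per-CMU-minute rate, times active minutes rounded up to 5-minute increments. The rate and CMU count are hypothetical placeholders; look up real numbers on the Amazon Bedrock pricing page.

    import math

    def custom_model_inference_cost(active_minutes, model_copies, cmus_per_copy, price_per_cmu_minute):
        # Billing is in 5-minute increments per active model copy
        billed_minutes = math.ceil(active_minutes / 5) * 5
        return model_copies * cmus_per_copy * price_per_cmu_minute * billed_minutes

    # Hypothetical example: 1 copy, 2 CMUs, $0.10 per CMU-minute (placeholder rate), active for 47 minutes
    print(custom_model_inference_cost(active_minutes=47, model_copies=1,
                                      cmus_per_copy=2, price_per_cmu_minute=0.10))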

    For more information, see Amazon Bedrock pricing.

    Clean up

    To avoid ongoing charges after completing the experiments:

    1. Delete your imported Qwen models from Amazon Bedrock Custom Model Import using the console or the API.
    2. Optionally, delete the model files from your S3 bucket if you no longer need them.

    Remember that while Amazon Bedrock Custom Model Import doesn’t charge for the import process itself, you are billed for model inference usage and storage.
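
    For the API route, a minimal cleanup sketch might look like the following; it assumes the imported model name used earlier and the same bucket and prefix placeholders, and it deletes both the imported model and the staged model files.

    import boto3

    bedrock = boto3.client("bedrock")
    s3 = boto3.resource("s3")

    # Delete the imported model (accepts the imported model name or ARN)
    bedrock.delete_imported_model(modelIdentifier="uniquemodelname")

    # Optionally remove the staged model files from S3
    s3.Bucket("yourbucket").objects.filter(Prefix="path/").delete()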

    Conclusion

    Amazon Bedrock Custom Model Import empowers organizations to use powerful publicly available models like Qwen 2.5, among others, while benefiting from enterprise-grade infrastructure. The serverless nature of Amazon Bedrock eliminates the complexity of managing model deployments and operations, allowing teams to focus on building applications rather than infrastructure. With features like auto scaling, pay-per-use pricing, and seamless integration with AWS services, Amazon Bedrock provides a production-ready environment for AI workloads. The combination of Qwen 2.5’s advanced AI capabilities and Amazon Bedrock’s managed infrastructure offers an optimal balance of performance, cost, and operational efficiency. Organizations can start with smaller models and scale up as needed, while maintaining full control over their model deployments and benefiting from AWS security and compliance capabilities.

    For more information, refer to the Amazon Bedrock User Guide.


    About the Authors

    Ajit Mahareddy is an experienced Product and Go-To-Market (GTM) leader with over 20 years of experience in product management, engineering, and go-to-market. Prior to his current role, Ajit led product management building AI/ML products at leading technology companies, including Uber, Turing, and eHealth. He is passionate about advancing generative AI technologies and driving real-world impact with generative AI.

    Shreyas Subramanian is a Principal Data Scientist who helps customers use generative AI and deep learning to solve their business challenges with AWS services. Shreyas has a background in large-scale optimization and ML, and in the use of ML and reinforcement learning for accelerating optimization tasks.

    Yanyan Zhang is a Senior Generative AI Data Scientist at Amazon Web Services, where she has been working on cutting-edge AI/ML technologies as a Generative AI Specialist, helping customers use generative AI to achieve their desired outcomes. Yanyan graduated from Texas A&M University with a PhD in Electrical Engineering. Outside of work, she loves traveling, working out, and exploring new things.

    Dharinee Gupta is an Engineering Manager at AWS Bedrock, where she focuses on enabling customers to seamlessly use open source models through serverless solutions. Her team works on optimizing these models to deliver the best cost-performance balance for customers. Prior to her current role, she gained extensive experience in authentication and authorization systems at Amazon, developing secure access solutions for Amazon offerings. Dharinee is passionate about making advanced AI technologies accessible and efficient for AWS customers.

    Lokeshwaran Ravi is a Senior Deep Learning Compiler Engineer at AWS, specializing in ML optimization, model acceleration, and AI security. He focuses on enhancing efficiency, reducing costs, and building secure ecosystems to democratize AI technologies, making cutting-edge ML accessible and impactful across industries.

    June Won is a Principal Product Manager with Amazon SageMaker JumpStart. He focuses on making foundation models easily discoverable and usable to help customers build generative AI applications. His experience at Amazon also includes mobile shopping applications and last mile delivery.
