Close Menu
    Main Menu
    • Home
    • News
    • Tech
    • Robotics
    • ML & Research
    • AI
    • Digital Transformation
    • AI Ethics & Regulation
    • Thought Leadership in AI

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    SAVE Pupil Mortgage Replace: Do not Count on to Make Funds This Yr, however Do This One Factor ASAP

    June 7, 2025

    DragonForce Ransomware Reportedly Compromised Over 120 Victims within the Previous Yr

    June 7, 2025

    Greatest robotic vacuums and mops 2025: Examined on my tile and hardwood at house

    June 7, 2025
    Facebook X (Twitter) Instagram
    UK Tech Insider
    Facebook X (Twitter) Instagram Pinterest Vimeo
    UK Tech Insider
    Home»Machine Learning & Research»Construct a serverless audio summarization answer with Amazon Bedrock and Whisper
    Machine Learning & Research

    Construct a serverless audio summarization answer with Amazon Bedrock and Whisper

    Oliver ChambersBy Oliver ChambersJune 6, 2025No Comments10 Mins Read
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr Email Reddit
    Construct a serverless audio summarization answer with Amazon Bedrock and Whisper
    Share
    Facebook Twitter LinkedIn Pinterest Email Copy Link


    Recordings of enterprise conferences, interviews, and buyer interactions have change into important for preserving necessary info. Nevertheless, transcribing and summarizing these recordings manually is commonly time-consuming and labor-intensive. With the progress in generative AI and automated speech recognition (ASR), automated options have emerged to make this course of quicker and extra environment friendly.

    Defending personally identifiable info (PII) is a crucial side of knowledge safety, pushed by each moral tasks and authorized necessities. On this put up, we reveal the right way to use the Open AI Whisper basis mannequin (FM) Whisper Massive V3 Turbo, out there in Amazon Bedrock Market, which provides entry to over 140 fashions by means of a devoted providing, to provide close to real-time transcription. These transcriptions are then processed by Amazon Bedrock for summarization and redaction of delicate info.

    Amazon Bedrock is a completely managed service that gives a alternative of high-performing FMs from main AI firms like AI21 Labs, Anthropic, Cohere, DeepSeek, Luma, Meta, Mistral AI, poolside (coming quickly), Stability AI, and Amazon Nova by means of a single API, together with a broad set of capabilities to construct generative AI functions with safety, privateness, and accountable AI. Moreover, you should utilize Amazon Bedrock Guardrails to robotically redact delicate info, together with PII, from the transcription summaries to assist compliance and information safety wants.

    On this put up, we stroll by means of an end-to-end structure that mixes a React-based frontend with Amazon Bedrock, AWS Lambda, and AWS Step Features to orchestrate the workflow, facilitating seamless integration and processing.

    Answer overview

    The answer highlights the facility of integrating serverless applied sciences with generative AI to automate and scale content material processing workflows. The person journey begins with importing a recording by means of a React frontend utility, hosted on Amazon CloudFront and backed by Amazon Easy Storage Service (Amazon S3) and Amazon API Gateway. When the file is uploaded, it triggers a Step Features state machine that orchestrates the core processing steps, utilizing AI fashions and Lambda features for seamless information move and transformation. The next diagram illustrates the answer structure.

    The workflow consists of the next steps:

    1. The React utility is hosted in an S3 bucket and served to customers by means of CloudFront for quick, world entry. API Gateway handles interactions between the frontend and backend companies.
    2. Customers add audio or video information immediately from the app. These recordings are saved in a chosen S3 bucket for processing.
    3. An Amazon EventBridge rule detects the S3 add occasion and triggers the Step Features state machine, initiating the AI-powered processing pipeline.
    4. The state machine performs audio transcription, summarization, and redaction by orchestrating a number of Amazon Bedrock fashions in sequence. It makes use of Whisper for transcription, Claude for summarization, and Guardrails to redact delicate information.
    5. The redacted abstract is returned to the frontend utility and exhibited to the person.

    The next diagram illustrates the state machine workflow.

    Construct a serverless audio summarization answer with Amazon Bedrock and Whisper

    The Step Features state machine orchestrates a sequence of duties to transcribe, summarize, and redact delicate info from uploaded audio/video recordings:

    1. A Lambda operate is triggered to assemble enter particulars (for instance, Amazon S3 object path, metadata) and put together the payload for transcription.
    2. The payload is shipped to the OpenAI Whisper Massive V3 Turbo mannequin by means of the Amazon Bedrock Market to generate a close to real-time transcription of the recording.
    3. The uncooked transcript is handed to Anthropic’s Claude Sonnet 3.5 by means of Amazon Bedrock, which produces a concise and coherent abstract of the dialog or content material.
    4. A second Lambda operate validates and forwards the abstract to the redaction step.
    5. The abstract is processed by means of Amazon Bedrock Guardrails, which robotically redacts PII and different delicate information.
    6. The redacted abstract is saved or returned to the frontend utility by means of an API, the place it’s exhibited to the person.

    Conditions

    Earlier than you begin, just remember to have the next conditions in place:

    Create a guardrail within the Amazon Bedrock console

    For directions for creating guardrails in Amazon Bedrock, check with Create a guardrail. For particulars on detecting and redacting PII, see Take away PII from conversations by utilizing delicate info filters. Configure your guardrail with the next key settings:

    • Allow PII detection and dealing with
    • Set PII motion to Redact
    • Add the related PII varieties, similar to:
      • Names and identities
      • Telephone numbers
      • E-mail addresses
      • Bodily addresses
      • Monetary info
      • Different delicate private info

    After you deploy the guardrail, word the Amazon Useful resource Identify (ARN), and you may be utilizing this when deploys the mannequin.

    Deploy the Whisper mannequin

    Full the next steps to deploy the Whisper Massive V3 Turbo mannequin:

    1. On the Amazon Bedrock console, select Mannequin catalog below Basis fashions within the navigation pane.
    2. Seek for and select Whisper Massive V3 Turbo.
    3. On the choices menu (three dots), select Deploy.

    Amazon Bedrock console displaying filtered model catalog with Whisper Large V3 Turbo speech recognition model and deployment option

    1. Modify the endpoint title, variety of cases, and occasion sort to fit your particular use case. For this put up, we use the default settings.
    2. Modify the Superior settings part to fit your use case. For this put up, we use the default settings.
    3. Select Deploy.

    This creates a brand new AWS Identification and Entry Administration IAM function and deploys the mannequin.

    You’ll be able to select Market deployments within the navigation pane, and within the Managed deployments part, you’ll be able to see the endpoint standing as Creating. Watch for the endpoint to complete deployment and the standing to vary to In Service, then copy the Endpoint Identify, and you may be utilizing this when deploying the

    Amazon Bedrock console: "How it works" overview, managed deployments table with Whisper model endpoint in service

    Deploy the answer infrastructure

    Within the GitHub repo, comply with the directions within the README file to clone the repository, then deploy the frontend and backend infrastructure.

    We use the AWS Cloud Improvement Equipment (AWS CDK) to outline and deploy the infrastructure. The AWS CDK code deploys the next assets:

    • React frontend utility
    • Backend infrastructure
    • S3 buckets for storing uploads and processed outcomes
    • Step Features state machine with Lambda features for audio processing and PII redaction
    • API Gateway endpoints for dealing with requests
    • IAM roles and insurance policies for safe entry
    • CloudFront distribution for internet hosting the frontend

    Implementation deep dive

    The backend consists of a sequence of Lambda features, every dealing with a selected stage of the audio processing pipeline:

    • Add handler – Receives audio information and shops them in Amazon S3
    • Transcription with Whisper – Converts speech to textual content utilizing the Whisper mannequin
    • Speaker detection – Differentiates and labels particular person audio system throughout the audio
    • Summarization utilizing Amazon Bedrock – Extracts and summarizes key factors from the transcript
    • PII redaction – Makes use of Amazon Bedrock Guardrails to take away delicate info for privateness compliance

    Let’s study among the key elements:

    The transcription Lambda operate makes use of the Whisper mannequin to transform audio information to textual content:

    def transcribe_with_whisper(audio_chunk, endpoint_name):
        # Convert audio to hex string format
        hex_audio = audio_chunk.hex()
        
        # Create payload for Whisper mannequin
        payload = {
            "audio_input": hex_audio,
            "language": "english",
            "activity": "transcribe",
            "top_p": 0.9
        }
        
        # Invoke the SageMaker endpoint operating Whisper
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="utility/json",
            Physique=json.dumps(payload)
        )
        
        # Parse the transcription response
        response_body = json.masses(response['Body'].learn().decode('utf-8'))
        transcription_text = response_body['text']
        
        return transcription_text
    

    We use Amazon Bedrock to generate concise summaries from the transcriptions:

    def generate_summary(transcription):
        # Format the immediate with the transcription
        immediate = f"{transcription}nnGive me the abstract, audio system, key discussions, and motion objects with homeowners"
        
        # Name Bedrock for summarization
        response = bedrock_runtime.invoke_model(
            modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
            physique=json.dumps({
                "immediate": immediate,
                "max_tokens_to_sample": 4096,
                "temperature": 0.7,
                "top_p": 0.9,
            })
        )
        
        # Extract and return the abstract
        consequence = json.masses(response.get('physique').learn())
        return consequence.get('completion')

    A vital element of our answer is the automated redaction of PII. We carried out this utilizing Amazon Bedrock Guardrails to assist compliance with privateness laws:

    def apply_guardrail(bedrock_runtime, content material, guardrail_id):
    # Format content material in keeping with API necessities
    formatted_content = [{"text": {"text": content}}]
    
    # Name the guardrail API
    response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=guardrail_id,
    guardrailVersion="DRAFT",
    supply="OUTPUT",  # Utilizing OUTPUT parameter for correct move
    content material=formatted_content
    )
    
    # Extract redacted textual content from response
    if 'motion' in response and response['action'] == 'GUARDRAIL_INTERVENED':
    if len(response['outputs']) > 0:
    output = response['outputs'][0]
    if 'textual content' in output and isinstance(output['text'], str):
    return output['text']
    
    # Return unique content material if redaction fails
    return content material

    When PII is detected, it’s changed with sort indicators (for instance, {PHONE} or {EMAIL}), ensuring that summaries stay informative whereas defending delicate information.

    To handle the advanced processing pipeline, we use Step Features to orchestrate the Lambda features:

    {
    "Remark": "Audio Summarization Workflow",
    "StartAt": "TranscribeAudio",
    "States": {
    "TranscribeAudio": {
    "Sort": "Activity",
    "Useful resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
    "FunctionName": "WhisperTranscriptionFunction",
    "Payload": {
    "bucket": "$.bucket",
    "key": "$.key"
    }
    },
    "Subsequent": "IdentifySpeakers"
    },
    "IdentifySpeakers": {
    "Sort": "Activity",
    "Useful resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
    "FunctionName": "SpeakerIdentificationFunction",
    "Payload": {
    "Transcription.$": "$.Payload"
    }
    },
    "Subsequent": "GenerateSummary"
    },
    "GenerateSummary": {
    "Sort": "Activity",
    "Useful resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
    "FunctionName": "BedrockSummaryFunction",
    "Payload": {
    "SpeakerIdentification.$": "$.Payload"
    }
    },
    "Finish": true
    }
    }
    }

    This workflow makes positive every step completes efficiently earlier than continuing to the subsequent, with automated error dealing with and retry logic inbuilt.

    Check the answer

    After you may have efficiently accomplished the deployment, you should utilize the CloudFront URL to check the answer performance.

    Audio/video upload and summary interface with completed file upload for team meeting recording analysis

    Safety issues

    Safety is a vital side of this answer, and we’ve carried out a number of greatest practices to assist information safety and compliance:

    • Delicate information redaction – Robotically redact PII to guard person privateness.
    • High-quality-Grained IAM Permissions – Apply the precept of least privilege throughout AWS companies and assets.
    • Amazon S3 entry controls – Use strict bucket insurance policies to restrict entry to approved customers and roles.
    • API safety – Safe API endpoints utilizing Amazon Cognito for person authentication (non-obligatory however beneficial).
    • CloudFront safety – Implement HTTPS and apply trendy TLS protocols to facilitate safe content material supply.
    • Amazon Bedrock information safety – Amazon Bedrock (together with Amazon Bedrock Market) protects buyer information and doesn’t ship information to suppliers or prepare utilizing buyer information. This makes positive your proprietary info stays safe when utilizing AI capabilities.

    Clear up

    To stop pointless costs, ensure to delete the assets provisioned for this answer if you’re completed:

    1. Delete the Amazon Bedrock guardrail:
      1. On the Amazon Bedrock console, within the navigation menu, select Guardrails.
      2. Select your guardrail, then select Delete.
    2. Delete the Whisper Massive V3 Turbo mannequin deployed by means of the Amazon Bedrock Market:
      1. On the Amazon Bedrock console, select Market deployments within the navigation pane.
      2. Within the Managed deployments part, choose the deployed endpoint and select Delete.
    3. Delete the AWS CDK stack by operating the command cdk destroy, which deletes the AWS infrastructure.

    Conclusion

    This serverless audio summarization answer demonstrates the advantages of mixing AWS companies to create a classy, safe, and scalable utility. By utilizing Amazon Bedrock for AI capabilities, Lambda for serverless processing, and CloudFront for content material supply, we’ve constructed an answer that may deal with giant volumes of audio content material effectively whereas serving to you align with safety greatest practices.

    The automated PII redaction function helps compliance with privateness laws, making this answer well-suited for regulated industries similar to healthcare, finance, and authorized companies the place information safety is paramount. To get began, deploy this structure inside your AWS atmosphere to speed up your audio processing workflows.


    In regards to the Authors

    Kaiyin HuKaiyin Hu is a Senior Options Architect for Strategic Accounts at Amazon Net Companies, with years of expertise throughout enterprises, startups, {and professional} companies. At present, she helps clients construct cloud options and drives GenAI adoption to cloud. Beforehand, Kaiyin labored within the Good House area, helping clients in integrating voice and IoT applied sciences.

    Sid VantairSid Vantair is a Options Architect with AWS masking Strategic accounts.  He thrives on resolving advanced technical points to beat buyer hurdles. Exterior of labor, he cherishes spending time along with his household and fostering inquisitiveness in his kids.

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Oliver Chambers
    • Website

    Related Posts

    Construct a Textual content-to-SQL resolution for information consistency in generative AI utilizing Amazon Nova

    June 7, 2025

    Multi-account assist for Amazon SageMaker HyperPod activity governance

    June 7, 2025

    Implement semantic video search utilizing open supply giant imaginative and prescient fashions on Amazon SageMaker and Amazon OpenSearch Serverless

    June 6, 2025
    Leave A Reply Cancel Reply

    Top Posts

    SAVE Pupil Mortgage Replace: Do not Count on to Make Funds This Yr, however Do This One Factor ASAP

    June 7, 2025

    How AI is Redrawing the World’s Electrical energy Maps: Insights from the IEA Report

    April 18, 2025

    Evaluating the Finest AI Video Mills for Social Media

    April 18, 2025

    Utilizing AI To Repair The Innovation Drawback: The Three Step Resolution

    April 18, 2025
    Don't Miss

    SAVE Pupil Mortgage Replace: Do not Count on to Make Funds This Yr, however Do This One Factor ASAP

    By Sophia Ahmed WilsonJune 7, 2025

    Pla2na/Getty Pictures/CNETThere’s been numerous scholar mortgage chatter, however little readability for debtors enrolled within the…

    DragonForce Ransomware Reportedly Compromised Over 120 Victims within the Previous Yr

    June 7, 2025

    Greatest robotic vacuums and mops 2025: Examined on my tile and hardwood at house

    June 7, 2025

    CISA asks CISOs: Does that asset actually should be on the web?

    June 7, 2025
    Stay In Touch
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    • YouTube
    • Vimeo

    Subscribe to Updates

    Get the latest creative news from SmartMag about art & design.

    UK Tech Insider
    Facebook X (Twitter) Instagram Pinterest
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms Of Service
    • Our Authors
    © 2025 UK Tech Insider. All rights reserved by UK Tech Insider.

    Type above and press Enter to search. Press Esc to cancel.