Text-to-image basics with Amazon Nova Canvas

AI image generation has emerged as one of the transformative technologies of recent years, revolutionizing how you create and interact with visual content. Amazon Nova Canvas is a generative model in the suite of Amazon Nova creative models that enables you to generate realistic and creative images from plain text descriptions.

This post serves as a beginner’s guide to using Amazon Nova Canvas. We begin with the steps to get set up on Amazon Bedrock. Amazon Bedrock is a fully managed service that hosts leading foundation models (FMs) for various use cases such as text, code, and image generation; summarization; question answering; and custom use cases that involve fine-tuning and Retrieval Augmented Generation (RAG). In this post, we focus on the Amazon Nova image generation models available in AWS Regions in the US, specifically, the Amazon Nova Canvas model. We then provide an overview of the image generation process (diffusion) and dive deep into the input parameters for text-to-image generation with Amazon Nova Canvas.

Get started with image generation on Amazon Bedrock

Complete the following steps to get set up with access to Amazon Nova Canvas and the image playground:

Create an AWS account if you don’t have one already.
Open the Amazon Bedrock console as an AWS Identity and Access Management (IAM) administrator or appropriate IAM user.
Confirm and choose one of the Regions where the Amazon Nova Canvas model is available (for example, US East (N. Virginia)).
In the navigation pane, choose Model access under Bedrock configurations.

Under What’s Model access, choose Modify model access or Enable specific models (if not yet activated).

Select Nova Canvas, then choose Next.

On the Review and submit page, choose Submit.

Refresh the Base models
If you see the Amazon Nova Canvas model in the Access Granted status, you’re ready to proceed with the next steps.

In the navigation pane, choose Image / Video under Playgrounds.

Choose Select model, then choose Amazon and Nova Canvas. Then choose Apply.

You’re all set up to start generating images with Amazon Nova Canvas on Amazon Bedrock. The following screenshot shows an example of our playground.

Understanding the generation process

Amazon Nova Canvas uses diffusion-based approaches to generate images:

Starting point – The process begins with random noise (a pure static image).
Iterative denoising – The model gradually removes noise in steps, guided by your prompts. The amount of noise to remove at each step is learned during training. For instance, for a model to generate an image of a cat, it has to be trained on multiple cat images, iteratively inserting noise into each image until it is pure noise. By learning the amount of noise added at each step, the model effectively learns the reverse process: starting with a noisy image and iteratively subtracting noise to arrive at the image of a cat.
Text conditioning – The text prompt serves as the conditioning that guides the image generation process. The prompt is encoded as a numerical vector and referenced against similar vectors in a text-image embedding space that corresponds to images. Using these vectors, a noisy image is transformed into an image that captures the input prompt.
Image conditioning – In addition to text prompts, Amazon Nova Canvas also accepts images as inputs.
Safety and fairness – To comply with safety and fairness goals, both the prompt and the generated output image go through filters. If no filter is triggered, the final image is returned.
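
To make the noising and denoising idea concrete, the following is a toy sketch in plain Python. It is not how Amazon Nova Canvas is implemented: the "image" is a short list of numbers, and the reverse pass is handed the exact noise that was added, standing in for the noise a trained model would predict. All function names are hypothetical.

```python
import random

def forward_noising(image, steps, rng, scale=0.5):
    """Add Gaussian noise step by step, recording the noise added at each step."""
    noisy = list(image)
    noise_history = []
    for _ in range(steps):
        noise = [rng.gauss(0.0, scale) for _ in noisy]
        noisy = [x + n for x, n in zip(noisy, noise)]
        noise_history.append(noise)
    return noisy, noise_history

def reverse_denoising(noisy, noise_history):
    """Subtract the recorded noise in reverse order (a stand-in for model predictions)."""
    image = list(noisy)
    for noise in reversed(noise_history):
        image = [x - n for x, n in zip(image, noise)]
    return image

rng = random.Random(42)
original = [0.2, 0.8, 0.5, 0.1]
noisy, history = forward_noising(original, steps=10, rng=rng)
restored = reverse_denoising(noisy, history)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(f"max reconstruction error: {max_error:.2e}")
```

In a real diffusion model the noise at each step is not stored; it is estimated by a neural network conditioned on the prompt, which is why the result is a new image rather than a copy of a training image.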

Prompting basics

Image generation begins with effective prompting, the art of crafting text descriptions that guide the model toward your desired output. Well-constructed prompts include specific details about subject, style, lighting, perspective, mood, and composition, and work better when structured as image captions rather than a command or conversation. For example, rather than saying “generate an image of a mountain,” a more effective prompt might be “a majestic snow-capped mountain peak at sunset with dramatic lighting and wispy clouds, photorealistic style.” Refer to Amazon Nova Canvas prompting best practices for more information about prompting.

Let’s address the following prompt elements and observe their impact on the final output image:

Subject descriptions (what or who is in the image) – In the following example, we use the prompt “a cat sitting on a chair.”

Style references (photography, oil painting, 3D render) – In the following examples, we use the prompts “A cat sitting on a chair, oil painting style” and then “A cat sitting on a chair, anime style.”

Compositional elements and technical specifications (foreground, background, perspective, lighting) – In the following examples, we use the prompts “A cat sitting on a chair, mountains in the background,” and “A cat sitting on a chair, sunlight from the right low angle shot.”
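
The prompt elements above can be assembled programmatically. The following is a minimal sketch; the helper name build_prompt and its parameters are hypothetical and simply join the elements into a single caption-style string.

```python
def build_prompt(subject, style=None, composition=None):
    """Join the non-empty prompt elements into one caption-style string."""
    parts = [subject, style, composition]
    return ", ".join(p.strip() for p in parts if p and p.strip())

prompt = build_prompt(
    subject="A cat sitting on a chair",
    style="oil painting style",
    composition="mountains in the background, low angle shot",
)
print(prompt)
```

Keeping the elements separate makes it easy to change one of them (for example, the style) while holding the others fixed and comparing the results.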

Positive and negative prompts

Positive prompts tell the model what to include. These are the elements, styles, and characteristics you want to see in the final image. Avoid negation words like “no,” “not,” or “without” in your prompt. Amazon Nova Canvas has been trained on image-caption pairs, and captions rarely describe what isn’t in an image. Therefore, the model has never learned the concept of negation. Instead, use negative prompts to specify elements to exclude from the output.

Negative prompts specify what to avoid. Common negative prompts include “blurry,” “distorted,” “low quality,” “poor anatomy,” “bad proportions,” “disfigured hands,” or “extra limbs,” which help models avoid typical generation artifacts.

In the following examples, we first use the prompt “An aerial view of an archipelago,” then we refine the prompt as “An aerial view of an archipelago. Negative Prompt: Beaches.”

The balance between positive and negative prompting creates a defined creative space for the model to work within, often resulting in more predictable and desirable outputs.
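
As a rough illustration, the following sketch flags negation words in a positive prompt and moves the exclusion into the negativeText field used by the TEXT_IMAGE request. The helper names and the word list are hypothetical; the list contains only the three examples named in this post.

```python
import json
import re

NEGATION_WORDS = ("no", "not", "without")  # only the examples named in this post

def find_negations(prompt):
    """Return negation words that appear as whole words in the prompt."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return [w for w in words if w in NEGATION_WORDS]

def build_text_to_image_params(text, negative_text=None):
    """Build the textToImageParams part of a TEXT_IMAGE request."""
    params = {"text": text}
    if negative_text:
        params["negativeText"] = negative_text
    return params

print(find_negations("An aerial view of an archipelago without beaches"))
params = build_text_to_image_params("An aerial view of an archipelago", "beaches")
print(json.dumps(params))
```

A check like this is only a convenience for catching obvious cases before a request is sent; it does not understand the prompt.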

Image dimensions and aspect ratios

Amazon Nova Canvas is trained on 1:1, portrait, and landscape resolutions, with generation tasks having a maximum output resolution of 4.19 million pixels (that is, 2048×2048, 2816×1536). For editing tasks, the image should be no more than 4,096 pixels on its longest side, have an aspect ratio between 1:4 and 4:1, and have a total pixel count of 4.19 million or smaller. Understanding dimensional limitations helps avoid stretched or distorted results, particularly for specialized composition needs.
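
The editing constraints described above can be checked before sending a request. The following is a sketch under stated assumptions: the function name is hypothetical, and the exact pixel threshold (2048 × 2048 = 4,194,304) is an assumed reading of "4.19 million", so confirm the precise limits in the service documentation before relying on it.

```python
MAX_PIXELS = 2048 * 2048  # 4,194,304; assumed exact value behind "4.19 million"
MAX_SIDE = 4096           # longest side, in pixels
MAX_ASPECT = 4.0          # aspect ratio must lie between 1:4 and 4:1

def check_edit_dimensions(width, height):
    """Return a list of violated constraints (empty if the size is acceptable)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    problems = []
    if max(width, height) > MAX_SIDE:
        problems.append("longest side exceeds 4,096 pixels")
    if max(width, height) / min(width, height) > MAX_ASPECT:
        problems.append("aspect ratio outside 1:4 to 4:1")
    if width * height > MAX_PIXELS:
        problems.append("pixel count exceeds 4.19 million")
    return problems

for size in [(1280, 720), (4096, 1024), (4096, 512), (3000, 2000)]:
    print(size, check_edit_dimensions(*size) or "ok")
```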

Classifier-free guidance scale

The classifier-free guidance (CFG) scale controls how strictly the model follows your prompt:

Low values (1.1–3) – More creative freedom for the AI, potentially more aesthetic, but low contrast and less prompt-adherent results
Medium values (4–7) – Balanced approach, typically recommended for most generations
High values (8–10) – Strict prompt adherence, which can produce more precise results but sometimes at the cost of natural aesthetics and with increased color saturation

In the following examples, we use the prompt “Cherry blossoms, bonsai, Japanese style landscape, high resolution, 8k, lush greens in the background.”

The first image with CFG 2 captures some elements of cherry blossoms and bonsai. The second image with CFG 8 adheres more to the prompt with a potted bonsai, more pronounced cherry blossom flowers, and lush greens in the background.

Think of the CFG scale as adjusting how literally the model takes your instructions vs. how much artistic interpretation it applies.
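
To compare CFG values side by side, you can prepare one request per value while holding the prompt and seed fixed. The following sketch only builds the request bodies and makes no call to Amazon Bedrock; the helper name build_cfg_sweep is hypothetical, and the field names follow the TEXT_IMAGE request used in this post.

```python
import json

def build_cfg_sweep(prompt, cfg_values, seed=0):
    """Return one TEXT_IMAGE request body (JSON string) per CFG value."""
    bodies = []
    for cfg in cfg_values:
        bodies.append(json.dumps({
            "taskType": "TEXT_IMAGE",
            "textToImageParams": {"text": prompt},
            "imageGenerationConfig": {
                "numberOfImages": 1,
                "cfgScale": float(cfg),  # the only value that changes
                "seed": seed,            # fixed so differences come from cfgScale
            },
        }))
    return bodies

bodies = build_cfg_sweep("Cherry blossoms, bonsai, Japanese style landscape", [2, 5, 8])
for body in bodies:
    print(json.loads(body)["imageGenerationConfig"])
```

Each body can then be passed to InvokeModel in turn, and the resulting images compared.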

Seed values and reproducibility

Every image generation begins with a randomization seed, essentially a starting number that determines initial conditions:

Seeds are typically represented as long integers (for example, 1234567890)
Using the same seed, prompt, and parameters reproduces identical images every time
Saving seeds allows you to revisit successful generations or create variations on promising results
Seed values have no inherent quality; they’re simply different starting points

Reproducibility through seed values is essential for professional workflows: it lets you make refined iterations on the prompt or other input parameters and clearly see their effect, rather than getting completely random generations. The following images are generated using two slightly different prompts (“A portrait of a woman smiling” vs. “A portrait of a woman laughing”), while holding the seed value and all other parameters constant.
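
The role of a seed can be illustrated with Python's own random number generator. This is a general illustration of seeding, not a description of Amazon Nova Canvas internals; the function name initial_noise is hypothetical.

```python
import random

def initial_noise(seed, size=5):
    """Draw a small 'noise image' from a generator initialized with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

same_a = initial_noise(1234567890)
same_b = initial_noise(1234567890)
different = initial_noise(42)

print(same_a == same_b)     # same seed, same starting noise
print(same_a == different)  # different seed, different starting noise
```

Because the starting noise is identical for a given seed, any difference between two generations can be attributed to the prompt or parameter you changed.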

All preceding images in this post have been generated using the text-to-image (TEXT_IMAGE) task type of Amazon Nova Canvas, available through the Amazon Bedrock InvokeModel API. The following is the API request and response structure for image generation:

#Request Structure
{
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": string,         #Positive Prompt
        "negativeText": string  #Negative Prompt
    },
    "imageGenerationConfig": {
        "width": int,           #Image width in pixels
        "height": int,          #Image height in pixels
        "quality": "standard" | "premium",   #Image Quality
        "cfgScale": float,      #Classifier Free Guidance Scale
        "seed": int,            #Seed value
        "numberOfImages": int   #Number of images to be generated (max 5)
    }
}
#Response Structure
{
    "images": string[], #list of Base64 encoded images
    "error": string
}
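
A response with this structure can be handled with a small helper such as the following sketch. The function name decode_images is hypothetical, and treating any non-empty error value as fatal is an assumption of this sketch rather than documented service behavior. The example uses a fabricated response so that it runs without calling Amazon Bedrock.

```python
import base64

def decode_images(response_json):
    """Return decoded image bytes, or raise if the response reports an error."""
    error = response_json.get("error")
    if error:
        raise RuntimeError(f"Image generation failed: {error}")
    return [base64.b64decode(item) for item in response_json.get("images", [])]

# Fabricated response for illustration only; real responses contain image data
fake_response = {"images": [base64.b64encode(b"not-a-real-image").decode("ascii")]}
print(decode_images(fake_response))
```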

Code example

This solution can also be tested locally with a Python script or a Jupyter notebook. For this post, we use an Amazon SageMaker AI notebook using Python (v3.12). For more information, see Run example Amazon Bedrock API requests using an Amazon SageMaker AI notebook. For instructions to set up your SageMaker notebook instance, refer to Create an Amazon SageMaker notebook instance. Make sure the instance is set up in the same Region where Amazon Nova Canvas access is enabled.

For this post, we create a region variable to match the Region where Amazon Nova Canvas is enabled (us-east-1). You must modify this variable if you’ve enabled the model in a different Region. The following code demonstrates text-to-image generation by invoking the Amazon Nova Canvas v1.0 model using Amazon Bedrock. To understand the API request and response structure for different types of generations, parameters, and more code examples, refer to Generating images with Amazon Nova.

import base64  #For encoding/decoding base64 data
import io  #For handling byte streams
import json  #For JSON processing
import boto3  #AWS SDK for Python
from PIL import Image  #Python Imaging Library for image processing
from botocore.config import Config  #For AWS client configuration

#Create a variable to fix the region to where Nova Canvas is enabled
region = "us-east-1"

#Set up an Amazon Bedrock runtime client
client = boto3.client(service_name="bedrock-runtime", region_name=region, config=Config(read_timeout=300))

#Set the content type and accept headers for the API call
accept = "application/json"
content_type = "application/json"

#Define the prompt for image generation
prompt = """A cat sitting on a chair, mountains in the background, low angle shot."""

#Create the request body with generation parameters
api_request = json.dumps({
        "taskType": "TEXT_IMAGE",  #Specify text-to-image generation
        "textToImageParams": {
            "text": prompt
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,   #Generate one image
            "height": 720,         #Image height in pixels
            "width": 1280,         #Image width in pixels
            "cfgScale": 7.0,       #CFG Scale
            "seed": 0              #Seed number for generation
        }
})
#Call the Bedrock model to generate the image
response = client.invoke_model(
    body=api_request,
    modelId="amazon.nova-canvas-v1:0",
    accept=accept,
    contentType=content_type,
)

#Parse the JSON response
response_json = json.loads(response.get("body").read())

#Extract the base64-encoded image from the response
base64_image = response_json.get("images")[0]
#Convert the base64 string to ASCII bytes
base64_bytes = base64_image.encode("ascii")
#Decode the base64 bytes to get the actual image bytes
image_data = base64.b64decode(base64_bytes)

#Convert bytes to an image object
output_image = Image.open(io.BytesIO(image_data))
#Display the image
output_image.show()
#Save the image to the current working directory
output_image.save("output_image.png")

Clean up

When you have finished testing this solution, clean up your resources to prevent AWS charges from being incurred:

Back up the Jupyter notebooks in the SageMaker notebook instance.
Shut down and delete the SageMaker notebook instance.

Cost considerations

Consider the following costs from the solution deployed on AWS:

You’ll incur charges for generative AI inference on Amazon Bedrock. For more details, refer to Amazon Bedrock pricing.
You’ll incur charges for your SageMaker notebook instance. For more details, refer to Amazon SageMaker pricing.

Conclusion

This post introduced you to AI image generation, and then provided an overview of accessing image models available on Amazon Bedrock. We then walked through the diffusion process and key parameters with examples using Amazon Nova Canvas. The code template and examples demonstrated in this post aim to get you familiar with the basics of Amazon Nova Canvas and get you started with your AI image generation use cases on Amazon Bedrock.

For more details on text-to-image generation and other capabilities of Amazon Nova Canvas, see Generating images with Amazon Nova. Give it a try and let us know your feedback in the comments.

About the Author

Arjun Singh is a Sr. Data Scientist at Amazon, experienced in artificial intelligence, machine learning, and business intelligence. He is a visual person and deeply curious about generative AI technologies in content creation. He collaborates with customers to build ML and AI solutions to achieve their desired outcomes. He graduated with a Master’s in Information Systems from the University of Cincinnati. Outside of work, he enjoys playing tennis, working out, and learning new skills.
