Well-written copy with proper grammar leaves a strong impression on the reader.
In the same way, Stable Diffusion prompt grammar plays a crucial role in helping your model understand what image you want and how you want it generated.
However, prompt grammar is not the same as linguistic grammar; it follows its own logic, which you typically learn through experimentation, and that is a time-consuming process.
That’s why I’m here to share what our team has learned from testing the grammatical elements of prompts.
Stable Diffusion Prompt Grammar
Stable Diffusion prompt grammar is a set of guidelines for constructing prompts so that the generated image matches the outcome you have in mind.
So, to compose a proper Stable Diffusion prompt and achieve the desired image, we should bear in mind three key elements while formulating a prompt:
- Syntax (structure),
- Punctuation,
- and Modifiers.
If you are in a hurry and want a quick answer for all the modifiers, check the table below. For a detailed, practical explanation, read the full article.
Modifier | Effect | Example |
---|---|---|
Comma (,) | Soft separator | a cat, dog |
Semicolon (;) | Hard separator | River; mountain |
Full stop (.) | Hard separator | Lion. Tiger |
Exclamation mark (!) | Conveys a sense of emphasis | cityscape at night! |
Colon (:) | Increases the weight of the subject | a cat eye:2 |
Parentheses (()) | Increases the weight of the subject | ((egg)), bread |
Square brackets ([ ]) | Decreases the weight of the subject | a [cat] eye |
Pipe (\|) | Blends multiple concepts | vivid sunset \| over mountain |
Now, let’s practically explore all the key points, starting from prompt syntax, as every work begins with its structure.
Stable Diffusion Prompt Syntax
Prompt syntax is the blueprint that guides the Stable Diffusion model in understanding and responding to your input. The basics are clear and concise language, a specified context, and relevant details.
So, let’s go through the process of gradually refining a prompt:
Step 1: Specify Style and Theme:
Prompt: “A landscape in a surrealistic style.”
To test the process, I am using Automatic1111 and intentionally avoiding deliberate punctuation in prompts, because I will cover punctuation in detail later.
Since I want to generate a realistic image, I am using the following negative prompt:
“blurry, unrealistic, cartoon, anime, low quality, bad anatomy, painting, drawing, poor quality.”
Step 2: Add Descriptive Elements:
“A surrealistic landscape with swirling clouds and bold, contrasting colors.”
By incorporating descriptive elements like “swirling clouds” and “bold, contrasting colors”, I am trying to add more clarity to the image.
And you can see that this step really helps in setting the mood and visualizing specific features.
Step 3: Experiment with Techniques: We also can add specific artistic techniques, like impasto brush strokes.
So, the prompt will be “A surrealistic landscape with swirling clouds and bold, contrasting colors using impasto brush strokes.”
This actually encourages the model to incorporate a particular texture into the artwork, adding an additional layer of detail.
Now, since many of us add words like ‘create’ or ‘generate’ at the start of prompts, let’s explore how these words affect the result:
As you can see, these words add almost nothing, because Stable Diffusion models understand the prompt well enough without them.
So, after working through these steps, we can say that a Stable Diffusion prompt may follow this structure:
[Style and Theme], [Descriptive Elements], [Techniques].
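The template above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `build_prompt` and its parameters are hypothetical, and joining the slots with commas simply mirrors the template line; it is not something Automatic1111 requires.

```python
def build_prompt(style_and_theme, descriptive_elements=(), techniques=()):
    """Join the three template slots into one comma-separated prompt.

    Hypothetical helper for illustration; not part of any Stable Diffusion tool.
    """
    parts = [style_and_theme, *descriptive_elements, *techniques]
    # Drop empty entries and stray whitespace so no dangling commas appear.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a surrealistic landscape",
    ["swirling clouds", "bold contrasting colors"],
    ["impasto brush strokes"],
)
print(prompt)
```

Keeping the slots separate makes it easy to swap one element (for example, the technique) while holding the rest of the prompt constant between test runs.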
As I mentioned, I avoided deliberate prompt punctuation in the discussion above, so let’s address it now.
Stable Diffusion Prompt Punctuation
Engaging with my team and actively participating in various AI art forums, I’ve observed that we employ a range of punctuation marks to instruct and steer the model’s creativity.
Here are some commonly used punctuation marks that I’ve identified and tested in stable diffusion prompts:
- Commas (,),
- Semicolon (;), and Full stop(.),
- Exclamation Mark (!),
- Colon (:),
- Parentheses (()),
- and Bracket Notation[].
So, let’s start with the comma (,) and a very basic question:
Does a Stable Diffusion Prompt Need Commas (,)?
No, commas are not strictly necessary in Stable Diffusion prompts, but they can be used to improve the readability and organization of your prompts.
Commas are generally used to separate concepts in the prompt to make it easier for the model to understand what you are trying to generate.
For example, the prompt “a cow, a pig, and a goat” is clearer than the prompt “a cow pig goat.”
Now, let’s see how commas affect prompts as well as images.
1. Separate Concepts with Commas:
Initial Prompt: “A serene landscape with mountains, a flowing river, and a clear blue sky.”
In this prompt, commas are used to separate distinct elements, such as mountains, a flowing river, and a clear blue sky.
2. Control Ordering for Token Interaction:
Enhanced Prompt: “Mountains, a flowing river, and a clear blue sky compose a serene landscape.”
Observe how the ordering of words influences token interaction. However, its impact was minimal in my tests.
3. Consider Context for Meaning:
Contextual Prompt: “An image featuring a serene landscape with mountains, a flowing river, and a clear blue sky.”
So, it is clear that adding context helps guide the model in understanding the specific requirements, resulting in a more meaningful generated image.
4. Combine Words Effectively with Commas:
Effective Combination Prompt: “Majestic mountains, a gently flowing river, and a vividly clear blue sky define the tranquil beauty of the landscape.”
In practice, this behaves much like the first case, where commas simply separate concepts.
Until now, I’ve essentially tested various use cases for a specific prompt, and you can observe that all the techniques generate almost the same result.
Only “Context for Meaning” is proving to be slightly more effective.
Now I am going to test commas in different prompts in special cases.
5. To separate adjectives that modify a noun:
Example Prompt: “A mysterious, ancient artifact lies hidden in the dark, forgotten chambers of the ancient temple.”
Here, commas separate the adjectives “mysterious” and “ancient”, which modify the noun “artifact”, and the adjectives “dark” and “forgotten”, which modify the noun “chambers.”
6. Commas to set off non-essential clauses:
Example Prompt: “The protagonist, who had faced numerous challenges, emerged victorious in the end.”
Here, the clause set off by commas provides additional information about the protagonist without altering the core meaning of the sentence.
Semicolon (;), and Full stop(.)
Both semicolons (;) and full stops (.) are considered hard separators in prompts, meaning they create clear and distinct breaks between different elements or concepts.
Example:
Example Prompt using Semicolon (;): “Depths of the ancient cavern; discover hidden treasures and unravel the secrets within.”
Because both separators behave almost identically, I only tested the semicolon; you can choose either one depending on your prompt structure.
Exclamation Mark (!) and Pipe(|)
Using a pipe “|” in prompts allows you to blend multiple concepts or ideas into a single prompt, giving the model a more comprehensive set of instructions.
Actually, each segment separated by the pipe is treated as a distinct prompt within the overall instruction.
Let’s look at an example:
Example Prompt with Pipe: “A vivid sunset over the mountains | Include a calm lake reflecting the colors of the sky | Integrate a silhouette of a lone tree on the horizon.”
In this example, the pipe “|” is used to blend three distinct prompts into one cohesive instruction for the model.
And it actually did.
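To make the idea of "distinct segments" concrete, here is a minimal sketch of how a prompt could be split on the pipe. The helper `split_segments` is hypothetical and only illustrates the convention described above; it does not reproduce how any particular front end parses the pipe internally.

```python
def split_segments(prompt, separator="|"):
    """Split a prompt into its pipe-separated segments (illustrative only)."""
    # Strip whitespace around each segment and skip empty ones.
    return [seg.strip() for seg in prompt.split(separator) if seg.strip()]


example = (
    "A vivid sunset over the mountains | "
    "Include a calm lake reflecting the colors of the sky | "
    "Integrate a silhouette of a lone tree on the horizon."
)
segments = split_segments(example)
for number, segment in enumerate(segments, start=1):
    print(number, segment)
```

Looking at the segments one by one is a quick way to check that each of them would also make sense as a standalone prompt.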
The exclamation mark “!” in a prompt is typically used to convey a sense of emphasis or urgency.
Example Prompt: “A dynamic cityscape at night! Emphasize the bright lights of skyscrapers, bustling streets, and the energetic atmosphere.”
You can see the exclamation mark injects a sense of enthusiasm or urgency into the prompt, guiding the model to prioritize certain aspects.
Now, as the colon (:), parentheses (()), and bracket notation ([ ]) are generally used for Stable Diffusion prompt weights in Automatic1111, I discuss them in the prompt weight section below.
Stable Diffusion Prompt Weights
1. Colon (:): The colon is used to assign a weight or importance to a specific word or concept in the prompt.
Prompt: `cat:2.0, playful, eyes:1.8, curious`
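In this prompt, “cat” is assigned a weight of 2.0, emphasizing its importance.
The term “playful” is included without explicit weighting.
The word “eyes” is given a weight of 1.8, suggesting a focus on the cat’s eyes, and in my test the prompt did its job.
Note that Automatic1111 documents numeric weights in the form `(cat:2.0)`, with the word and the number inside parentheses, so that form is the safer choice if a bare colon does not seem to have any effect.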
2. Parentheses (): Parentheses are also used to increase the attention of the subject.
Prompt: `(((mysterious cat))), ((River in a dark forest)), (moonlight)`
Because the overall theme of the image is dark, the cat does not stand out as much as I’d like.
However, the focus on the river is excellent and shows what the parentheses can do.
3. Bracket Notation []: Bracket notation can switch from one subject or concept to another at a given step in the generation process, using the form `[from:to:when]`.
Plain brackets around a word, such as `[word]`, can also be used to decrease the attention of a subject.
Prompt: `[sunset:beach:12] (palm trees) (calm atmosphere), waves:1.5`
This prompt uses bracket notation to transition from “sunset” to “beach” at step 12.
The concepts “palm trees” and “calm atmosphere” are wrapped in parentheses and stay in effect throughout the generation.
And the word “waves” is given extra emphasis with a weight of 1.5, suggesting a focus on the beach’s waves.
Hence, this notation allows for a controlled progression from a sunset to a beach scene with specified elements, culminating in a focus on the waves.
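The step-based switch can be sketched in a few lines. This is a simplified illustration, not Automatic1111's actual parser: the function `prompt_at_step` is hypothetical, steps are assumed to be counted from 1, the exact boundary step is an assumption, and values below 1 are treated as a fraction of the total steps.

```python
import re

# Matches [from:to:when], where "when" is a step number or a fraction below 1.
EDIT_PATTERN = re.compile(r"\[([^\[\]:|]*):([^\[\]:|]*):(\d*\.?\d+)\]")


def prompt_at_step(prompt, step, total_steps):
    """Return the prompt text that is active at a given sampling step."""

    def swap(match):
        before, after, when = match.group(1), match.group(2), float(match.group(3))
        # Assumption: a value below 1 means a fraction of the total steps.
        switch_step = when * total_steps if when < 1 else when
        return before if step <= switch_step else after

    return EDIT_PATTERN.sub(swap, prompt)


example = "[sunset:beach:12] (palm trees) (calm atmosphere), waves:1.5"
early = prompt_at_step(example, step=5, total_steps=30)
late = prompt_at_step(example, step=20, total_steps=30)
print(early)
print(late)
```

With 30 sampling steps, the early steps compose a sunset and the later steps refine the image toward a beach, which matches the controlled progression described above.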
These notations can also be combined and adjusted based on your specific goals.
So, let’s experiment with different groupings to achieve the desired results.
How to change Prompt Weights in Automatic1111
Combining modifiers allows you to have more fine-grained control over attention and emphasis in your prompts.
Here’s how you can combine modifiers with examples:
1. Combining Parentheses and Colon:
Example: `a (cat:2.0), ((playful)), a (eyes:1.5), (curious)`
In this example, both “cat” and “playful” receive increased attention through different modifiers: “cat” has an explicit weight of 2.0, while the double parentheses give “playful” a factor of 1.21 (1.1 × 1.1).
The “eyes” have an explicit weight of 1.5, and “curious” gets a factor of 1.1 from its single pair of parentheses.
For a clearer understanding of this section, read our guide on Stable Diffusion Lighting.
2. Combining Square Brackets and Colon:
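Example: `a [cat:0.8], [playful], (eyes), curious`
In this context, “playful” has reduced attention (plain square brackets divide attention by 1.1), “eyes” has heightened attention (factor of 1.1), and “curious” keeps the default attention. One caveat about `[cat:0.8]`: Automatic1111 documents a number after a colon inside square brackets as prompt editing, meaning the word is added partway through the generation, not as a weight of 0.8. To reduce attention by an explicit factor, the documented form is `(cat:0.8)`.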
3. Combining Multiple Modifiers on a Single Prompt:
Example: `a (cat:2.0), ((playful)), a [eyes:0.7], (curious)`
In this case, “cat” has an explicit weight of 2.0, “playful” has increased attention (factor of 1.21), and “curious” has slightly increased attention (factor of 1.1). I intended “eyes” to have decreased attention (factor of 0.7), but Automatic1111 documents `[eyes:0.7]` as prompt editing; `(eyes:0.7)` is the documented way to set a weight of 0.7.
The choice of modifier matters. For example, `(word:1.5)` would increase attention by a factor of 1.5, but `((word))` would increase attention by a factor of 1.21 (1.1 × 1.1).
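The arithmetic behind these factors can be sketched as follows. This is a rough illustration for a single token, assuming the factors documented for Automatic1111 (each pair of parentheses multiplies attention by 1.1, each pair of square brackets divides it by 1.1, and `(word:number)` sets the weight directly). The function `effective_weight` is hypothetical and is not the tool's real parser.

```python
import re

# Explicit form: (text:number), for example (word:1.5).
EXPLICIT = re.compile(r"\(([^()\[\]:]+):(\d*\.?\d+)\)")


def effective_weight(token):
    """Estimate the attention factor applied to one weighted token."""
    token = token.strip()
    explicit = EXPLICIT.fullmatch(token)
    if explicit:
        return float(explicit.group(2))
    weight = 1.0
    # Peel off one matching outer pair at a time.
    while len(token) >= 2:
        if token[0] == "(" and token[-1] == ")":
            weight *= 1.1
        elif token[0] == "[" and token[-1] == "]":
            weight /= 1.1
        else:
            break
        token = token[1:-1]
    return round(weight, 4)


for token in ["(word:1.5)", "((word))", "(((mysterious cat)))", "[word]", "word"]:
    print(token, effective_weight(token))
```

Because nesting compounds multiplicatively, three pairs of parentheses already give roughly 1.33, so an explicit number is usually easier to read and tune than deep nesting.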
Final Verdict:
In conclusion, feel free to experiment with different elements and adjust the prompt’s wording based on the model’s outputs.
Also, experiment with different combinations of modifiers based on your specific requirements.
If I have missed anything, please feel free to let me know by commenting below.
Hi there! I’m Zaro, the passionate mind behind aienthusiastic.com. With a background in Electronics Science, I’ve had the privilege of delving deep into AI and ML. And this blog is my platform to share my enthusiasm with you.