How to prompt AI for better images for online articles

Erwin SOTIRI
Apr 30, 2024

Updated: May 27, 2024

A guide for lawyers and other non-professional content creators

Images are powerful tools to capture the attention and interest of readers, as well as to convey the main message and tone of an article. However, not all images are equally effective, and some can even detract from the quality and credibility of the content. This is especially true when using AI to generate images from text prompts, as the results can be unpredictable, generic, or irrelevant. It is therefore essential to learn how to prompt AI for better images for online webinars, presentations and articles.

Why is prompting AI important and what are the common challenges?

Prompting AI is the process of providing a text input to an AI system that can generate images from text, such as OpenAI's DALL-E or Stable Diffusion. The text input can be a word, a phrase, a sentence, or a paragraph, depending on the system and the desired output. The text input serves as a guide or a hint for the AI to produce an image that matches the intended meaning and style.

Prompting AI is important because it can help us create images that are more relevant, specific, and creative for our articles, without relying on stock photos or generic icons. However, prompting AI is also challenging, because the AI system may not always understand the text input the same way we do, or may not be able to generate an image that satisfies our expectations. Some of the common challenges are:

Lack of imagination: The AI system may generate images that are too literal, simplistic, or clichéd, such as shields, coins, symbols, or cartoons, that do not express anything other than the lack of ideas.
Lack of relevance: The AI system may generate images that are unrelated, inappropriate, or misleading for the article's topic, tone, or audience, such as images that are too abstract, humorous, or offensive.
Lack of quality: The AI system may generate images that are low-resolution, blurry, distorted, or inconsistent, such as images that are pixelated, noisy, or have mismatched colors or styles.

How to use context, keywords, and modifiers to guide the AI?

To overcome these challenges, we need to learn how to prompt the AI more effectively, using context, keywords, and modifiers. These are elements that we can add to our text input to provide more information, direction, and variation for the AI to generate better images. Let's look at each of them in more detail.

Context

The context offers essential background to help the AI understand the depth and implications of the article. For an article discussing new regulations in financial technology, the context could be: "An exploration of recent changes in financial regulations affecting cryptocurrency exchanges in the European Union."

Keywords

Keywords are the main words or phrases that describe the image we want to generate. They can be nouns, verbs, adjectives, or adverbs, and they help the AI narrow down the possible images that fit the text input by signalling the key elements it needs to focus on. For an article on financial regulations, the keywords might include: "cryptocurrency, regulation, EU, compliance, technology."

Modifiers

Modifiers are the additional words or phrases that modify the keywords or the image we want to generate. Modifiers can be colors, shapes, sizes, styles, or any other attributes that can help the AI adjust the appearance or the mood of the image. Suitable modifiers for this topic could be: "formal, sophisticated, high-tech, intricate, legal."
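For readers who keep their prompts in a script or a spreadsheet export, the three elements can be assembled mechanically. The following is only a minimal sketch: the function name `build_prompt` and the "Key elements:" and "Style:" labels are our own illustrative choices, not a format required by any image tool.

```python
def build_prompt(context: str, keywords: list[str], modifiers: list[str]) -> str:
    """Combine context, keywords, and modifiers into one prompt string.

    The "Key elements:" and "Style:" labels are hypothetical; image tools
    accept free text, so any clear wording would do.
    """
    # Make sure the context ends with exactly one full stop.
    parts = [context.strip().rstrip(".") + "."]
    if keywords:
        parts.append("Key elements: " + ", ".join(keywords) + ".")
    if modifiers:
        parts.append("Style: " + ", ".join(modifiers) + ".")
    return " ".join(parts)


# The example values below are the ones used in this article.
prompt = build_prompt(
    context="An exploration of recent changes in financial regulations "
            "affecting cryptocurrency exchanges in the European Union.",
    keywords=["cryptocurrency", "regulation", "EU", "compliance", "technology"],
    modifiers=["formal", "sophisticated", "high-tech", "intricate", "legal"],
)
print(prompt)
```

The resulting text can be pasted into any image tool. Keeping the three elements separate makes it easy to change one of them, for instance the modifiers, without rewriting the rest.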

Integrating these elements

For example, you might believe that a long prompt such as the following is specific enough:

"Create an image that reflects the sophisticated and intricate environment of cryptocurrency regulation within the EU. The image should portray a high-tech, formal setting emphasizing compliance and legal oversight, capturing the essence of advanced financial technology in a regulatory context."

The above prompt produces the following image:

[Image: generic prompt result from a minimal description of context]

It is better than a prompt of only a few words, but it is still too generic. A better prompt would be:

“Envision a futuristic EU regulatory hub where fintech innovators and regulators converge. Depict a sleek, modern space with:

Holographic displays, AR interfaces, and minimalist workstations
Diverse professionals engaged in collaborative discussions, surrounded by real-time data streams, blockchain visualizations, and AI-driven risk assessments
Subtle nods to EU institutions, such as the European Parliament's hemicycle or the European Central Bank's euro symbol
Calming blues and whites, punctuated by gold and silver accents, evoking luxury, innovation, and precision
Futuristic lighting effects and curved lines to evoke fluidity and forward-thinking

Capture the essence of advanced financial technology in a regulatory context, conveying the EU's commitment to innovation, stability, and security in the cryptocurrency landscape.”

That prompt produces the following generations on Midjourney:

[Image: an advanced and descriptive prompt result, slightly better than the first]

The main difference is that the second prompt was generated from the first one using artificial intelligence. You can, of course, ask AI to improve your own prompts. A prompt can also be condensed into a single phrase, either with AI or manually, such as:

"Futuristic writer with AI collaboration amidst image concepts, set against a cityscape at midday daylight"

On the other hand, Midjourney has a peculiar, recognisable style that many may not like. The same prompt can, however, be used in other AI image tools, including ChatGPT.

This method ensures that the images generated are not only relevant but also enhance the article's ability to convey complex legal and regulatory concepts effectively, making them more accessible to readers interested in financial technology law.

Crafting an image free from the quirks of AI and random artefacts often requires a few attempts—a task somewhat akin to wrestling with a Word 2010 page layout, but arguably with less swearing involved. Optimistically, we might see improvements... say, sometime next year?

How to evaluate and refine the generated images?

Once we have prompted the AI with our text input, we can expect to see a range of images that the AI has generated based on our input. However, not all of them may be suitable or satisfactory for our article. Therefore, we need to evaluate and refine the generated images, using the following RSCQ criteria:

Relevance: The image should be relevant to the article's topic, tone, and audience. It should convey the main message and emotion of the article, and not distract or confuse the readers. For example, if the article is about the benefits of meditation, the image should not show something that is unrelated, inappropriate, or misleading, such as a person playing video games, a person in pain, or a person in a war zone.
Specificity: The image should be specific to the article's content and context. It should not be too generic, vague, or ambiguous, such as a shield, a coin, a symbol, or a cartoon. It should also not be too similar to other images that are commonly used for the same or similar topics, such as a person sitting cross-legged, a lotus flower, or a sunset.
Creativity: The image should be creative and original, showing something that is not obvious, expected, or clichéd. It should surprise, intrigue, or delight the readers, and make them want to read the article. For example, if the article is about the benefits of meditation, the image could show something that is unusual, unexpected, or novel, such as a person meditating underwater, a person meditating with animals, or a person meditating in space.
Quality: The image should be high-quality and clear, showing its details and features. It should not be low-resolution, blurry, distorted, or inconsistent, for example pixelated, noisy, or with mismatched colors or styles. It should also not have any errors, glitches, or artifacts, such as missing parts, overlapping elements, or unnatural shapes.
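When several candidate images have to be compared, the four RSCQ criteria can be turned into a simple scorecard. The sketch below is one possible way to do it, not an established method: the 1 to 5 scale, the equal weighting, and the names `RscqScore` and `pick_best` are all assumptions made for illustration, and the ratings themselves still come from a human reviewer.

```python
from dataclasses import dataclass


@dataclass
class RscqScore:
    """A reviewer's ratings for one generated image (1 = poor, 5 = excellent).

    The 1-5 scale and equal weighting are assumptions for this sketch.
    """
    label: str
    relevance: int
    specificity: int
    creativity: int
    quality: int

    def total(self) -> int:
        return self.relevance + self.specificity + self.creativity + self.quality

    def weakest(self) -> str:
        """Name of the lowest-rated criterion, i.e. what to fix first."""
        scores = {
            "relevance": self.relevance,
            "specificity": self.specificity,
            "creativity": self.creativity,
            "quality": self.quality,
        }
        return min(scores, key=scores.get)


def pick_best(candidates: list[RscqScore]) -> RscqScore:
    """Return the candidate with the highest total score."""
    return max(candidates, key=lambda c: c.total())


# Hypothetical ratings for two images generated for a meditation article.
candidates = [
    RscqScore("lotus flower at sunset", relevance=3, specificity=1, creativity=2, quality=4),
    RscqScore("person meditating underwater", relevance=4, specificity=4, creativity=5, quality=3),
]
best = pick_best(candidates)
print(best.label, best.total(), "weakest:", best.weakest())
```

The weakest criterion of the chosen image indicates what to address in the next round, for example re-generating at a higher resolution when quality scores lowest.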

To refine the generated images, we can use the following methods:

Re-prompting: We can re-prompt the AI with a different text input, using different context, keywords, or modifiers, or adding more information, direction, or variation. For example, if the image is too generic, we can add more specific keywords or modifiers, such as "meditation with headphones, green, oval, small, abstract, relaxing".
Re-generating: We can re-generate the images with the same text input, using a different AI system, model, or parameter, or changing the number, size, or format of the images. For example, if the image is too low-quality, we can request a higher resolution, a larger size, or a different file format, such as PNG, JPEG, or SVG.
Re-selecting: We can re-select the images from the range of images that the AI has generated, using a different criterion, preference, or feedback. For example, if the image is too similar to other images, we can choose a more creative, original, or novel image, or ask for a second opinion from a colleague, a friend, or a reader.

How to avoid common pitfalls and mistakes?

Finally, we need to be aware of some common pitfalls and mistakes that can affect the quality and effectiveness of the images we generate with AI. Here are some tips to avoid them:

Do not copy and paste the text from the article as the text input for the AI. This can result in very generic, irrelevant, or misleading images, as the AI may not be able to extract the main keywords or modifiers from the text, or may generate images based on minor or irrelevant details. Instead, use a summary, a title, or a main point of the article as the context, and add the keywords and modifiers that describe the image you want to generate.
Do not use too many or too few words in the text input for the AI. Too many words can result in complex, cluttered, or confusing images, and too few in simple, boring, or empty ones, as the AI may not be able to balance the elements or the space in the image, or may generate images that are too literal or too abstract. Instead, use a moderate number of words, around 10 to 20, that provide enough information, direction, and variation for the AI to generate a better image.
Do not use words that are too vague, ambiguous, or subjective in the text input for the AI. This can result in inconsistent, unpredictable, or unsatisfactory images, as the AI may not be able to interpret the meaning or the style of the words, or may generate images that do not match your expectations or preferences. Instead, use words that are clearer, more specific, and more objective, so that the AI can generate an image that is more relevant, specific, and creative.
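Two of the tips above, the word count and the vague wording, lend themselves to a quick automated check before a prompt is submitted. The following is a rough sketch only: the function name `check_prompt` and the short `VAGUE_WORDS` list are hypothetical, and a real list of vague words would need to be tailored to your own writing.

```python
import re

# Hypothetical, deliberately short list of vague or subjective words.
VAGUE_WORDS = {"nice", "good", "beautiful", "interesting", "cool", "great"}


def check_prompt(prompt: str, min_words: int = 10, max_words: int = 20) -> list[str]:
    """Return a list of warnings; an empty list means no issue was found.

    The 10-20 word range follows the rule of thumb given in this article.
    """
    words = re.findall(r"[a-z0-9'-]+", prompt.lower())
    warnings = []
    if len(words) < min_words:
        warnings.append(f"too short: {len(words)} words (aim for {min_words}-{max_words})")
    elif len(words) > max_words:
        warnings.append(f"too long: {len(words)} words (aim for {min_words}-{max_words})")
    vague = sorted(set(words) & VAGUE_WORDS)
    if vague:
        warnings.append("vague words: " + ", ".join(vague))
    return warnings


print(check_prompt("a nice image about law"))
print(check_prompt(
    "Futuristic writer with AI collaboration amidst image concepts, "
    "set against a cityscape at midday daylight"
))
```

The first call reports a prompt that is too short and contains a vague word, while the second, the one-phrase prompt quoted earlier in this article, passes both checks. Such a check cannot judge whether a prompt is imaginative; it only catches the mechanical mistakes.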

Why write about this subject?

A year ago, while hosting webinars on effective prompting techniques for AI tools like Midjourney and Stable Diffusion, I encountered scepticism from many colleagues. They doubted the necessity of learning these skills, viewing them as irrelevant to their professional needs. Fast forward to today, and the landscape has drastically changed. The digital content sphere is now flooded with articles accompanied by subpar illustrations that fail to enhance the text, and in some cases, diminish its value.

The truth is, not everyone possesses innate artistic talent, but this should not deter us. Effective AI-generated illustrations are within reach for those who invest time in mastering the art of precise and imaginative prompting. It's not just about the technical know-how; it’s about sparking curiosity and a willingness to experiment and learn.

There's no simple remedy for a lack of interest or curiosity. However, embracing these tools can significantly enhance the quality and impact of our digital narratives. Let’s commit to learning and improving our AI prompting skills to ensure our articles are visually compelling and add true value to our written words.

For those interested in elevating their content with AI, start by exploring the potential of creative and contextually appropriate prompts. It’s an investment in your professional growth and the quality of your output in the increasingly visual world of content creation.

In this document, we tried to give hints on how to prompt AI to generate better images for articles, using context, keywords, and modifiers, as well as how to evaluate and refine the generated images, and how to avoid common pitfalls and mistakes. We hope that this guide will help you create more engaging, informative, and attractive articles, using the power and potential of AI.