Prompt Engineering

What is Prompt Engineering?

LLMs, such as OpenAI GTP, Mistral and Claude 3, provide a simple ‘text-in, text-out interface’, allowing users to program AI models by prompts - a combination of instructions and examples to produce the desired results. This practice of authoring effective prompts is called Prompt Engineering.

Write clear and specific instructions

Writing clear and specific instructions is often the most effective way to improve model performance, but it is often the most overlooked. Similar to asynchronous communications with humans, you need to be more specific with the requirements to eliminate ambiguities compared to face-to-face communications. For example, ‘Tell me something about Alan Turning’ is not very specific, while ‘Tell me about Alan Tuning’s contribution to computer science.’ will likely yield the desired results. A few principles can make your prompts clearer, precise, and more specific.

Use delimiters

Use delimiters such as triple quotes, triple backticks, triple dashes, angle brackets, or XML tags to indicate different parts of the prompts.

This can also prevent prompt injection attacks.

Ask for structured output

You can ask the model to produce structured output, such as HTML and JSON, to make the results usable by other tools. Structured output also means the model can work with rules that are inherently easier to follow. For example, an HTML snippet is more straightforward to validate than an ‘engaging article’.

Check whether conditions are satisfied

Asking the model to check the conditions will help prevent the completion attempt that would otherwise cause hallucinations or errors. For example, prompting, ‘If the text does not contain any instructions, output “No instruction provided.”’ will stop the model from making things up when the input text does not contain valid instructions.

Zero-shot prompting

Modern LLMs are trained on a vast amount of data and are often capable of performing complex tasks zero-shot. In this use case, users do not need to provide examples - the models can already perform these tasks through previous training.

Instruction turning, which includes examples to fine-tune the models, can improve zero-shot learning.

Few-shot prompting

For complex tasks in which zero-shot is not sufficient, few-shot training can be used to provide a few examples and in-context training to improve output quality.

Fine-turning

Due to prompts’ token context limits, cramming a lot of examples in a single prompt (few-shot learning) may be impractical and expensive. Instead, we can use fine-tuning to pre-train a model with as many examples as needed to achieve better results. Once a model is fine-turned, you don’t need to include as many examples in your prompts. This lowers costs and latency due to short prompts.

Chain-of-thought (CoT)

Chain-of-thought (CoT) is an effective way to improve LLMs’ ability to reason a complex idea. It gives the model time to think and break down the tasks into smaller steps. When combined with zero-shot and few-shot techniques, CoT allows users to provide examples to guide the AI models in breaking down a complex task into manageable steps and arrive at the correct answers step by step, otherwise beyond the models’ capacity. You can also specify the exact steps in detail or the output structure to aid the model in completing the task.

With many LLMs, adding “Let’s think it step by step” to the end of an existing prompt is enough to make the model think more logically and methodologically.

Update @ 25/02/2025:

Nowadays, most SToA LLMs are trained to use CoT by default. Leading AI companies have introduced a new class of models (reasoning models) that use CoT as their main selling point, such as OpenAI ‘o’ models, Google Gemini 2, and Anthropic Claude 3.7.

Break one big prompt into smaller & simpler prompts

Like Chain of Thoughts (CoT), you can tackle challenging tasks by breaking big prompts into smaller, simple steps. This is how humans handle large and complex tasks: break a task down into actionable chunks.

Providing contextual information

Trained with existing data, models may not always be up-to-date on recent knowledge, E.g., who gave a talk about AI in WDCC 1581? In such situations, LLMs will happily make up something convincing. Although we can instruct models to answer truthfully, ‘I do not know.’, it’s not always reliable as models tend to lie on such occasions. A better approach is to provide extra contextual information to help the model answer the question. When the context is short, we can include the context in the prompt directly when it is short.

When the context is too large to fit into a single prompt, RAG can be used to automate and improve the context generation.

Generated-knowledge prompting

Another approach to improve LLMs’ output accuracy is to provide generated knowledge.

Instead of using embeddings or manually crafted context, we can ask AI models to generate knowledge and then include the generated knowledge in the prompt before the prediction happens. In this way, we can improve the quality of the responses without having to curate context manually.

Give conversational AI models personalities

Assigning identities and personalities to AI models is an effective technique to improve the prompts and output quality when working with LLMs, as it narrows down the AI model to perform in a confined social or situational context and only output text suitable to that desired context.

For example, specifying the agent’s level of knowledge in a specific field and the circumstances in which agents would engage with users. One example would be, ‘You are an experienced SEO specialist with 25 years of experience optimising UK news websites. You always use white-hat SEO techniques to ensure the structure of the websites is excellent for information discovery.

Developing prompt iteratively

Getting a prompt to do precisely what you want the first time is rare. Employing a logical process to develop a prompt iteratively is the key to achieving more with LLMs. There is no such thing as The 30 Perfect Prompts for XXX you may see flooding the Internet.

If a prompt does not work as expected, try the techniques listed in this note with each iteration. E.g., if the output is too long, give clear instructions such as ‘Use at most 50 words.’; if the model has difficulty getting the tone of voice right, provide examples or fine-turn the model.

Use LLMs to generate prompts

Numerous studies have demonstrated (Zhou et al., 2023) that LLMs excel as prompt engineers. They can craft human-level prompts that are specific, comprehensive, and highly effective. In addition, LLMs are proven to be good students who take human feedback and can improve their prompt-generating skills quickly. This is especially useful for agentic workflows that rely on long-term memory to improve their output quality. In this scenario, new information can be used to train the LLMs to keep improving the application LLM instructions, creating ‘self-improving’ applications.

References

DAIR.AI (2023). Prompting Techniques – Nextra. [online] Available at: https://www.promptingguide.ai/techniques [Accessed 28 Mar. 2023].
Zhou, Y., Andrei Ioan Muresanu, Han, Z., Paster, K., Silviu Pitis, Chan, H. and Ba, J. (2023). Large Language Models are Human-Level Prompt Engineers. [online] OpenReview. Available at: https://openreview.net/forum?id=92gvk82DE-.

Liwen's Notes

Explorer

Recent Notes

How to Read Effectively

LangChain

Mixture of Expert (MoE)

The `NextRequest` and `NextResponse` Objects

LLM Explainability & Interpretability

Accountability & Motivation

Reasoning Models

Gen AI: The Future Is Agentic