
Differences in prompt formats based on LLM models.

Let's identify the differences in prompt formats based on LLM models.


In this post, I present the results of experiments conducted with several models published on Hugging Face, using ChatBrainAI().
General prompts (PromptTemplate) have a different format for each model, so they need to be managed per model. I will compare the prompt formats of three models.
Conversational prompts (ChatPromptTemplate) are organized by roles (system, user, assistant), so the prompt format looks similar across most models.
To conclude, the models respond well with PromptTemplate, but the results with ChatPromptTemplate are less favorable: non-conversational LLMs do not seem to respond effectively. I plan to update this post as I gain more experience.

Selecting Prompt Format Based on LLM Model

Each model has its own template format, which needs to be recognized and handled accordingly. The prompt differences across the three models are shown below. As mentioned earlier, I plan to focus primarily on ChatOllama(), which I will cover in the next post. The LangChain setup is the same as before, so I omit it here.

# gemma-2-2b-it: Gemma turn format using <start_of_turn>/<end_of_turn> markers.
llm = ChatBrainAI('gemma-2-2b-it')
template = """<start_of_turn>system
You are a friendly AI assistant. Your name is DS2Man. Please answer questions briefly.
<end_of_turn>
<start_of_turn>user
{question}
<end_of_turn>
<start_of_turn>model
"""
suffix = "<start_of_turn>model"

# Llama-3.2-3B-Instruct: Llama 3 header format using <|start_header_id|>/<|eot_id|> tokens.
llm = ChatBrainAI('Llama-3.2-3B-Instruct')
template = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a friendly AI assistant. Your name is DS2Man. Please answer questions briefly.
<|eot_id|><|start_header_id|>user<|end_header_id|>{question}
<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
suffix = "<|end_header_id|>"

# EEVE-Korean-Instruct-2.8B-v1.0: plain Human/Assistant format without special tokens.
llm = ChatBrainAI('EEVE-Korean-Instruct-2.8B-v1.0')
template = """You are a friendly AI assistant. Your name is DS2Man. Please answer questions briefly.
Human: {question}
Assistant:
"""
suffix = "Assistant:"
from ai.local.langchain_brainai import stream_response, invoke_response
from langchain_core.prompts import PromptTemplate

# Wrap the model-specific template string defined above.
prompt = PromptTemplate.from_template(
    template
)

chain = prompt | llm

question = "What is the capital of the United States?"
response = chain.stream({"question": question})
stream_response(response)

# response = chain.invoke({"question": question})
# invoke_response(response, suffix)
The capital of the United States is Washington, D.C.
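
For reference, the helpers stream_response() and invoke_response() come from the author's ai.local.langchain_brainai module. The sketch below is only an assumption about what they do, included to illustrate the role of the suffix argument (trimming everything before the generation marker when a local model echoes the prompt).

# Hypothetical sketch of the helper functions; the real implementations live in
# ai.local.langchain_brainai and may differ.
def stream_response(response):
    # Print streamed chunks as they arrive.
    for chunk in response:
        text = chunk.content if hasattr(chunk, "content") else str(chunk)
        print(text, end="", flush=True)
    print()

def invoke_response(response, suffix=""):
    # Assumption: keep only the text after the last occurrence of the
    # generation marker (suffix), since local models often echo the prompt.
    text = response.content if hasattr(response, "content") else str(response)
    if suffix and suffix in text:
        text = text.split(suffix)[-1]
    print(text.strip())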

Understanding ChatPromptTemplate

ChatPromptTemplate is a template used primarily for chat-based conversational models, with roles categorized as system, user, and assistant. Conversation-style prompts are mainly built with ChatPromptTemplate.from_messages(). As noted above, while the models respond well to PromptTemplate, the results with ChatPromptTemplate are not as favorable; non-conversational LLMs do not seem to respond effectively.

| Model | Response | Notes |
| --- | --- | --- |
| gemma-2-2b-it | o | - |
| Llama-3.2-3B-Instruct | x | Repeatedly asks the same question internally. |
| EEVE-Korean-Instruct-2.8B-v1.0 | x | A lot of unnecessary answers are generated. |
from ai.local.langchain_brainai import ChatBrainAI, stream_response, invoke_response
from langchain_core.prompts import ChatPromptTemplate

# llm is created the same way as above, e.g. llm = ChatBrainAI('gemma-2-2b-it')
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a friendly AI assistant. Your name is DS2Man. Please answer questions briefly."),
        ("human", "{question}\nAnswer:")
    ]
)

chain = prompt | llm

question = "What is the capital of the United States?"
response = chain.stream({"question": question})
stream_response(response)

# response = chain.invoke({"question": question})
# invoke_response(response, "Answer:")
The capital of the United States is Washington, D.C.
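
To see what actually reaches the model, you can render the chat prompt yourself. This small check is not in the original post; it only shows that ChatPromptTemplate produces role-tagged message objects, which still have to be serialized into each model's own chat format, which is likely why models expecting a specific prompt string respond poorly.

# Inspect the messages that ChatPromptTemplate produces before they reach the model.
for message in prompt.format_messages(question="What is the capital of the United States?"):
    print(type(message).__name__, ":", message.content)
# Expected output (roughly):
# SystemMessage : You are a friendly AI assistant. Your name is DS2Man. Please answer questions briefly.
# HumanMessage : What is the capital of the United States?
#                Answer: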