Comparison Between Models.

Let's compare the models registered in Ollama.

I plan to build all of my projects with ChatOllama(), so it is useful to note the VRAM usage of the models registered in Ollama. This post keeps a running record of the VRAM usage I observe with different models. It will be updated continuously, so please check back.
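One possible way to take such readings is sketched below. It assumes an NVIDIA GPU with nvidia-smi on the PATH, which this post does not state, and the helper names are hypothetical. It asks nvidia-smi for the memory currently in use, so a reading taken before and after loading a model gives a rough idea of that model's footprint.

```python
import subprocess


def parse_memory_used(csv_text):
    """Parse nvidia-smi CSV output: one integer (MiB) per GPU, one per line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def gpu_memory_used_mib():
    """Return the memory in use on each GPU, in MiB (assumes nvidia-smi is installed)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)


def mib_to_gib(mib):
    """Convert MiB to GiB, rounded to one decimal place."""
    return round(mib / 1024, 1)
```

Calling gpu_memory_used_mib() once with no model loaded and once after a first request to the model, then subtracting, gives an approximate per-model figure. Other processes using the GPU will distort it.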

Comparison Between Models

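| Attribute | Model | Param | Size (GB) | VRAM Usage (GB) | agent tools | vision | thinking | architecture | quantization |
|---|---|---|---|---|---|---|---|---|---|
| General | Llama-3.2-3B-Instruct | - | - | 9.6 | - | - | - | - | - |
| General | gemma-2-2b-it | - | - | 9.8 | - | - | - | - | - |
| General | polyglot-ko-1.3b | - | - | 6.1 | - | - | - | - | - |
| Ollama | llama3.1:latest | 8b | 4.9 | 6.0 | | | | llama | Q4_K_M |
| Ollama | llama3.2:latest | 3b | 2.0 | 3.2 | | | | llama | Q4_K_M |
| Ollama | llama3.2-vision:latest | 11b | 6.0 | 10.0 | | | | llama | Q4_K_M |
| Ollama | gemma2:latest | 9b | 5.4 | 8.4 | - | - | - | gemma2 | Q4_0 |
| Ollama | gemma3:latest | 4b | 3.3 | 4.5 | | | | gemma3 | Q4_K_M |
| Ollama | PetrosStav/gemma3-tools:12b | 12b | 8.1 | 7.0 | | | | gemma3 | Q4_K_M |
| Ollama | | 12b | 8.1 | 7.0 | | | | gemma3 | Q4_K_M |
| Ollama | deepseek-r1:14b | 14b | 9.0 | 8.9 | | | | qwen2 | Q4_K_M |
| Ollama | deepseek-r1:7b | 7b | 4.7 | 5.2 | | | | qwen2 | Q4_K_M |
| Ollama | qwen3:latest | 8b | 5.2 | 6.4 | | | | qwen3 | Q4_K_M |
| Ollama | qwen3:14b | 14b | 9.3 | 10 | | | | qwen3 | Q4_K_M |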
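To make the gap between on-disk size and measured VRAM easier to see, here is a small sketch that computes the difference for the named Ollama rows. The numbers are copied from the table as I read it; the function name is hypothetical.

```python
# (model, size on disk in GB, measured VRAM in GB), as read from the table above
ROWS = [
    ("llama3.1:latest", 4.9, 6.0),
    ("llama3.2:latest", 2.0, 3.2),
    ("llama3.2-vision:latest", 6.0, 10.0),
    ("gemma2:latest", 5.4, 8.4),
    ("gemma3:latest", 3.3, 4.5),
    ("PetrosStav/gemma3-tools:12b", 8.1, 7.0),
    ("deepseek-r1:14b", 9.0, 8.9),
    ("deepseek-r1:7b", 4.7, 5.2),
    ("qwen3:latest", 5.2, 6.4),
    ("qwen3:14b", 9.3, 10.0),
]


def vram_overhead(size_gb, vram_gb):
    """VRAM used beyond the model file size (negative if VRAM is smaller)."""
    return round(vram_gb - size_gb, 1)


for model, size_gb, vram_gb in ROWS:
    overhead = vram_overhead(size_gb, vram_gb)
    print(f"{model:<28} size={size_gb:>4} GB  vram={vram_gb:>4} GB  overhead={overhead:+.1f} GB")
```

A negative overhead, where VRAM is below the file size, may mean that part of the model was not resident on the GPU when the reading was taken. That is a guess, not something the table confirms.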

You can check key information about a model with the ollama show command. The quantization details are especially important.
As the use of Large Language Models (LLMs) grows rapidly, model compression and optimization have become critical. In environments with limited compute resources, such as edge devices, or in services that require real-time responses, it is often not feasible to run large models as-is. In such cases, quantization can significantly reduce both inference latency and memory usage while preserving as much model accuracy as possible. I'll update this post with further insights after trying it out myself.
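To make the memory saving concrete, here is a deliberately simplified block-wise 4-bit quantizer in pure Python. It is not the Q4_K_M or Q4_0 algorithm that Ollama's models actually use. The block size of 32 and the single 16-bit scale per block are assumptions chosen for illustration, and all names are hypothetical.

```python
import random

BLOCK_SIZE = 32   # assumed number of weights that share one scale
SCALE_BITS = 16   # assumed storage for the per-block scale


def quantize_block(weights):
    """Map floats to signed 4-bit codes in [-7, 7] that share one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the 4-bit codes."""
    return [scale * c for c in codes]


def bits_per_weight(block_size=BLOCK_SIZE, scale_bits=SCALE_BITS):
    """4 bits per code plus the scale's bits amortized over the block."""
    return 4 + scale_bits / block_size


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(BLOCK_SIZE)]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"bits per weight: {bits_per_weight()} (compared with 16 for fp16)")
print(f"max abs error in this block: {max_error:.5f} (scale = {scale:.5f})")
```

Under these assumptions each weight costs 4.5 bits instead of 16, and the rounding error per weight is at most half the block's scale. That is the accuracy cost being traded for memory.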

PS C:\Users\ycjang> ollama show gemma3:12b
  Model
    architecture        gemma3
    parameters          12.2B
    context length      8192
    embedding length    3840
    quantization        Q4_K_M

  Parameters
    stop           "<end_of_turn>"
    temperature    0.1

  License
    Gemma Terms of Use
    Last modified: February 21, 2024

PS C:\Users\ycjang>
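To record these fields for many models, the output above could be parsed with a short script. The sketch below assumes the layout shown here, with section headers indented by at most two spaces and key/value pairs separated by two or more spaces. Other Ollama versions may print a different layout, and the function name is hypothetical.

```python
import re

# Excerpt of the `ollama show gemma3:12b` output shown above
SAMPLE = """\
  Model
    architecture        gemma3
    parameters          12.2B
    context length      8192
    embedding length    3840
    quantization        Q4_K_M

  Parameters
    stop           "<end_of_turn>"
    temperature    0.1
"""


def parse_ollama_show(text):
    """Parse sectioned `ollama show` output into {section: {key: value}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent <= 2:
            # Shallow indentation is treated as a section header (assumption)
            current = sections.setdefault(stripped, {})
        elif current is not None:
            # Keys may contain single spaces, so split on runs of 2+ spaces
            parts = re.split(r"\s{2,}", stripped, maxsplit=1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return sections


info = parse_ollama_show(SAMPLE)
print(info["Model"]["architecture"], info["Model"]["parameters"], info["Model"]["quantization"])
```

The same function could be fed the captured output of the ollama show command for each model. Only the command output should be passed in, not the shell prompt lines.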
This post is licensed under CC BY 4.0 by the author.