Post

Understanding `Hit` and `Hits` in PyMilvus

Let's Understand `Hit` and `Hits` in PyMilvus Search Results

Understanding `Hit` and `Hits` in PyMilvus

When using Milvus through the Python client pymilvus, one of the most common operations you’ll perform is a vector similarity search. After issuing a search query, the results are returned as Hit and Hits objects.

In this post, we’ll clarify what pymilvus.client.search_result.Hit and Hits are, how to work with them, and what developers should keep in mind.

What Are Hit and Hits?

In PyMilvus, when you perform a search() operation, the result is not just a plain list of dictionaries or raw data. Instead, PyMilvus wraps the results in structured classes to help you easily navigate the data.

Hit: A Single Search Result

A Hit represents a single matched entity in the result of a vector search. It contains:

  • id: The primary key (or vector ID) of the matched entity.
  • distance: The similarity score between the query vector and this result.
  • entity: The actual data (if you’ve enabled output fields like output_fields=["field_name"] in your query).
1
2
3
from pymilvus.client.search_result import Hit

print(f"ID: {hit.id}, Distance: {hit.distance}, Entity: {hit.entity}")

Hits: A Collection of Search Results

A Hits object is a list-like container that holds all Hit objects corresponding to one query vector. If you search with multiple query vectors, the result will be a list of Hits - one for each query vector:

1
2
3
4
5
6
results = collection.search(...)
# results: List[Hits]

for hits in results:  # Each `hits` corresponds to one query vector
    for hit in hits:  # Each `hit` is an individual Hit object
        print(hit.id, hit.distance)

Checking the type

You may want to normalize the result shape:

1
2
3
4
5
6
from pymilvus.client.search_result import Hit, Hits

if isinstance(result, Hit):
    result = [result]
elif isinstance(result, Hits):
    result = list(result)

This ensures result is always a list of Hit objects - which is easier to process consistently.

Example

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
PROCESSMAP_COSINE_REF = float(os.environ["MILVUS_RADIUS"])
LIMIT_CNT = 3
vector_res = client.search(
    collection_name=COLLECTION_NAME,
    data=vector_embedding, # Basic ANN Search
    output_fields=fields,
    limit=LIMIT_CNT,
)
result = vector_res[0] if vector_res[0] else []

from pymilvus.client.search_result import Hit, Hits
if isinstance(result, Hit):
    result = [result]
elif isinstance(result, Hits):
    result = list(result) 

rtn_msg = ""
for idx, item in enumerate(result):
    rtn_msg += textwrap.dedent(f"""
        -------
        News: {item['entity']["News"]}
        Similarity: {(item['distance']*100):0.2f}%
        Topic: {item['entity']["Topic"]}
        Date: {item['entity']["Date"]}
        NewsPaper: {item['entity']["NewsPaper"]}            
        URL: {item['entity']["metadata"]["url"]}
    """)
print(rtn_msg)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-------
News: Jannik Sinner (World No. 1, Italy) clinched the men's singles title at the US Open tennis tournament.
Similarity: 67.33%
Topic: US Open Victory
Date: 20240910
NewsPaper: Yonhap News            
URL: localhost:1234

-------
News: Jannik Sinner (World No. 1, Italy) clinched the men's singles title at the US Open tennis tournament.
Similarity: 67.33%
Topic: 2024 US Open Victory
Date: 20240909
NewsPaper: MBC            
URL: localhost:1234

-------
News: Jannik Sinner dominated men's tennis in 2024, securing the ATP Finals title.
Similarity: 51.68%
Topic: ATP Finals Victory
Date: 20241113
NewsPaper: Chosun Ilbo            
URL: localhost:1234
This post is licensed under CC BY 4.0 by the author.