Understanding `Hit` and `Hits` in PyMilvus
Let's Understand `Hit` and `Hits` in PyMilvus Search Results
When using Milvus through the Python client pymilvus
, one of the most common operations you’ll perform is a vector similarity search. After issuing a search query, the results are returned as Hit
and Hits
objects.
In this post, we’ll clarify what pymilvus.client.search_result.Hit
and Hits
are, how to work with them, and what developers should keep in mind.
What Are Hit and Hits?
In PyMilvus, when you perform a search()
operation, the result is not just a plain list of dictionaries or raw data. Instead, PyMilvus wraps the results in structured classes to help you easily navigate the data.
Hit: A Single Search Result
A Hit
represents a single matched entity in the result of a vector search. It contains:
id
: The primary key (or vector ID) of the matched entity.distance
: The similarity score between the query vector and this result.entity
: The actual data (if you’ve enabled output fields likeoutput_fields=["field_name"]
in your query).
1
2
3
from pymilvus.client.search_result import Hit
print(f"ID: {hit.id}, Distance: {hit.distance}, Entity: {hit.entity}")
Hits: A Collection of Search Results
A Hits
object is a list-like container that holds all Hit
objects corresponding to one query vector. If you search with multiple query vectors, the result will be a list of Hits
- one for each query vector:
1
2
3
4
5
6
results = collection.search(...)
# results: List[Hits]
for hits in results: # Each `hits` corresponds to one query vector
for hit in hits: # Each `hit` is an individual Hit object
print(hit.id, hit.distance)
Checking the type
You may want to normalize the result shape:
1
2
3
4
5
6
from pymilvus.client.search_result import Hit, Hits
if isinstance(result, Hit):
result = [result]
elif isinstance(result, Hits):
result = list(result)
This ensures result
is always a list of Hit
objects - which is easier to process consistently.
Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
PROCESSMAP_COSINE_REF = float(os.environ["MILVUS_RADIUS"])
LIMIT_CNT = 3
vector_res = client.search(
collection_name=COLLECTION_NAME,
data=vector_embedding, # Basic ANN Search
output_fields=fields,
limit=LIMIT_CNT,
)
result = vector_res[0] if vector_res[0] else []
from pymilvus.client.search_result import Hit, Hits
if isinstance(result, Hit):
result = [result]
elif isinstance(result, Hits):
result = list(result)
rtn_msg = ""
for idx, item in enumerate(result):
rtn_msg += textwrap.dedent(f"""
-------
News: {item['entity']["News"]}
Similarity: {(item['distance']*100):0.2f}%
Topic: {item['entity']["Topic"]}
Date: {item['entity']["Date"]}
NewsPaper: {item['entity']["NewsPaper"]}
URL: {item['entity']["metadata"]["url"]}
""")
print(rtn_msg)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-------
News: Jannik Sinner (World No. 1, Italy) clinched the men's singles title at the US Open tennis tournament.
Similarity: 67.33%
Topic: US Open Victory
Date: 20240910
NewsPaper: Yonhap News
URL: localhost:1234
-------
News: Jannik Sinner (World No. 1, Italy) clinched the men's singles title at the US Open tennis tournament.
Similarity: 67.33%
Topic: 2024 US Open Victory
Date: 20240909
NewsPaper: MBC
URL: localhost:1234
-------
News: Jannik Sinner dominated men's tennis in 2024, securing the ATP Finals title.
Similarity: 51.68%
Topic: ATP Finals Victory
Date: 20241113
NewsPaper: Chosun Ilbo
URL: localhost:1234