@vishal-bala vishal-bala commented Dec 5, 2025

This PR adds an interface for using the new FT.HYBRID search operation introduced in Redis Open Source 8.4.0.

Changes

redisvl.query.hybrid

This new module defines query constructors for working with the new hybrid search operation. Importing this module without redis-py>=7.1.0 raises an ImportError. It defines a HybridQuery object, which describes a hybrid search operation in a format largely similar to what AggregateHybridQuery provides.

Under the hood, redis-py expects three separate configuration objects to run a hybrid search: one for the text and vector searches, one for the score combination, and one for post-processing of the results (e.g. aggregations, loading, limiting). The HybridQuery class manages the definition of all three.

# old
from redisvl.query.aggregate import AggregateHybridQuery

query = AggregateHybridQuery(
    text="medical professional",
    text_field_name="description",
    vector=[0.1, 0.1, 0.5, ...],
    vector_field_name="user_embedding",
    alpha=0.7,  # LINEAR only
    return_fields=["user", "job"]
)

results = index.query(query)

# new
from redisvl.query.hybrid import HybridQuery

query = HybridQuery(
    text="medical professional",
    text_field_name="description",
    vector=[0.1, 0.1, 0.5, ...],
    vector_field_name="user_embedding",
    combination_method="LINEAR",
    linear_alpha=0.3,  # NOTE: The linear combination is defined inconsistently between the two approaches
    return_fields=["user", "job"]
)

results = index.hybrid_search(query)
results = await async_index.hybrid_search(query)
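
The NOTE in the new example can be made concrete. Judging only from the two snippets above (alpha=0.7 in the old query, linear_alpha=0.3 in the new one), the old alpha appears to weight the vector score while linear_alpha weights the text score. A minimal sketch of the conversion under that assumption; the helper name is hypothetical and not part of redisvl:

```python
from typing import Tuple


def aggregate_alpha_to_linear_weights(alpha: float) -> Tuple[float, float]:
    """Convert an AggregateHybridQuery-style alpha into (linear_alpha, linear_beta).

    Assumption (inferred from the example above, not confirmed by the PR):
    the old alpha is the weight of the vector score, while linear_alpha is
    the weight of the text score and linear_beta the weight of the vector score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return 1.0 - alpha, alpha


# Old-style alpha=0.7 maps to a text weight of roughly 0.3 and a vector weight of 0.7.
linear_alpha, linear_beta = aggregate_alpha_to_linear_weights(0.7)
```

If the assumption holds, migrating an existing AggregateHybridQuery means flipping the weight rather than copying it across.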

SearchIndex method

For redis-py>=7.1.0, the SearchIndex and AsyncSearchIndex classes define a hybrid_search method that takes a HybridQuery instance as an argument. For redis-py<7.1.0 the method raises a NotImplementedError. The method returns a list of retrieved dictionaries.
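
The version gate described above could look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: GatedIndex, parse_version, and MIN_REDIS_PY_FOR_HYBRID are hypothetical names, and the actual FT.HYBRID call is left out.

```python
import re
from typing import Any, Dict, List, Tuple

# Minimum redis-py version stated in the PR description.
MIN_REDIS_PY_FOR_HYBRID = (7, 1, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse up to three leading numeric components, e.g. '7.1.0rc1' -> (7, 1, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '7.1'
    return tuple(parts)


class GatedIndex:
    """Hypothetical stand-in for SearchIndex; only the version gate is modeled."""

    def __init__(self, redis_py_version: str) -> None:
        self._redis_py_version = parse_version(redis_py_version)

    def hybrid_search(self, query: Any, **kwargs: Any) -> List[Dict[str, Any]]:
        if self._redis_py_version < MIN_REDIS_PY_FOR_HYBRID:
            raise NotImplementedError("Hybrid search requires redis-py>=7.1.0")
        # A real implementation would execute the hybrid query here.
        return []
```

Comparing version tuples rather than strings avoids ordering mistakes such as "7.10.0" sorting before "7.2.0".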

Additional Changes

  • Fixed a typo in a test for AggregateHybridQuery
  • redis-py 7.0.0 introduced a change to the type of Query._fields, changing it from a tuple to a list - a test had to be updated to differentiate the expectation based on the redis-py version.
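
For the Query._fields change, a version-dependent expectation in a test might be written along these lines. The helper name and the way the version tuple is obtained are assumptions for illustration; the actual test in the PR may differ.

```python
from typing import List, Sequence, Tuple, Union


def expected_fields(
    redis_py_version: Tuple[int, int, int], fields: Sequence[str]
) -> Union[List[str], Tuple[str, ...]]:
    """Return the fields in the container type the test should expect.

    Per the PR description, Query._fields is a list from redis-py 7.0.0
    onward and a tuple before that.
    """
    if redis_py_version >= (7, 0, 0):
        return list(fields)
    return tuple(fields)


# In a test this would be compared against query._fields, for example:
#   assert query._fields == expected_fields(installed_version, ["user", "job"])
```

The distinction matters because a Python list never compares equal to a tuple, even with identical elements.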

@vishal-bala vishal-bala self-assigned this Dec 5, 2025

@nkanu17 nkanu17 left a comment

You may have forgotten to export HybridQuery from redisvl.query.hybrid in the redisvl.query __init__.py

from redisvl.query.hybrid import HybridQuery  # this works
from redisvl.query import HybridQuery  # this will fail

This is because there are now two different HybridQuery classes:

  • redisvl.query.aggregate.HybridQuery (old wrapper around AggregateHybridQuery)
  • redisvl.query.hybrid.HybridQuery (new one)

This might cause some confusion, so we may need to either choose a different name or find a better solution and deprecate the older one. One option is to rename the older class.
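
One possible way to phase out the older name is a thin wrapper that emits a DeprecationWarning. This is only a sketch of that option: the classes below are simplified stand-ins, and whether the project takes this route is not settled in this thread.

```python
import warnings
from typing import Any


class AggregateHybridQuery:
    """Simplified stand-in for the existing aggregation-based hybrid query."""

    def __init__(self, text: str) -> None:
        self.text = text


class HybridQuery(AggregateHybridQuery):
    """Old name kept as a wrapper that warns whenever it is instantiated."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        warnings.warn(
            "redisvl.query.aggregate.HybridQuery is deprecated; "
            "use AggregateHybridQuery instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        super().__init__(*args, **kwargs)
```

Existing code that constructs the old class keeps working, while users see a warning that points at their own call site.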

@nkanu17 nkanu17 left a comment

Looks great overall!
We just need to verify how to handle the deprecated HybridQuery class.
Don't forget to update the API/Sphinx documentation and add a guide in ai-resources.

@vishal-bala vishal-bala marked this pull request as ready for review December 9, 2025 12:58
except Exception as e:
    raise RedisSearchError(f"Unexpected error while searching: {str(e)}") from e

def hybrid_search(self, query: Any, **kwargs) -> List[Dict[str, Any]]:

Working with this a bit more now, I'm not sure that I love having a separate method for hybrid queries. I would kind of like to just pass a HybridQuery to the standard .query method.

Agreed. We should not need to break the SearchIndex interface.

@justin-cechmanek justin-cechmanek left a comment

Lots of great work here! Let's work together to see how we can move this closer to our current query patterns.

except Exception as e:
    raise RedisSearchError(f"Unexpected error while searching: {str(e)}") from e

def hybrid_search(self, query: Any, **kwargs) -> List[Dict[str, Any]]:

We have the query() method that accepts all our different query object types. Let's use that as the way to run all our queries. We can do a query type check in query(), like we do now, and change this to a private method that query() calls.
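
A synchronous sketch of that dispatch, using empty stand-in classes so it runs on its own. The class and method names mirror the ones mentioned in this thread, but the bodies are placeholders rather than the real SearchIndex implementation.

```python
from typing import Any, Dict, List


class BaseQuery:
    """Stand-in for the standard query types."""


class AggregationQuery:
    """Stand-in for aggregation-based queries."""


class HybridQuery:
    """Stand-in for the new hybrid query type."""


class DispatchingIndex:
    """Hypothetical index whose public query() routes by query type."""

    def query(self, query: Any) -> List[Dict[str, Any]]:
        if isinstance(query, AggregationQuery):
            return self._aggregate(query)
        if isinstance(query, HybridQuery):
            return self._hybrid_search(query)
        return self._query(query)

    # Private handlers; each returns a marker so the routing is observable.
    def _aggregate(self, query: AggregationQuery) -> List[Dict[str, Any]]:
        return [{"handler": "aggregate"}]

    def _hybrid_search(self, query: HybridQuery) -> List[Dict[str, Any]]:
        return [{"handler": "hybrid_search"}]

    def _query(self, query: Any) -> List[Dict[str, Any]]:
        return [{"handler": "query"}]
```

With this shape, callers only ever see query(), and supporting a new query type means adding one branch and one private handler.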

except Exception as e:
    raise RedisSearchError(f"Unexpected error while searching: {str(e)}") from e

async def hybrid_search(self, query: Any, **kwargs) -> List[Dict[str, Any]]:

Same change here. Let's instead expand our async query() method to something like:

async def query(self, query: Union[BaseQuery, AggregationQuery, HybridQuery]) -> List[Dict[str, Any]]:
    if isinstance(query, AggregationQuery):
        return await self._aggregate(query)
    elif isinstance(query, HybridQuery):
        return await self._hybrid_search(query)
    else:
        return await self._query(query)

    )
except ImportError:
    raise ImportError(
        "Hybrid queries require Redis Open Source >= 8.4.0 and redis-py>=7.1.0"

Suggested change
- "Hybrid queries require Redis Open Source >= 8.4.0 and redis-py>=7.1.0"
+ "Hybrid queries require Redis >= 8.4.0 and redis-py>=7.1.0"

This works on enterprise Redis too :)

text_filter_expression: Optional[Union[str, FilterExpression]] = None,
yield_text_score_as: Optional[str] = None,
vector_search_method: Optional[Literal["KNN", "RANGE"]] = None,
knn_k: Optional[int] = None,

Our other query objects, including RangeQuery and VectorRangeQuery, have the num_results parameter. We can add consistency by using the same parameter here.

range_radius: Optional[int] = None,
range_epsilon: Optional[float] = None,
yield_vsim_score_as: Optional[str] = None,
vector_filter_expression: Optional[Union[str, FilterExpression]] = None,

Having both a text_filter_expression and a vector_filter_expression feels redundant. Can we change this to use only one filter_expression parameter, like our other Query classes?

Comment on lines +81 to +82
linear_alpha: Optional[float] = None,
linear_beta: Optional[float] = None,

Having both an alpha and a beta parameter looks redundant, and from a user's point of view it's not clear how they would interact. Feels like a footgun.

Comment on lines +303 to +304
linear_alpha: Optional[float] = None,
linear_beta: Optional[float] = None,

Similar to my previous comment, can we reduce this to one parameter only?

Comment on lines +341 to +342
text_filter_expression=filter_expression,
vector_filter_expression=filter_expression,

Mentioned earlier, but it's not clear to a user how or why to pass two filter expressions.

Comment on lines +790 to +816
def test_hybrid_query_with_both_text_and_vector_filters():
    """Test HybridQuery with both text_filter_expression and vector_filter_expression."""
    text_filter = Tag("category") == "movies"
    vector_filter = Tag("genre") == "comedy"

    hybrid_query = HybridQuery(
        text=sample_text,
        text_field_name="description",
        vector=sample_vector,
        vector_field_name="embedding",
        text_filter_expression=text_filter,
        vector_filter_expression=vector_filter,
    )

    # Verify both filters are in the query
    args = get_query_pieces(hybrid_query)
    assert args == [
        "SEARCH",
        "(~@description:(toon | squad | play | basketball | gang | aliens) AND @category:{movies})",
        "SCORER",
        "BM25STD",
        "VSIM",
        "@embedding",
        bytes_vector,
        "FILTER",
        "@genre:{comedy}",
    ]

Same comments about having two filter_expressions
