Skip to content

feat(search): Add indexes joining #5604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

BagritsevichStepan
Copy link
Contributor

@BagritsevichStepan BagritsevichStepan commented Aug 2, 2025

Adds indexes joining based on hash algorithm. Later we will integrate this into the FT.AGGREGATE command.

This PR includes #5592 and #5603 . So, real number of lines is not so big 😄

dranikpg
dranikpg previously approved these changes Aug 2, 2025
Copy link
Contributor

@dranikpg dranikpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think by adding index and field names to the game you made your implementation a lot harder. Just by using some preprocessing (not on the data, but on the meta, read O(1)) you can you can simplify your code and get rid of many (costly) conversions. You can also simplify your containers - suddenly there was no need for all the custom wrappers.

I'll approve and its for you to decide if to go ahead if the strings - I'd personally not

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
@BagritsevichStepan
Copy link
Contributor Author

In the previous PR, I implemented it using indexes, but then switched to strings, thinking it would simplify the logic. However that introduced unnecessary overhead. I have reverted it back, so you can review the changes. MapsList is no longer needed, at least for the join logic.

Also, instead of relying on key names, we decided to use a pair of (shard_id, doc_id). From this information, the key name can be derived when needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants