-
Notifications
You must be signed in to change notification settings - Fork 991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FT.SEARCH gives incomplete result when used with EVAL (Silent error, potentially hard to spot, but hopefully easy to fix) #3706
Comments
@BagritsevichStepan Have you already found the reason why the FT.SEARCH call result is so unexpectedly different depending on whether it is called from lua or sent as direct command? Many thanks for your efforts |
Hi @BagritsevichStepan and @romange, Looking through the code (didn't set it up in IDE to test it yet) I found that in doc_index.cc the search function uses the correct response class "SearchResult" . doc_index.cc (Click here to expand relevant code part of doc_index.cc)SearchResult ShardDocIndex::Search(const OpArgs& op_args, const SearchParams& params,
search::SearchAlgorithm* search_algo) const {
auto& db_slice = op_args.GetDbSlice();
auto search_results = search_algo->Search(&*indices_, params.limit_offset + params.limit_total);
if (!search_results.error.empty())
return SearchResult{facade::ErrorReply{std::move(search_results.error)}};
SearchFieldsList fields_to_load =
ToSV(params.ShouldReturnAllFields() ? params.load_fields : params.return_fields);
vector<SerializedSearchDoc> out;
out.reserve(search_results.ids.size());
size_t expired_count = 0;
for (size_t i = 0; i < search_results.ids.size(); i++) {
auto key = key_index_.Get(search_results.ids[i]);
auto it = db_slice.FindReadOnly(op_args.db_cntx, key, base_->GetObjCode());
if (!it || !IsValid(*it)) { // Item must have expired
expired_count++;
continue;
}
auto accessor = GetAccessor(op_args.db_cntx, (*it)->second);
SearchDocData doc_data;
if (params.ShouldReturnAllFields()) {
/*
In this case we need to load the whole document or loaded fields.
For JSON indexes it would be {"$", <the whole document as string>}
*/
doc_data = accessor->SerializeDocument(base_->schema);
SearchDocData loaded_fields = accessor->Serialize(base_->schema, fields_to_load);
doc_data.insert(std::make_move_iterator(loaded_fields.begin()),
std::make_move_iterator(loaded_fields.end()));
} else {
/* Load only specific fields */
doc_data = accessor->Serialize(base_->schema, fields_to_load);
}
auto score = search_results.scores.empty() ? monostate{} : std::move(search_results.scores[i]);
out.push_back(SerializedSearchDoc{string{key}, std::move(doc_data), std::move(score)});
}
return SearchResult{search_results.total - expired_count, std::move(out),
std::move(search_results.profile)};
} This suggests the problem is likely in the data layer between the FtSearch implementation and lua. The Problem suggests => FT.SEARCH is called correctly:When we execute
However, when we run the command through Lua, we're not getting this structure:
This discrepancy suggests that:
Suggested Changes => Fix for data layer RedisTranslator:Assuming the class RedisTranslator is the responsible layer to make the data correctly accessible from lua, I think this might be where it can be fixed. 1. Handling the Total Count:
Explanation: This change ensures that when the FT.SEARCH command returns, the total number of documents (which is half the length of the result array due to key-value pairs) is set as the first element of the result table in Lua. This matches the expected format where the count is at the start. 2. Preserving the Structure:
This COULD be be the root cause: Disclaimer: I haven't tested the code and didn't set it up for local debugging yet. Please let me know if my assumptions are correct or let me know a more likely reason for the output difference. Thanks a lot |
Hi @CodeToFreedom, thank you for reporting this. I was unable to reproduce the error. Does this error occur consistently? |
Hi @BagritsevichStepan, DF Cloud ->Wrong behaviour-On a fresh test df cloud instance the error (wrong output) occurs consistently. DF local docker instance ->Wrong behaviourOn a fresh local docker instance (using the currently newest docker image 1.24.0) the error occurs consistently. Local Dragonfly Source Build->CORRECT BEHAVIOUR (!!)-When building from source and running it locally (newest version 1.25.0) it works correctly. DB commands to reproduce:
Direct FT.SEARCH command works correctly (see details above): Check the same via lua and it shows the bug:
ConclusionThis leads to the conclusion that either something was fixed in v.1.25.0 or the internal state is handled differently on the local source. When docker v1.25.0 is released and/or deployed to DF cloud instances we'll know for sure whether it got resolved. Btw, is the source build using a debug mode despite using
Please let me know if you continue to have problems reproducing the error and I send a video. (Hopefully the bug doesn't occur anymore in all environments when all environments are testable with 1.25.0) |
I ran version 1.24 locally and caught a crash. With version 1.25, everything works fine. So I believe the issue was fixed in 1.25. Please try version 1.25, and if the error still persists, please reopen the issue |
Hi @BagritsevichStepan, df-lua-bug.mp4Only when building from source the error is not present. Many thanks for your efforts. |
Thank you! Have you tried running |
@BagritsevichStepan Thanks for reopening it. Video: The same tests run on locally built df-instance v1.25.0 df-lua-from-source.mp4 |
Yes, I can reproduce this bug |
@adiholden @BagritsevichStepan Can you share the status of what you found out so far? It seems strange that the bug exists in the docker version and dragonflydb cloud instance but not in the locally built version. |
During our migration tests to dragonflydb we came across a quite dangerous silent error which is quite hard to spot as incomplete result.
In short: The lua-script output of FT.SEARCH is cut off and therefore different from the normal FT.SEARCH output.
This could lead to potentially dangerous scenarios where users rely on the output to be correct which it is not.
Steps to reproduce:
-Create test data index:
FT.CREATE idx:testindex ON JSON PREFIX 1 testindex: SCHEMA $.pk AS pk TAG $.color AS color TAG
-Insert test data:
Without EVAL/lua the FT.SEARCH command gives the correct result in dragonflydb + redis:
Wrong output format and incomplete result:
The same command via EVAL/lua script is incorrect in dragonflydb though:
Expected result (equal to redis and regular df FT.SEARCH command):
The lua script result should be the exact same output that the FT.SEARCH command above gives.
Here the correct output of redis instance:
Dragonfly version used: Most recent version [v1.22.1]
Please let me know in case something is unclear and I try to help as good as possible.
Many thanks for your efforts.
The text was updated successfully, but these errors were encountered: